CivArchive
    Wan 2.2 Video + Sound workflow optimized for RTX 3060 12 GB VRAM GPU - v2.0
    NSFW

    [Edit:

    Version v5.0 works with latest comfyui (v0.15.0).

    If you have any problems, please refer to the FAQ at the bottom of the page or have a look in the comments.

    Many thanks to everyone who tested this workflow. Thank you very much for the many inquiries and, of course, for all the knowledge and experience you have contributed here 👍🙂

    Special thanks to:

    @SeoulSeeker for the "Dead Simple MMAudio" workflow, which is the basis of the audio part here,

    @taek75799 for the really well working enhanced models

    @Bakazaya for pointing out the color issue in version v3.0 and running lots of tests,

    @bluntfeather for sharing the latest experiences with installing Comfyui-Easy-Install,

    @nitrovtx for remaining persistent in matters of quality and running a lot of tests,

    @Icey64 for providing the link to "Comfyui-Easy Install",

    @boinobin730 for asking for a First to Last Frame option, running pre-tests and responding fast as hell 🙂 and

    @SnowShoes311 thank you so much again for all your buzzing 😋]

    Features:

    • Optimized Wan 2.2 workflow, runs perfectly on an RTX 3060 GPU with 12 GB VRAM and 32 GB RAM,

    • "Text to Video", "Image to Video" and "First/Last Frame 2 Video" generation in one workflow, all with easy audio generation,

    • easy installation/model downloading, all necessary sources are specified,

    • easy to use workflow, clearly structured, all necessary steps are explained,

    • easy switches for mode selection,

    • easy prompt selection for fast prompt creation/testing,

    • easy switching between "standard" and "enhanced" models,

    • very fast and smooth high-quality outputs up to approx. 1440 x 960 at 60 fps,

    • 2x fast upscaler,

    • 4x fast framerate multiplier,

    • MMAudio Sampler (generates sound according to the video action),

    • Triton and Sage Attention option,

    • Generating a 5-second high-quality video takes about 10 - 15 minutes (see below).

    Tested generation times:

    As a rough guide for an RTX 3060 GPU, generating a 5-second high-quality 1440 x 960, 60 fps video with 6 steps takes:

    • t2v: around 10 - 12 minutes,

    • i2v: around 15 minutes.
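
    The output figures above follow from the two post-processing stages listed under Features. As a rough sketch, assuming the quoted 1440 x 960 at 60 fps is the final result after the 2x upscaler and the 4x framerate multiplier (the function name and the derived base values are illustrative and not read from the workflow file), the sampler itself only has to produce a much smaller clip:

```python
def sampled_clip(final_width, final_height, final_fps, upscale=2, fps_multiplier=4):
    """Work backwards from the final video to the clip the sampler has to produce."""
    width = final_width // upscale
    height = final_height // upscale
    fps = final_fps / fps_multiplier
    return width, height, fps


width, height, fps = sampled_clip(1440, 960, 60)
# Pixels per frame shrink by upscale**2, frames per second by fps_multiplier.
workload_ratio = (1440 * 960 * 60) / (width * height * fps)
print(f"sampled clip: {width} x {height} at {fps:g} fps")
print(f"final video holds {workload_ratio:g}x more pixel data per second")
```

    This is probably why lowering the video resolution is the quickest way to shorten test runs: the expensive sampling step works on the small clip, while the upscaler and the framerate multiplier are comparatively cheap.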

    Comfyui-Easy-Install with Triton + SageAttention:

    This workflow should work with any recent ComfyUI version >v0.6.0 (Desktop, Embedded, Windows/Linux).

    However, ComfyUI is developing rapidly, and it often happens that some of the custom nodes used are not updated quickly enough, or not updated at all. Manual workarounds are sometimes necessary. Furthermore, care must be taken to ensure that there are no conflicts with other nodes.

    If you're having difficulties with your existing ComfyUI system, or if you want to run video generation on a separate (parallel) ComfyUI system like I do, I recommend the following installer: https://github.com/Tavris1/ComfyUI-Easy-Install.

    • Complete installation of ComfyUI, including the manager and some preconfigured custom nodes, is just one click - really 🙂

    • Installation of Triton + SageAttention is just a second click - really 🙂 And since it's so easy now, I would definitely recommend it for video generation.

    • Because it is an embedded version, you can install it in parallel to your existing ComfyUI version without the risk of ruining your working system.

    • After installation just configure the "extra_model_paths.yaml" file to use your existing models.

    • After a fresh installation of Comfyui-Easy-Install you might have some issues too, but there are known workarounds - please see the FAQ below.
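
    For the "extra_model_paths.yaml" step above, a minimal sketch of what such a file can contain. It assumes the key names used in the extra_model_paths.yaml.example file that ships with ComfyUI; the base path D:/ComfyUI/ and the selection of folders are hypothetical, so please compare with your own example file before using it:

```python
def render_extra_model_paths(base_path, section="comfyui"):
    """Build the text of an extra_model_paths.yaml pointing at an existing install."""
    # Folder keys as assumed from ComfyUI's extra_model_paths.yaml.example.
    folders = {
        "checkpoints": "models/checkpoints/",
        "clip": "models/clip/",
        "diffusion_models": "models/diffusion_models/",
        "loras": "models/loras/",
        "upscale_models": "models/upscale_models/",
        "vae": "models/vae/",
    }
    lines = [f"{section}:", f"    base_path: {base_path}"]
    lines += [f"    {key}: {relative}" for key, relative in folders.items()]
    return "\n".join(lines) + "\n"


# Hypothetical location of the existing ComfyUI installation that holds the models.
print(render_extra_model_paths("D:/ComfyUI/"))
```

    Copy the printed text into "extra_model_paths.yaml" in the new ComfyUI folder and adjust the paths to your own folder structure.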

    For testing/understanding/experimenting/changing the workflow:

    • Click "Toggle Link Visibility" to see the links.

    • click the Subgraph symbols to open the Subgraphs.

    • for quick testing you may lower the settings for steps, clip length and video resolution,

    • be really careful when modifying Groups or Subgroups (even Title or Color), because they are essential for switching,

    • feel free to try and test other models. Just give me a hint if you find models which deliver better results and fit the 12 GB VRAM limit.

    And as usual: Have Fun 🙂🙂

    Short Conclusion:

    This workflow is based on elements of a variety of already published workflows. My "job" was only to put things together, optimize them for a small machine and create a simple and hopefully user-friendly, or even "beginner"-friendly, workflow.

    I'm not an "expert" - just a user who wants to get it running on "available" hardware.

    There are many things I don't really understand. If you find mistakes or better solutions please give me a hint.

    And I really hope that even "beginners" have a chance to go the first steps...

    Frequently Asked Questions (FAQ):

    For a quick and better overview I will try to collect all known issues here - step by step (please be patient). If your issue is not listed here, please have a look in the comments first. Most issues have already been discussed.

    Comfyui Nodes 2.0:

    Turn off Nodes 2.0 in ComfyUI (use the ComfyUI menu). Currently, not all custom nodes support it.

    ComfyUI crashes without any error report after generation, during VAE decode, upscaling or frame rate multiplying (RIFE VFI):

    This is a RAM problem (not VRAM). Increase your swap file (min. 64 to 128 GB) or set it to automatic management on a fast drive with at least 100 GB free space.
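
    Before enlarging the swap file it can help to check whether the drive really has the recommended 100 GB free. A small sketch using only the Python standard library; the function name is made up for this example, and "." stands in for the drive that should hold the swap file:

```python
import shutil


def has_room_for_swap(drive, min_free_gb=100):
    """Return True if the drive holding `drive` has at least `min_free_gb` GB free."""
    free_gb = shutil.disk_usage(drive).free / 1024**3
    return free_gb >= min_free_gb


# "." checks the drive of the current folder; replace it with e.g. "C:\\" on Windows.
print("enough free space for the swap file:", has_room_for_swap("."))
```

    This only checks the free disk space. The swap file size itself still has to be changed in the system settings.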

    JW Nodes (JWFloatToInteger, JWIntegerDiv, JWImageResizeByLongerSide), soundfile missing:

    For the workaround look here and here:

    python -m pip install soundfile

    Fresh Comfyui-Easy-Install installation (missing soundfile and PyTorch v2.9.0 issue with SageAttention on Windows):

    For full conversation look here.

    Open cmd in python_embedded folder:

    python -m pip install soundfile 
    python -m pip uninstall -y torch torchvision torchaudio
    python -m pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/cu126
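
    To confirm that the commands above had the intended effect, the installed versions can be compared with the pinned ones. A minimal sketch, assuming it is started with the same embedded python; the helper names are made up for this example, and the "+cu126" suffix handling assumes the usual version strings of the PyTorch CUDA wheels:

```python
from importlib import metadata, util

# Versions pinned by the pip command above.
PINS = {"torch": "2.8.0", "torchvision": "0.23.0", "torchaudio": "2.8.0"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(installed, pinned):
    """Compare versions while ignoring a local build suffix such as '+cu126'."""
    return installed is not None and installed.split("+")[0] == pinned


def report():
    """Collect one status line per pinned package plus one for soundfile."""
    lines = []
    for name, pinned in PINS.items():
        found = installed_version(name)
        status = "ok" if matches_pin(found, pinned) else "differs"
        lines.append(f"{name}: installed={found}, expected={pinned} ({status})")
    has_soundfile = util.find_spec("soundfile") is not None
    lines.append(f"soundfile importable: {has_soundfile}")
    return lines


print("\n".join(report()))
```

    If a line shows "differs" or soundfile is not importable, the pip commands were most likely run with a different Python than the one ComfyUI uses.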

    Slider Nodes - how can I modify the "default" values:

    Right click the slider node, choose Properties and set the values you like 🙂🙃

    Description

    First Frame to Last Frame Video Generation option added,

    directly clickable download links for all models and file structure overview added,

    completely reorganised design for easy use of all options.

    Comments (37)

    boinobin730
    Sep 25, 2025 (2 reactions)
    CivitAI

    The version 2.0 workflow is extremely versatile. Having a start and end frame image allows for more fluid motions. Thank you for enhancing the 1.3 workflow.

    jonk999
    Sep 26, 2025 (1 reaction)

    Have been using v1.1 for a while.
    I tried v2.0 and I got an error on the additional Lora nodes. I needed to connect the Load Clip to clip on high and low additional Loras.
    Also just wondering why on v2.2 KSampler Low has seed of 0 and fixed whereas in v1.1 it was randomised.
    And finally, have you found Beta scheduler to be better than Simple that was used in v1.1?

    arkinson
    Author
    Sep 26, 2025

    edit: please look here: https://civitai.com/models/1852904/wan-22-workflow-optimized-for-rtx-3060-12-gb-vram-gpu?dialog=commentThread&commentId=955554

    @jonk999 Clip connection to additional Lora nodes: Interesting - did you get an error message and the workflow stopped, or just an error message in the logs? I can't reproduce it on my side. Wan generally needs no loaders with clip, but I like the "Power Loader" nodes for their usability. Anyway, setting the connection is definitely not wrong.

    Seed 0 of second KSampler: It doesn't matter, because "add noise" is disabled 😉

    Beta scheduler: The "official" workflows from the templates use euler + normal. For myself, I did some quick tests with lcm + normal and lcm + beta. I would say there is no big difference, and in the end I was satisfied with lcm + beta. If you have the capacity to do some more serious testing, it would be really cool if you could share your results.

    jonk999
    Sep 29, 2025

    @arkinson Thanks for the reply. I'll also look in the linked thread. Though I did find results better using Wan21_I2V_14B_lightx2v_cfg_step_distill_lora_rank64.safetensors (weights: high = 3.0, low = 1.5)

    Regarding clip connection, the workflow stopped from memory, so I just looked at your previous workflow and connected them and then no further issues.

    Thanks for the clarification on seed 0 of second sampler.

    I'll have a play with scheduler/sampler when I have a chance.

    arkinson
    Author
    Sep 29, 2025 (1 reaction)

    @jonk999 Hi - I'm glad you have it running 🙂

    Yes, the old Wan 2.1 Lora works well with T2V and I2V, and I believe it will suit most users. But of course, you can also try the very new Wan 2.2 Loras linked by boinobin730 in the "edit link" above. Only the "official" Wan 2.2 Loras I linked in my version v2.1 lead to extreme slow motion.

    The issue with the clip connection is really strange - on my machine it runs even without the connections 🙄 But I have fixed it in version v2.1/v2.2.

    Scheduler/Sampler settings: I believe it is really a matter of taste, and the results are very random.

    mv_ia
    Sep 26, 2025 (1 reaction)

    Hi, great workflow! I have a question: if I increase the "Clip Length (in seconds)" setting, for example from 3 to 9 seconds, the video plays in slow motion. How can I fix this? Do I need to change any other parameter in the workflow?

    arkinson
    Author
    Sep 26, 2025 (1 reaction)

    edit: please look here: https://civitai.com/models/1852904/wan-22-workflow-optimized-for-rtx-3060-12-gb-vram-gpu?dialog=commentThread&commentId=955554

    @mv_ia Hi - thank you 🙂 Slow motion seems to be a known problem caused by the lightx2v Loras. In preparation for this workflow I tried to fix it with a 3-stage workflow (the first KSampler runs without the Lora), but unfortunately the results were not satisfying and it could lead to other problems ....

    Short answer: try stronger prompting, additional Loras and play with the generation time. 5-second clips mostly work well. With "longer" clips you could run into other problems too, like repeating effects....

    arkinson
    Author
    Sep 26, 2025 (1 reaction)

    [edit: version 2.1 is out now 🙂]

    @mv_ia @jonk999 @boinobin730 @hdean @SnowShoes311 Hi guys - I would like to ask you for some help with further testing of version v2.0 🙂 because I got the first comments with some issues:

    1. known "bug": missing clip connections to "Additional Loras" (ok, fixing is no problem).

    2. Slow motion: T2V really seems to produce mostly slow motion. I switched back to the old Wan21_I2V_14B_lightx2v_cfg_step_distill_lora_rank64.safetensors (weights: high = 3.0, low = 1.5) even for T2V generation and it seems to work much better 🙄

    3. KSampler: I used lcm + beta in v2.0, but in quick testing the "official" euler + simple really seems to give better outputs.

    Would be really great if you could run some tests with the wan 2.1 Lora and the KSampler - if possible for both T2V and I2V - and report here.

    boinobin730
    Sep 27, 2025 (1 reaction)

    @arkinson I did some testing. The standard T2V Lightx2V is very slow-mo even at normal weights of H=1.0 and L=1.0 as well as at H=3.0 and L=1.5. Not usable at all. Trying the other Lightning LORAs I got slightly better results with the I2V but there are a few errors in generation at times. Fast hand movements are a struggle for all the lightning loras. The biggest factor I found is that there seems to be a link between picture quality and slow mo. Higher quality clips will be slower than faster clips which look very degraded. I guess this is the tradeoff. BUT in my little investigation I found a very good Lightning model. It is Kijai's version.

    https://huggingface.co/Kijai/WanVideo_comfy/tree/main/Wan22-Lightning

    The 2.2 I2V and T2V gives a very decent output that is not too slow and retains good motion and picture reproduction.

    Someone else did a study on lightning LORA models as well.

    https://www.reddit.com/r/comfyui/comments/1msx81f/visual_comparison_of_7_lightning_models_in_320_x

    if you want to do an even deeper dive and find the holy grail of lightning LORA models.

    I will upload some of my results but I'm going to keep testing the Kijai lightning versions myself.

    boinobin730
    Sep 27, 2025 (1 reaction)

    @arkinson I haven't played with the KSampler settings yet. I might have a play in the next few days.

    arkinson
    Author
    Sep 27, 2025 (1 reaction)

    @boinobin730 Wow, thank you for your fast reply and the links 🙂 I will dive into the Lightning Lora part now. For KSampler: The difference only seems to be noticeable with T2V.

    arkinson
    Author
    Sep 27, 2025

    @boinobin730 Me again 😉 This stuff drives me crazy. The more tests I run, the more confused I become 🙄 and your link above increases that even more... 😂

    I have done a couple of tests with T2V only (all under exactly same conditions):

    standard wan2.2_t2v_lightx2v_4steps_lora_v1.1_high/low_noise.safetensors: slow motion, but sometimes best quality.

    Kijai Wan2.2-Lightning_T2V-v1.1-A14B-4steps-lora_HIGH/Low_fp16.safetensors: extreme slow motion.

    previous Wan 2.1: Wan21_I2V_14B_lightx2v_cfg_step_distill_lora_rank64: normal/good motion, mostly acceptably good quality (even though this is an I2V Lora).

    The crazy thing: comparing "euler + simple" and "lcm + beta", both deliver good quality but completely different output scenes 🙄

    Ok - to keep it as simple as possible I will publish version v2.1 with "euler + simple" and the "old" Wan 2.1 Lora and some short hints for other options. I hope this will cover most use cases for T2V and I2V with just a single lightning Lora.

    arkinson
    Author
    Sep 27, 2025

    @boinobin730 Version v2.1 is out now. Thank you again for your input 👍

    boinobin730
    Sep 27, 2025

    @arkinson That is strange. I need to test more too. I only did one run of each. It is so hit and miss and each test takes 18-20+ minutes. I mostly do I2V as I have limited use case for T2V. I will test 2.1. Thanks for your efforts.

    arkinson
    Author
    Sep 27, 2025

    @boinobin730 v2.1: Have you activated the Triton + Sage Attention option? I have set the default to OFF, because I was afraid of getting a lot of questions from users without it installed: "Why do I get errors...."

    boinobin730
    Sep 28, 2025

    @arkinson Yes. Sage Attention is on. I generate as close to maximum resolution as I can for a video clip of about 5 seconds. Steps left at 6.

    GREAT NEWS. In the last 3 hours, they have released yet another T2V Lightning LORA model. This is what the WAN2.2 was supposed to be like. I have tested it. If you generate at 6 steps it no longer gives you slow mo. It is even much sharper and better quality than the 2.1 Lightning. It also doesn't have as many weird glitches going on.

    Pick it up here.

    https://huggingface.co/lightx2v/Wan2.2-Lightning/tree/main/Wan2.2-T2V-A14B-4steps-lora-250928

    rename it to something meaningful.

    set strength to High=1, Low=1.

    I set model shift to 5. Euler A, CFG 1. I generated in 4 steps but I wasn't too happy with it. changed it to 6 steps. It was better. Not slow mo at all.

    It is really good. I2V coming soon.

    arkinson
    Author
    Sep 28, 2025

    @boinobin730 Hey mate, these Loras have been out for just 2 hours and you are here 😂🤣 I actually wanted to continue at a leisurely pace, but now I'm on the run again 🙄 Thank you! 🙂

    arkinson
    Author
    Sep 28, 2025

    @boinobin730 Oh my - my machine is much too slow for serious testing 🙄

    I have already done a couple of test runs under defined conditions. You are right, the brand new Lora from your link for T2V works like a charm. Quick movements (normal speed) and brilliant bright colours. Visually I would say the "image" quality is the same as, or just a tick better than, the old Wan 2.1 Lora. Personally I would tend to prefer the "style" of the old Lora. But I haven't run any consistency tests with additional character Loras so far.

    model shift 8 or 5: I would say same quality but slightly different scenes.

    Something interesting too: with the old Wan 2.1 Lora I accidentally did some runs with weights = 1.0 for high/low pass: the outputs don't look bad and convey a more realistic "vintage movie" style. Even the outputs with lcm + beta look good....

    But I am afraid this is all not rocket science but more a matter of taste 🙃

    In order to avoid causing further confusion, I would not release a new version at this time. After further testing, I will simply provide a reference to the new Lora links in the near future.

    Btw. some users really have a lot of problems to get the basics running....

    boinobin730
    Sep 28, 2025

    @arkinson all good. I was letting you know just as a FYI for you. I don't expect your workflow will change as it is already excellent. I'm just amazed at how quick things are developing. Things change so much. It's hard to keep up. No wonder people are having trouble. I'm a reddit fan, so I get informed of most news on r/StableDiffusion and r/comfyui on reddit.

    Wan 2.1 was never good for me... maybe I did something wrong or used a wrong version. I will have a little look later.

    arkinson
    Author
    Sep 28, 2025 (1 reaction)

    @boinobin730 I always used this one (for T2V and I2V): lightx2v/Wan2.1-I2V-14B-480P-StepDistill-CfgDistill-Lightx2v at main with weights 3.0 high and 1.5 low.

    boinobin730
    Sep 28, 2025

    @arkinson Thank you. On checking, yes, I have used it before. But my results were nothing spectacular. It's strange. I will play around with all of them.

    thegrotfarmer353
    Sep 26, 2025

    Hi! Workflow has been producing some very excellent results. However, I keep getting:
    "lora key not loaded: diffusion_model.blocks.0.cross_attn.k_img.lora_up.weight

    lora key not loaded: diffusion_model.blocks.0.cross_attn.norm_k_img.diff

    lora key not loaded: diffusion_model.blocks.0.cross_attn.v_img.diff_b

    lora key not loaded: diffusion_model.blocks.0.cross_attn.v_img.lora_down.weight

    lora key not loaded: diffusion_model.blocks.0.cross_attn.v_img.lora_up.weight

    lora key not loaded: diffusion_model.blocks.1.cross_attn.k_img.diff_b"

    0 to 39. and then again 0 to 9. It still produces the video, albeit much slower than it's supposed to. The internets claim this is an incompatibility with a LORA and a checkpoint.

    Any help is much appreciated. ;)

    arkinson
    Author
    Sep 26, 2025

    Hi - thank you. Do you use Wan 2.1 Loras? Please see my note in the workflow "Wan 2.1 Loras"

    thegrotfarmer353
    Sep 26, 2025

    @arkinson it just seems a bit bizarre, as I'm only using things from the workflow. I'll try to figure it out.

    arkinson
    Author
    Sep 27, 2025

    @thegrotfarmer353 My question was "serious" 🙄 because if you use any Wan 2.1 Lora these error messages are normal and you can ignore them. So just to be sure - do you use Wan 2.1 Loras?? And which workflow version do you use?

    thegrotfarmer353
    Sep 27, 2025

    @arkinson Seriously? :) I'm using only things from the 2.0 workflow. So you're saying just ignore the error message; no biggie?

    arkinson
    Author
    Sep 27, 2025 (1 reaction)

    @thegrotfarmer353 Oh my - you really don't like to answer my question 😂 Please look in the workflow under additional Loras. If you have the "bouncing boobs" Loras activated, then you already use Wan 2.1 Loras 😉🙂 Without a concrete answer to a concrete question I say nothing 🙃

    kilplix107
    Sep 26, 2025 (3 reactions)

    I just wanna say, I've spent like 3 damned days trying to get 2.2 to work on my 3080 and this workflow worked with no issues. THANK YOU

    arkinson
    Author
    Sep 27, 2025

    Hi - thank you so much 👍 Please stay tuned, because I am just trying to fix some issues with version v2.0. Hopefully I can "release" v2.1 soon 🙂

    bowdo666
    Sep 27, 2025 (1 reaction)

    Amazing workflow, thank you. Running on a 7900XTX.

    arkinson
    Author
    Sep 27, 2025

    Thank you 🙂 Have a look at version v2.1 now, for bypassing the slow motion problem.

    hydragyrum2
    Sep 27, 2025 (1 reaction)

    Love the workflow! Been getting consistently great results from it. Will be upgrading to a 5090 next week and I want to continue using this workflow but want to change the number of steps. I'm not sure how to change the steps in your workflow to something above 8 steps. Confused by the integer node. Any advice would be amazing! Thank you!

    arkinson
    Author
    Sep 27, 2025 (1 reaction)

    Hi - thank you so much 🙂 It's easy: Right click on the steps node, go to "Properties", set "max. value". You can test higher values, but with the Lightning Lora I would expect values around 6 - 8 to be optimal. I have no experience with the 5090, but with a very fast GPU you might have better options: turn off the Lora and try around 20 steps, try better models like the Q8, or even try other workflows too ....

    By the way, my version v2.1 is out now 🙂

    Valomar
    Sep 27, 2025 (2 reactions)

    I upgraded to a 5090 and think that using the lightning Lora is still acceptable. You can use without, but for me the results are acceptable. I agree with Arkinson, the 'lightx2v' Lora produces better results than the 2.2 lightning Loras.

    Don't do like me and continue to use GGUF versions. Use the FP8 models, they load faster and produce better results.

    You can look at some of my posted videos, my workflow is embedded there

    arkinson
    Author
    Sep 28, 2025

    @Valomar Thank you for your feedback and your hint about the GGUF models 👍 Yesterday I started some direct comparison tests with the 14B_fp8_scaled models on my RTX 3060 and the results are sobering:

    - model loading needs a lot more than 12 GB VRAM. It works, but it needs time to load.

    - quality: visually I can't see any relevant differences at the highest resolution you can run with an RTX 3060.

    If you use anything better than a 3060 you are definitely right, but in this case you might have a lot of other options of course 🙂

    Valomar
    Sep 28, 2025

    @arkinson Yeah, agree that for a 3060 or a 4070 even (which is what I was using before upgrading) the GGUFs are the route to go. However, @hydragyrum2 stated that they were going to upgrade to a 5090. For the first few days, I continued to use the GGUF Q6 or Q8 and was shocked by how slow everything still was. Then I tried the fp8 and was like "Oh, duh, I don't need to decompress the GGUFs, which takes time". So, was just recommending that they don't repeat my mistake.

    The other thing @hydragyrum2 I would recommend, if you are able, is to upgrade your RAM to at least 64 GB (or more, if you can) so that you can hold both the High and Low models in RAM. I had only 32GB and that was bottlenecking me. Upgraded and it's been much better.

    arkinson
    Author
    Sep 28, 2025

    @Valomar Sorry, got you completely wrong - but testing the limits wasn't bad either 😅 Upgrading to 64 GB RAM sounds good.

    Workflows
    Other

    Details

    Downloads: 386
    Platform: CivitAI
    Platform Status: Available
    Created: 9/25/2025
    Updated: 6/30/2026
    Deleted: -

    Files

    wan22VideoSoundWorkflow_v20.zip

    Mirrors

    HuggingFace (1 mirror)