Wan2.1 14B i2v (Native|GGUF) Self-Forcing(lightx2v) Single/Dual Sampling - 🟥v2.0 (Expanded) Old

NSFW

Go here for Wan2.2 WF ≈➤🍥 Wan 2.2 (GGUF) [i2v / FFLF] + [t2v] Workflow

Thanks to @definitelynotadog for the Dual KSampler Workflow.

Thanks to @Ada321 for uploading the Self-Forcing(Lightx2v) Lora here on CivitAI.

*25/7 Updated 🟨 Dual Sampler Workflow to V3 (updated in case anyone wants to try it; it's slower and slightly more complex, and nothing changed in the base structure)

*24/7 Added 🟩 Single Sampler Workflow V3

Added Video Preview for all WF
Added Post Processing Section for Expanded & Compacted WF
Shifted some inputs to Post Processing Section for Expanded & Compacted WF
Seed node in Expanded & Compacted WF is set to "New Fixed Random" by default.
Swapped the VRAM Clean-up node for the Easy-Use custom node Clean VRAM
Added Icons, adjusted some Visuals and edited+added some Notes
Minor visual adjustment in Interpolation+Upscaler WF

Post Processing Section (Expanded & Compacted):

When using "New Fixed Random" in the seed node, after a video is generated, you can edit inputs or choose different options in this section and generate again while skipping the sampling steps.

Example: Let's say you generated a video by clicking the ComfyUI RUN button but forgot to set the Interpolation Multiplier or choose the Interpolate + Upscale options. Once the video has finished generating, you can set the Interpolation Multiplier and/or select the Interpolate + Upscale option and click the ComfyUI RUN button again. The sampling steps will be skipped because you are using the same seed number and have not changed any inputs outside the Post Processing Section.

To generate a new seeded video, click on "New Fixed Random" in the seed node and click on the ComfyUI RUN button.

This method also speeds up drafting, so you don't waste time interpolating or upscaling a video you might not want.
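The skip behaviour described above can be pictured as caching keyed on inputs: when the seed and sampling inputs are unchanged, the expensive result is reused and only the post-processing runs again. The following is a toy sketch of that idea, not ComfyUI's actual implementation. `sample_video`, `post_process`, `run` and their parameters are hypothetical names used only for illustration.

```python
from functools import lru_cache

# Counts how often the expensive sampling stage really executes.
calls = {"sampler": 0}


@lru_cache(maxsize=None)
def sample_video(seed: int, prompt: str, steps: int) -> tuple:
    """Hypothetical stand-in for the sampling stage, cached on its inputs."""
    calls["sampler"] += 1
    # Pretend these five integers are the generated frames.
    return tuple(range(5))


def post_process(frames: tuple, interpolation_multiplier: int = 1) -> int:
    """Hypothetical stand-in for post-processing; returns a toy frame count."""
    return len(frames) * interpolation_multiplier


def run(seed: int, prompt: str, steps: int, interpolation_multiplier: int) -> int:
    frames = sample_video(seed, prompt, steps)
    return post_process(frames, interpolation_multiplier)


# Same seed and sampling inputs, different post-processing setting:
first = run(42, "a cat walking", 4, interpolation_multiplier=1)
second = run(42, "a cat walking", 4, interpolation_multiplier=4)
print(calls["sampler"])  # the sampling stage ran only once
```

Changing the seed (or any sampling input) produces a new cache key, which corresponds to clicking "New Fixed Random" before running again.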

Small written guide in WF.

17/7 - Added 🟩 Single Sampler Workflows V2.1 to use with the new Self-Forcing(Lightx2v) Lora.

The Single Sampler download contains the following WF:

Expanded (Shows all connections - mainly for learning + exploration)
Compacted (Only shows essentials and hides everything else)
Simplified ("Standardized" WF for those who prefer to have more control over inputs)
Interpolation + 2x Upscaler (Useful when you want to generate a lot of videos until you get the desired one and interpolate+upscale later.)
Joiner/Merger (Joins 2 videos together)

This Single Sampler workflow includes:

GGUF Loaders for Diffuser Model + Clip (Best to use GGUF Model + Clip together)
Block Swap for memory management
Sage Attention (Speed up Generation Time) - I forgot how I managed to install this.
Torch Compile (Speed up Generation Time) - Enable only when your system supports it.
NAG (Normalised Attention Guidance) - Adjusts negative prompt influence intelligently when using CFG 1.
Stack Lora Loader
Scale by Width for Image Dimension Adjustment
Video Speed Control
Frame Interpolation for Smoothing
Auto Calculation for Number of Frames and Frame Rate in accordance with the inputs of video length, speed and interpolation multiplier
Save Last Frame (For sequencing video)
Color Match (Useful for sequencing video or uniform color throughout the video)
VRAM - Clean-up
Upscaler (up to 2x)
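As a rough illustration of the "Auto Calculation" item in the list above, here is a minimal sketch of how the frame count and output frame rate could be derived from video length, speed and interpolation multiplier. It is not the workflow's actual node math. It assumes a 16 fps base rate (this page lists 81 frames as 5 seconds), frame counts of the form 4k + 1, and that both the interpolation multiplier and the speed setting scale the output frame rate. `plan_video` is a hypothetical helper name.

```python
BASE_FPS = 16  # assumption: 81 frames are listed on this page as 5 seconds


def plan_video(length_sec: float, speed: float = 1.0, interp_multiplier: int = 1):
    """Return (frames_to_generate, output_fps) for the requested settings."""
    raw = int(round(length_sec * BASE_FPS))
    # Assumption: frame counts of the form 4k + 1 (for example 81 = 4 * 20 + 1).
    frames = (raw // 4) * 4 + 1
    # Assumption: interpolation multiplies the frame rate, and the speed
    # control scales the playback rate on top of that.
    output_fps = BASE_FPS * interp_multiplier * speed
    return frames, output_fps


print(plan_video(5))                       # 5 seconds, no interpolation
print(plan_video(5, interp_multiplier=4))  # 5 seconds, 4x interpolation
```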

5min 30sec generation time on my 3090 Ti 24 GB:

720x960 Image - 4 steps - 81 frames (5secs) - 4x Interpolation - Upscaler - GGUF - Torch Compile - Sage Attention

Videos posted above are without speed adjustment.

Includes Embedded Workflow. (Download the vid, drag into ComfyUI)

Links to models/lora files in workflow.

Always download the files from this page as there might be minor updates.

(use Videos posted for settings examples)

I only tested with a few loras and they seem good. If a Lora is not working the way you want, try an alternative Lora; you can always fall back on Dual Sampler if needed. (It is still functional but takes longer than single.) You can also try lowering the Self-Forcing(lightx2v) strength.

🟨 Dual Samplers Section:

*V2.1 Minor Update.

Switched VAE Decode (Tiled) to the normal VAE Decode; the tiled version was causing "sudden flashing" when videos are longer than 5 seconds or 81 frames.
Updated CausVid v2 link in all WF or get it Here. (thx @01hessiangranola851)

*If you are already using the WF:

If you experience flashing/sudden brightness when generating videos longer than 5 sec, try setting temporal_size to 64 in VAE Decode (Tiled) or switch to the normal "untiled" VAE Decode.
If the first few frames are greyed out, reduce the CausVid Strength or just set it to 0.3.

*V2 Update

GGUF Loaders for GGUF Diffusion Model & Clip with download links. (Choose either Native or GGUF. Disable/delete the ones you are not using.)
"Fixed" Torch Compile and added fp16 accumulation options.
Color Match (Useful for sequencing video or uniform color throughout the video).
Template for External Video Merger/Joiner. (in expanded version)
CausVid Strength ( Range: 0.3 - 1 )
Minor visual and notes adjustments.

The i2v workflow is built with these intentions in mind:

Built for use with the Self-Forcing(lightx2v) Lora (but not limited to it)
For learning and exploration
Experimental purposes
Modular Sections (add, build upon, swap or extract part of the workflow)
Exploded View - to see all connections (In expanded version)

This Dual Sampler workflow includes:

Block Swap for memory management
Sage Attention (switch on if you have it installed)
Torch Compile
Stack Lora Loader
Scale by Width for Image Dimension Adjustment
Video Speed Control
Frame Interpolation for Smoothing
Auto Calculation for Number of Frames and Frame Rate in accordance with the inputs of video length, speed and interpolation multiplier
Save Last Frame
Previews for both First KSampler Latent and End Video Images
Dual Sampling using 2 KSamplers
VRAM - Clean-up
Upscaler (up to 2x)
Template for external Frame Interpolation (in expanded version).
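The "Scale by Width" item in the list above keeps the image's proportions: you enter a width and the height follows from the aspect ratio (a 720 x 960 image with input width 480 gives a 480 x 640 video). A minimal sketch of that arithmetic follows. `scale_by_width` is a hypothetical helper, and the optional `multiple_of` snapping is an assumption, not something the workflow is confirmed to do.

```python
def scale_by_width(src_w: int, src_h: int, target_w: int, multiple_of: int = 1):
    """Return (width, height) scaled to target_w while keeping the aspect ratio."""
    if src_w <= 0 or src_h <= 0 or target_w <= 0:
        raise ValueError("all dimensions must be positive")
    target_h = src_h * target_w / src_w
    if multiple_of > 1:
        # Assumption: optionally snap the height to a multiple (e.g. 16).
        target_h = round(target_h / multiple_of) * multiple_of
    else:
        target_h = round(target_h)
    return target_w, int(target_h)


print(scale_by_width(720, 960, 480))
```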

Models to use:

The workflow can use either the Wan2.1 14B 480p or 720p i2v model.

The 720p model and higher-resolution images are recommended as they give better quality, especially for eyes and teeth during motion.

Examples:

5 first steps / 3 last steps / 81 frames

480p model - 480x640 Image

480p model - 720x960 Image

720p model - 480x640 Image

720p model - 720x960 Image

Generation speed with my 3090 TI 24GB with Sage Attention no GGUF - 5 first steps & 3 last steps (total 8 steps) , 81 frames and 4x frame interpolation multiplier:

720 x 960 Image : ~750 secs (est. 12-13 mins)
480 x 640 Image : ~350 secs (est. 5-6 mins)

Note:

Some Loras may distort faces or the character. Either reduce the lora strength or use an alternative lora.
Sometimes you may need to generate a few times to get a seed with better motion. Be patient.
I did not test every Lora, so you will need to test and figure it out yourself.
(480p/720p Model, Image Dimension, Lora Strength, Start CFG)
*If the other loras you are using with this workflow are too aggressive (too much motion, color change, sudden exposure), lower the Start CFG. Alternate between 3 & 5 to see which is better.
Some videos I posted above used a lower CFG because the other loras are too aggressive at high CFG levels.

Drafting for motion with other Loras:

Use a smaller image dimension for faster generation to see if the lora you use produces any motion.
Once satisfied with the lora and prompt, proceed to use the desired image dimension.

Other tips:

You can also clean up distortion/blur by using another V2V workflow.

Like this one:

https://civarchive.com/models/1714513/video-upscale-or-enhancer-using-wan-fusionx-ingredients

and/or use face swap to clear face distortions.

🟨Dual KSampler

Recommended Steps:

5 start steps / 3 end steps (I used the most for testing)
4 start steps / 3 end steps

The old T2V Self-Forcing(lightx2v) may sometimes hinder, slow down or produce less motion for some loras.

To get more motion from some loras, you need a CFG higher than 1, but when using Self-Forcing(lightx2v) you need to set the CFG to 1.

This is where the Dual KSampler is utilized.

The 1st KSampler uses a high CFG level of 3-5 to create a better "motion latent", along with the CausVid Lora to get more motion in fewer steps.
The 2nd KSampler uses a low CFG of 1 to finish the video in 3 steps, using the Self-Forcing(lightx2v) lora for speed. A higher step count makes the Self-Forcing(lightx2v) lora influence the video more and reduces motion again.

In order to pass "motion latent info" to the 2nd sampler, the 1st sampler's step count has to be more than half of the total steps.

Examples:

5 first steps / 3 last steps / 8 total steps
4 first steps / 3 last steps / 7 total steps

When it is configured this way, you can see the images start to form in the latent preview.

That is when it can be passed to the 2nd KSampler with "motion latent info" to finish it off without heavily influencing it in a low number of steps.

(If the 1st steps count is half or less than half of the total steps, you will see a very noisy image that does not resemble anything.)

Basically, a 7-8 step generation is split across 2 KSamplers.

The 2nd KSampler continues the generation process from where the 1st KSampler left off.

(Using 2 normal samplers will not produce the same results, as the 2nd sampler will not know at which step to continue from. It will take the product of the 1st KSampler, ignore what it has produced and start from step 0.)
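The split described above can be sketched as two sets of sampler settings that share one total step count. This is a minimal sketch under assumptions, not an export of the workflow: the field names (`add_noise`, `start_at_step`, `end_at_step`, `return_with_leftover_noise`) follow what the native "KSampler (Advanced)" node exposes as far as I know, and `split_steps` is a hypothetical helper.

```python
def split_steps(first_steps: int, last_steps: int,
                start_cfg: float = 5.0, end_cfg: float = 1.0):
    """Build settings for two samplers that share one step schedule."""
    total = first_steps + last_steps
    # Rule from the text: the 1st sampler needs more than half of the total steps.
    if first_steps * 2 <= total:
        raise ValueError("first_steps must be more than half of the total steps")
    first = {
        "add_noise": "enable",                   # 1st pass starts from fresh noise
        "steps": total,
        "cfg": start_cfg,                        # high CFG (3-5) for motion
        "start_at_step": 0,
        "end_at_step": first_steps,
        "return_with_leftover_noise": "enable",  # hand over an unfinished latent
    }
    second = {
        "add_noise": "disable",                  # continue, do not re-noise
        "steps": total,
        "cfg": end_cfg,                          # CFG 1 for Self-Forcing(lightx2v)
        "start_at_step": first_steps,            # resume where the 1st pass stopped
        "end_at_step": total,
        "return_with_leftover_noise": "disable",
    }
    return first, second


first, second = split_steps(5, 3)
print(first["end_at_step"], second["start_at_step"], second["steps"])
```

Both samplers are given the same total `steps` value so that the second one resumes the same schedule instead of starting a new one at step 0.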

With the initial KSampler generating at 3-5 CFG, it's slower. But the trade-off here is to get more motion when using it with other loras. Compared to a 20-30 step generation with no CausVid or Self-Forcing(Lightx2v), it is way faster.

Unfortunately, KSampler with Start End Step is only available for Native and not WanWrapper.

Tooooo many GET and SET nodes....

!!! Only available when ComfyUI-Easy-Use custom nodes are installed.

You can utilize the Nodes Map Search (Shift+m) function.

In your ComfyUI interface panel (usually on the left), look for an icon with 1 small square on top and 3 small squares below it. It's called Nodes Map.

Let's say you see a "Set_FrameNum" node.

And you want to know where the "Get_FrameNum" is.

Enter in the search bar:

Get_FrameN....

--! Case Sensitive !--

And you will see it filtered.

Double click on that and it will bring you to the node.

Likewise for finding a Set node from a Get node:

Example for "Get_FrameNum"

Search:

Set_FrameNum

--! Case Sensitive !--

Filtered.

Double click.
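The naming convention behind this search can be sketched in a few lines: a "Set_" node and its "Get_" node share the same suffix, and the filter matches case-sensitively. This only illustrates the convention, it is not code from ComfyUI or Easy-Use. `counterpart` and `search` are hypothetical helpers, and "Get_Seed" is a made-up node title.

```python
def counterpart(title: str) -> str:
    """Return the matching Get_/Set_ title for a Set_/Get_ node title."""
    if title.startswith("Set_"):
        return "Get_" + title[len("Set_"):]
    if title.startswith("Get_"):
        return "Set_" + title[len("Get_"):]
    raise ValueError("title must start with 'Set_' or 'Get_'")


def search(titles, query: str):
    """Case-sensitive substring filter, like typing into the search bar."""
    return [t for t in titles if query in t]


titles = ["Set_FrameNum", "Get_FrameNum", "Get_Seed"]
print(counterpart("Set_FrameNum"))
print(search(titles, "Get_FrameN"))
print(search(titles, "get_framen"))  # wrong case, nothing matches
```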

Custom Nodes

ComfyUI-Custom-Scripts
rgthree-comfy
ComfyUI-KJNodes
ComfyUI-Frame-Interpolation
ComfyUI-mxToolkit
ComfyUI-MemoryCleanup
ComfyUI-wanBlockswap
MediaMixer
ComfyUI-Easy-Use (Install manually in your Custom Node Manager)

After notes:

You may build upon, use part of, edit, merge and publish without crediting me.

~~The reason why I don't use GGUF is because it keeps bricking my ComfyUI every time I tried installing it.~~

I do not have a more in-depth understanding beyond this point.

Description

GGUF Loaders
"Fixed" Torch Compile
Color Match
External Video Merger/Joiner.
~~CausVid Strength to 0.7.~~
Minor visual and notes adjustment.

Comments (40)

INFINIXART · Jul 5, 2025 · 3 reactions

seems like v2 got color matching fixed, and motions are much smoother, yes I made sure about that, but I'm suffering using Torch compile, much much slower after enabling it, and I have to restart ComfyUI and disable it to run normally, no idea at all, thanks for updating v2!

houdh235914 · Jul 9, 2025

Same thing with Torch

6400850 · Jul 5, 2025 · 1 reaction

What is causing my videos to be overcooked? im using the same exact models as is in the flow

Lannfield

Author

Jul 5, 2025

Lower the Start CFG to 3. Which Lora are you using it with?

cukurpapiks383 · Jul 10, 2025

have same issue, CFG 3.0 makes 90% of videos ok, but often creates slow motion. I can't create a good fast deepthroat video because of it.

maplebag · Jul 5, 2025 · 1 reaction

this workflow is not making sense to me. I am running a 4090 and i am constantly getting OOM. I try to block swap, it gives error. Solution to error is to disable block swap. Get OOM because no block swap.

Can you help me wrap my head around this? Why am I getting this error? ("Expected all tensors to be on the same device")

why am i running out of memory with 24gb?

i'm running 480p 14b bf16. should i try fp16?

maplebag · Jul 5, 2025

okay i guess i actually need about 29gb to run the full model without block swap. So I guess my question now is, why is the block swap not functioning in my workflow? I've just updated to latest version of Comfy and nodes

maplebag · Jul 5, 2025

i'll just use the q08 gguf but i wonder why block swap isnt working

Lannfield

Author

Jul 5, 2025

@maplebag did u set to 40? Also, set the precision to default first and see.

Lannfield

Author

Jul 5, 2025

@maplebag i meant set blockswap to 40 and try, when i use 480p i used the fp8 scaled version

maplebag · Jul 5, 2025 · 1 reaction

@Lannfield i'll try that, originally i was getting the error with swap set to 40. I think it's just an actual memory issue. i also intend to test with smaller ggufs to see

maplebag · Jul 5, 2025

yeah it just doesn't run. i don't have the knowledge to figure it out. it just won't load the model. gets stuck on WanImageToVideo. using q5_k_m model and q6_k encoder. what's frustrating is I was somehow able to generate a single video with the full model, but since then it just hangs. im waiting 15 minutes on one node which doesnt sound right

maplebag · Jul 5, 2025

i get this: "Warning: Ran out of memory when regular VAE encoding, retrying with tiled VAE encoding." but it threw this error on my successful gen as well

maplebag · Jul 5, 2025

made it past load now hanging on first step

loaded partially 128.0 127.9998779296875 0

Warning: Ran out of memory when regular VAE encoding, retrying with tiled VAE encoding.

Requested to load WAN21

0 models unloaded.

loaded partially 128.0 127.998046875 0

Attempting to release mmap (721)

Patching comfy attention to use sageattn

0%| | 0/5 [00:00<?, ?it/s]

will try encoder q5_k_m as a last ditch attempt

Lannfield

Author

Jul 5, 2025

@maplebag I’m using q5 as well. Make sure you are using gguf clip. I think I ran into same problem, then I just use gguf model + gguf clip. Everything runs. Hope this helps.

maplebag · Jul 5, 2025

@Lannfield getting steps in with q5_k_m both, so you are probably right, however, I realized i changed settings on the resize node, which might have been unintentionally throwing a huuuuge latent into the sampler. i will do more testing but i think this was it. thank you for your suggestions.

maplebag · Jul 5, 2025 · 1 reaction

yes okay i am 90% certain it was the latent.

thank you! I'm excited to use this workflow to experiment with generating videos over 81 frames. i've got it to work in the past but it absolutely killed movement so i thought this would be a good solution. i am having pretty good success so far!

ReGenerated · Jul 6, 2025 · 2 reactions

Really good! I don't use this workflow but have implemented the dual sampler using your settings, with causvid + lightx2v, it's a bit slower than lightx2v alone but the motion is much better, it also give the ability to control speed tweaking the shift + cfg, thanks!

Lannfield

Author

Jul 6, 2025

You’re welcome. That’s my intention; you don’t need to use my wf. (If not I would’ve hidden everything). I’m actually happy that it helps you to understand and implement it in your wf ! :)

skyrimer3d · Jul 6, 2025 · 2 reactions

Amazing wf, extremely complex but also really well organized, tons of options to make it work for your configuration with choice for gguf block swap etc, and also options for frame interpolation upscaling and more, amazing motion too even in longer videos

Lannfield

Author

Jul 6, 2025 · 1 reaction

if you want 16fps, set the interpolation multiplier to 1 and speed to 1.

skyrimer3d · Jul 7, 2025 · 1 reaction

@Lannfield Thanks but that's not what i want, i want to reduce native frames to be able to create a longer vid / get better quality, and then interpolate frames, not reduce interpolated frames which are fake frames. I found that it's controlled with the "Frames per Seconds per Interpo Multiplier", so i set a*12, then frame interpolate to 24 fps, which is the usual TV framerate and pretty acceptable visually, so that's what i usually use. In any case, thanks for this amazing wf, i'm going to play with all settings and see what works best or not ;)

niccc · Jul 8, 2025 · 13 reactions

You must be autistic pal, you bloated it with ton of useless code to calculate most petty shit there is , clean this shit up. I am able to do the same thing with 3 times less nodes, you are ridiculous.

Do it smart way - use only comfy native nodes and wan nodes and get it to fit on 1080p monitor at 16:9 , the rest keep for yourself.I have all nodes installed but i pity others who will open it and get ton of red nodes cause you go aaaaaaaallllll around the world to get most bizarre nodes to calculate stupidest shit - for example upscale multiplier MY ASS.Pal we can math.

aniyaqinqin · Jul 9, 2025

Agreed

01hessiangranola851 · Jul 10, 2025

You seem nice.

cukurpapiks383 · Jul 10, 2025

I fucking love the workflow and all the descriptions. It is very beginner friendly!

solxrac781 · Jul 8, 2025 · 1 reaction

WanImageToVideo

module 'tensorflow' has no attribute 'Tensor', Help pls

Lannfield

Author

Jul 8, 2025

not sure about this error.

this is the only possible solution i found:

https://www.reddit.com/r/comfyui/comments/1kockfd/attributeerror_module_tensorflow_has_no_attribute/

crombobular · Jul 9, 2025 · 3 reactions

i don't understand the point of this at all. why use this wf which takes 5 minutes on a 3090 when you can just use the base wan model to get the same time? this completely negates using self-forcing in the first place.

blobby99 · Jul 11, 2025

A lot of split sampler and dual sampler workflows seem to have this issue. There may indeed be a benefit to these approaches, but the author's documentation never convinces this is so. But local video AI is a 'can of worms' and one must experiment and try comparison renders between differing workflow approaches.

As to your main point, I find there is a massive psychological impact between 2min and 10min render times, with regard to the desire to experiment with prompts and style LoRAs. However, I do find myself using speedup LoRAs (and models) in increasingly different ways.

PS complex workflows usually need the addition of 'model-clear' nodes at appropriate stages to avoid OOM. I only mention this because speedup methods encourage second stages in workflows (say using v2v for upscaling) and default ComfyUI has no sane memory management to speak of.

Lannfield

Author

Jul 12, 2025

@crombobular tldr: tradeoff some generation speed to get more motion.

Perhaps I didn't make it clear in my documentation. Without CausVid (CV) and Self-Forcing (SF), dual sampling would use 10-15 steps first to get the latent and then refine it with another 8-10 steps (rough estimate, not sure about the ratio) while using high CFG. This WF shrinks that process down to just 8 steps (compared to 25-30 steps for dual/single sampling) with the use of the speed gen loras.

CV and SF play an important role as we are using very few steps. CV (5 steps) creates more motion and also helps with generation speed, while SF (3 steps) is better for generation speed and natural motion but suffers from less motion.

It takes slightly longer because we are using a higher CFG in the first 5 steps. With SF alone using a single sampler, CFG needs to be 1 or it will generate weird effects. When CFG is 1, and especially when using it with other loras, the motion is sometimes hindered. That's why this WF is a proposed solution to that problem. (Until someone comes up with a better solution.)

If motion is not your main concern, then this approach is not needed.

crombobular · Jul 12, 2025 · 1 reaction

@Lannfield got around to play a bit more with it. it definitely takes longer but movement is largely improved which is good for most nsfw loras. with my usual wan workflow i get ~70s per gen, with this workflow i get ~260s with the bonus of added motion. so depending on what you are trying to gen this is worth it if you need that motion

Kevmon · Jul 9, 2025 · 4 reactions

This workflow is amazing. But i do have some issues i am trying to figure out. Right now when i get to "1st Pass KSampler for Movement Latent" it takes almost 2 hours per step. I'm just assuming i have done something wrong. Any ideas?

Lannfield

Author

Jul 10, 2025 · 1 reaction

Not sure what you did. If you are using GGUF, make sure you are using both the GGUF model and Clip together. Try lowering your image dimension. Try not to go over 5 seconds for video length. Hope it helps.

Kevmon · Jul 10, 2025 · 1 reaction

@Lannfield GGUF Ahhhh. I wasn't using it. Is it Faster? I pretty much redownloaded the workflow. and ran it Default it was taking forever again so i did what you said and lowered the Scale by Width to 480p from 720p. When i was on 720 it was using 23g for nvram. it was really hogging it all up. At 480p its only using 19g and it seems to be moving now.

Lannfield

Author

Jul 10, 2025 · 1 reaction

@Kevmon If you are using more vram, you can try adjusting the block swap to 40. GGUFs are good if you have lower vram. They also help with the initial loading of models.

blobby99 · Jul 11, 2025 · 1 reaction

OOM issues cause collapse in iteration times. They are NEVER down to the model being too large, but the lousy code in ComfyUI idiotically using VRAM to 'store' model data. Models are LINEAR-ACCESS concepts, which for video AI means they should never ever be held in VRAM, but streamed from RAM across the PCIe GPU bus as needed each iteration (your RAM should be at least 40GB/s, your PCIe bus at least 16GB/s- more than enough for flawless streaming with near zero VRAM impact). In practice, this means inserting model unload nodes at various positions in the workflow.

Also, on my 16GB card I use "no smart memory" option to 'help' ComfyUI get it right. And I get BETTER render times (by a little) from non-GGUF models (which are larger) because they cost less to get to the data contained within.

Kevmon · Jul 11, 2025

@blobby99 i have 24gb of nvram and when i try to render the image at 720p with the workflow it gets hung up on wanimagetovideo node then after that slow on the KSamplers i would say like a hour per step..... but if i drop it to 512 it finishes in like 11min.

The no smart memory is that the command "--disable-smart-memory" by any chance ? would that even help me. my situation is im already kinda rigging background stuff running Comfyui Zluda because the whole AMD Gpu specs are below.

9800x3d 64gb 6000hz ram
7900xtx

Lannfield

Author

Jul 12, 2025 · 1 reaction

@Kevmon I'm not sure what the equivalent is between different brands of GPU. Mine is a 3090 Ti 24gb. I use 40 block swap along with GGUF model+Clip and my image dimension is 720x960, which allows me to push for higher resolution and takes me around 10min per generation.

Example Scale by width:

Image: 720 x 960

Input width: 480

Result Video: 480 x 640

This keeps proportion. So I'm not sure what is the height value of your image. To slightly speed things up, you can also try lowering down Start CFG to 3.

And also sage attention + torch compile helps.

Kevmon · Jul 12, 2025 · 1 reaction

@Lannfield Thanks a ton with everyone's help i am getting much better times. Next is getting characters to move properly.

Workflows

Wan Video 14B i2v 480p

by Lannfield
