NOTICE ~Make sure to download the correct version for the tool you use 😉~ NOTICE
Wan Self forcing Rank 16
Other versions of this have been uploaded, but those are less compatible with other LoRAs, and their internal structure prevented me from using them with Blissful Tuner (my advanced, extended Musubi Tuner: https://github.com/Sarania/blissful-tuner/ ). This is a Rank 16 extraction of https://huggingface.co/lightx2v/Wan2.1-T2V-14B-StepDistill-CfgDistill for use with a compatible sampler, enabling high quality 4-to-8-step diffusion with Wan architecture models. Both a Blissful-compatible version and a Comfy-compatible version are provided. They target only the minimum parameters necessary, to further improve compatibility.
Instructions for use:
You need a compatible sampler such as LCM, flow_shift = 8.0, and guidance_scale/CFG = 1.0 - this means no negative prompt (Edit: see the comment below about how to get better quality plus a negative prompt). In exchange you get exceptional quality, quick video diffusion with pretty much any Wan-arch model - I've tried it with T2V, I2V, and Skyreels V2 myself and it works perfectly all around! It also works with the other LoRAs I've tried - the kissy showcase video here was made with my AmorousLesbianKisses model and the four wheeler video was made with https://civarchive.com/models/1698719/high-speed-dynamic?modelVersionId=1922492 ! To offset the lack of CFG/negative prompt, you can use prompt weighting. I've also had luck using CFG on just the first step with a low value like 1.6 if you REALLY need that negative.
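The recommended baseline settings above can be collected into a small sanity-check sketch. The dict and key names here are purely illustrative (they are not a real ComfyUI or Blissful Tuner API) - the point is just which knobs matter and why:

```python
# Baseline settings for this LoRA, as described above.
# Keys are hypothetical names, not an actual API.
settings = {
    "sampler": "lcm",       # must be a distillation-compatible sampler
    "flow_shift": 8.0,      # scheduler shift
    "guidance_scale": 1.0,  # CFG 1.0 => negative prompt is ignored
    "steps_t2v": 6,         # author's preference for text-to-video
    "steps_i2v": 4,         # author's preference for image-to-video
    "lora_strength": 1.0,
}

def validate(s):
    """Sanity-check that the speedup preconditions hold."""
    # CFG > 1 roughly doubles compute per step; only raise it if you
    # truly need a negative prompt (see the usage-update comment).
    assert s["guidance_scale"] == 1.0 or s.get("allow_cfg"), \
        "set allow_cfg explicitly if you want CFG > 1"
    # The distilled base model targets 4-8 steps.
    assert 4 <= s["steps_t2v"] <= 8, "distilled model targets 4-8 steps"
    assert 4 <= s["steps_i2v"] <= 8, "distilled model targets 4-8 steps"
    return True
```

Running `validate(settings)` before a generation catches the two mistakes most often reported in the comments below: a non-LCM-style sampler config left at its defaults, or CFG left above 1.0.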
I like to use 6 steps for T2V and 4 for I2V. The showcase videos (832x1104@81f) were made in about 5 minutes on my 4070 Ti Super and then VFI'd from 16 fps to 32 fps in another couple of minutes. Please note that I did not train the distilled model; credit for that goes to lightx2v - thanks to them for releasing their model under permissive terms! I've been waiting for a distilled Wan that was up to my quality standards, and this is it. IMO it's superior to others such as CausVid/AccVid. As mentioned above, I've had success not only with T2V but also I2V and even Skyreels V2, so it's likely most any Wan-derived model would be compatible.
Please note that, as mentioned, versions of this have been uploaded before. However, I felt it was still worth posting these due to the compatibility issues mentioned above, and because my extraction is smaller and more targeted for better compatibility with other LoRAs.
P.S. If you want to extract your own LoRA from Wan or other diffusion models, the script I used is one of the many extras in Blissful Tuner https://github.com/Sarania/blissful-tuner/blob/main/src/blissful_tuner/extract_lora.py
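The core idea behind LoRA extraction is a truncated SVD of the weight delta between the tuned model and the base model. Here is a minimal numpy sketch of that idea for a single linear layer - this is my own illustration, not the actual Blissful Tuner `extract_lora.py` implementation, which additionally handles dtypes, alpha scaling, and iterating over every targeted layer:

```python
import numpy as np

def extract_lora(base_w, tuned_w, rank=16):
    """Low-rank approximation of (tuned - base) via truncated SVD.

    Returns (down, up) such that base_w + up @ down ~= tuned_w.
    """
    delta = tuned_w - base_w  # what the finetune actually changed
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    # Keep only the top `rank` singular directions of the delta.
    up = u[:, :rank] * s[:rank]   # shape (out_features, rank)
    down = vt[:rank, :]           # shape (rank, in_features)
    return down, up

# Tiny demo: a rank-2 delta is recovered (numerically) exactly
# by any extraction rank >= 2.
rng = np.random.default_rng(0)
base = rng.standard_normal((64, 32))
true_delta = rng.standard_normal((64, 2)) @ rng.standard_normal((2, 32))
down, up = extract_lora(base, base + true_delta, rank=16)
err = np.abs((up @ down) - true_delta).max()
```

The rank parameter trades file size against fidelity: rank 16 keeps only the 16 strongest directions of the delta, which is why this extraction is smaller than the rank 32 uploads while still capturing what the distillation changed.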
Comments (58)
Is this different from any of the files here?
Different LoRA
That's what I'm wondering, although the poster says this one is more targeted to work with other LoRAs, so it's definitely worth a try!
Yes, as I mentioned, that LoRA is Rank 32 and affects more structures (embedding, head, etc.) inside the model, so it tends to play less well with other models and isn't compatible with certain software (e.g. my own, for instance). If you use Comfy, the main reason you might try this one is that it should be more compatible with other LoRAs. It's also saved in full precision to maximize its accuracy per rank (hence its size), whereas the other is in float16. But it's extracted from the same base model as the "self forcing" one linked. Feel free to use whatever works best for you!
@blyss thank you mate, im about to try it.
Early signs: this LoRA is a vast improvement working with other LoRAs! Thank you so much for sharing with us @blyss 🐐
@compo6628585 I'm glad to hear that! You're very welcome, please enjoy!
@blyss Any workflow examples? Not faster for me, and my generations are poor quality with your LoRA. I use ComfyUI.
@blyss Nice, eager to try it out
@Mikebobby I don't have a workflow to share, as I don't use Comfy myself anymore, so I tested with just the default workflows. But as mentioned in the description you MUST use a compatible sampler/scheduler, such as plain "LCM" - regular samplers won't work well and can produce awful results. You also need to set CFG/guidance to 1.0. Not doing either of those would cause what you are reporting, so those are my best guesses as to why you are having issues.
What lora strength is recommended?
1.0 !
Does it work with FusionX merges?
Edit: From looking at that model, if it's already an accelerated 4-to-8-step model then there would be no benefit.
@blyss If you use the Ingredients WF you can swap out CausVid with this LoRA
@vrgamedevgirl Oh that's cool, I haven't used Comfy myself in a while so I'm not familiar with what all workflows are available. Thanks for sharing!
Can this be used on Pinokio for Wan? Or is it meant for ComfyUI?
One of them should work, but I'm not sure which of the two predominant LoRA formats Pinokio uses. I would try the Comfy one first (it's in the format Diffusion-pipe makes, which is the most common here on CivitAI); if that doesn't work, you can try the Blissful one, which is in Musubi Tuner format.
Edit: I say that, but I've not used Pinokio myself. As long as it has an LCM sampler you should be good to go. Otherwise, you might request one!
This is incredible! I was doing 480x288 clips before because of the not-so-long generation time (about 140-180 sec); now with this LoRA and a couple of new nodes I'm doing 848x540 clips with the same waiting time! And it seems like a lot better quality than when I was generating higher-resolution clips before. Sick!
I'm really glad to hear it works well for you! It was the same experience for me and when I realized the lower rank version I'd extracted was more compatible with other LoRA I just had to share. Please enjoy!
Thank you! This is amazing! Using in comfy and the result is perfect and fast!
Can you share a workflow/metadata please? Thank you!
I can't really, as I don't use ComfyUI anymore for video diffusion. For my test of the Comfy version I just used the example WanVideoWrapper T2V workflow: delete the TeaCache and Enhance-A-Video nodes (not tested, likely not compatible), set the scheduler to LCM, shift to 8.0, CFG to 1, steps to 4-8, then add a "WanVideo Lora Select" node, connect it, and select this LoRA. That's what I did. Sorry! Perhaps someone else can?
Any basic workflow will work, even the ones from the ComfyUI examples. Just add this as a LoRA with Load LoRA (Model Only) and that's it.
I am far from the most experienced, but this is remarkable. The quality in half the time or even less.
Excellent!
After some significant testing, I can conclude this LoRA is currently the best of all the accelerators out atm. It is amazing at keeping the quality at a very good level with minimal steps. BUT... like all the others, it still has motion issues - not as much though - and it plays better with some other LoRAs.
I'm still finding that using a two-sampler system negates the motion issue, but obviously this impedes the speed of generation. Still significantly quicker than base Wan, and worthy of any WF. Thanks for sharing 👍
++++
I am also impressed with the quality and the generation speed. However, I have observed that there is frequently a lack of movement in many scenes, where the shot tends to be static.
How do you set up a two-sampler system? Are you doubling all of the guider/sampler/sigmas/noise nodes and putting the denoised output of the first into latent_image of the second? And do I need to put this LoRA on the second? With what parameters? cfg/steps etc.? Thanks
@kalamba I simply added a second KSampler and connected it like the first one, then connected the first to the second through the latent image, then the second to VAE decode. I'm not a WF guy, more of a bodge-it-together-from-reading-other-people's-comments type. You may have much more luck looking at the comments in this LoRA's section: Self-Forcing / CausVid / AccVid Lora
@compo6628585 Do you find that with a second KSampler you lose a lot of detail from the initial image, like the face and such - that is, if you are doing I2V?
@slaad0 No, I don't have those problems with consistency of characters. I still do have problems with motion with certain LoRAs though, even using two samplers 😢
What base model are you using? Wan2.1 T2V 14B fp16, fp8, or a GGUF version?
To extract or inference? In either case I use Wan2.1 T2V 14B in mixed fp16/fp32 precision myself. My actual inferencing is done in fp8_scaled (Blissful Tuner loads the base model in mixed precision, merges any LoRAs requested, then scales all the fp16 weights to fp8 using a scaled quantization. The fp32 stuff remains fp32, as these are the norm, bias, and head layers that only represent a fraction of the model's footprint but benefit significantly from full precision!) This allows me to attain maximum quality on my 16GB of VRAM!
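The per-tensor scaled-quantization idea described here can be sketched in a few lines. This is my own numpy illustration, not the actual Blissful Tuner code, and since numpy has no fp8 dtype the "fp8 cast" is simulated with float16 - the scaling logic is the point: pick a scale so the largest weight maps to fp8's maximum representable value, divide, cast, and keep the scale around to multiply back at compute time.

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite value in the e4m3 fp8 format

def quantize_scaled(w, qmax=FP8_E4M3_MAX):
    """Per-tensor scaled quantization.

    Returns (q, scale) with w ~= q * scale. The float16 cast below is a
    stand-in for the real fp8 cast (numpy has no fp8 dtype).
    """
    scale = np.abs(w).max() / qmax      # map the largest weight to qmax
    q = (w / scale).astype(np.float16)  # stand-in for the fp8 cast
    return q, np.float32(scale)

def dequantize(q, scale):
    # Upcast before multiplying, mirroring the upcasting tricks
    # mentioned elsewhere on this page.
    return q.astype(np.float32) * scale

rng = np.random.default_rng(1)
w = (rng.standard_normal(4096) * 0.02).astype(np.float32)
q, scale = quantize_scaled(w)
w_hat = dequantize(q, scale)
rel_err = np.abs(w_hat - w).max() / np.abs(w).max()
```

Without the scale, typical diffusion weights (magnitudes well below 1) would waste most of fp8's dynamic range; dividing by the scale first spreads them across the full representable range, which is where the accuracy benefit of "fp8_scaled" over a naive fp8 cast comes from.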
Have you looked into NAG to restore the negative prompt?
I've been meaning to try it, but so far Self Forcing's motion has been an issue, and if the prompt describes something the model doesn't love as much as something else, then CFG=1 hurts.
For example, a "Thai woman" will lose a ton of her Thai-ness at CFG=1 compared to CFG=4, even if you weight it heavily.
You really need CFG=4 plus a speedup.
Yeah, I've run into a few things like that - for instance, making night/dark outputs that are truly dark is an issue. I've tried a few methods of getting CFG with Self Forcing, but nothing has really worked well. That said, it works in many more situations than it doesn't, and the quality of outputs at 720p is super, super nice. The benefits more than outweigh the drawbacks IMO!
@blyss If you haven't, try NAG (maybe wrapper over native, not sure)!
Use a double-sampler workflow with CausVid first and lightx2v second; you can keep the CFG that way and even control the motion intensity. It's a bit slower, but has much better motion. Example: https://civitai.com/models/1622023?modelVersionId=1938620
I remember you from the Musubi git issues. Looks like you've been busy enough!
Any chance of getting the example workflow from the showcase videos? My T2V has never been that good and I can't see why.
Haha, yeah, I've been deep into the mess with generative video! Unfortunately, as I mentioned in another comment, I don't really use Comfy anymore for video gen (I never could get the quality I wanted, plus it's not as easy to tinker with), and I tested my LoRA just in a default workflow. I personally use my extended Musubi, Blissful Tuner, and that's what the showcase videos were done with. It's CLI, but that's fine for me personally, and I've added a lot of stuff to squeeze out the max quality I can on lower-end hardware: things like a mixed-precision transformer (weights kept in fp16 and then scaled to fp8 at inference time, while bias/norm/head/etc. params are kept in full fp32 always), tweaks to eke all the accuracy I can out of fp8_scaled by upcasting the quantization calculations and scale weight, upcasting linear transforms for multiply headroom, etc. I'm honestly astonished and super pleased at the quality I'm getting these days! Especially with this accel LoRA, which makes full 720p gens a sub-five-minute affair, with improved structural quality at that res!
Sorry I can't help though!
I can confirm this is amazing. Used it in the Ingredients WF instead of the CausVid LoRA; it has great image quality and can produce 10-sec vids at 848x480 with good motion on my 16 GB VRAM card within minutes. Fantastic acceleration LoRA.
Damn! Works really well!
For I2V it's working better than CausVid / Self-Forcing (lightx2v) / etc.
In comparison it's just perfect: great quality, great movement.
Tested with these parameters (ComfyUI version):
LoRA strength = 1.0
steps = 8, cfg = 1-3.5 (!!!), sampler = euler_a, scheduler = simple, shift = 8
@DRZ3000 Can you share the creations you mentioned above using this LoRA?
Just want to ask a dumb question...
Is it possible to use the native Wan2.1 workflow with Kijai's version of the Wan2.1_14B_I2V model and this Self Forcing LoRA?
Does anyone have suggestions on using this with other LoRAs? I never get anywhere near the normal output of the other LoRA as it's meant to perform while I'm using this. I suspect it's because to use Self Forcing you have to use CFG 1. I've been getting closer results and no visual issues using CFG 1.5. I've increased the other LoRA to a higher weight and it's still not working. I am using the proper prompt for the other LoRA, just to be clear.
First I loaded the usual LoRA with its influence increased from 0.35 to 0.5, and then the accelerator LoRA, and everything worked out.
@devold5000 Hi! Sorry, which LoRA influence are you referring to? When you say installed, do you mean used in a workflow? I appreciate your help, but I don't follow how this answers my question. Thank you regardless.
@woodenpickle Please see my comment about updated usage for better performance overall. Other than that, every LoRA I've tried just works, quite well - I'm currently running it with three other LoRAs at the same time. Is there a particular one you are having issues with?
@blyss If you don't mind me sticking my nose in, blyss, these are the LoRAs I'm having problems with. It's a shame, as they're two of my favs - possibly because of the way they're trained (made by the same team)?
https://civitai.com/models/1428098/rev3rse-c0wg1rl-wan-21-t2v-and-i2v-lora
https://civitai.com/models/1395313/wan-dr34mjob-doublesinglehandy-blowjob
Usage update: As suggested by reviewer @qdr1en - "I got better results though by lowering the LoRa strength to 0.8 + increasing the CFG to 1.3-1.4 + selecting dpm2pp as sampler."
I can confirm that lowering the strength to 0.8, using a dpm sampler (I've personally used dpmsolver++ and dpmsolver++ sde), and optionally adding a small CFG value can noticeably boost quality and enable a negative prompt (of course, CFG will increase gen time). Adaptive guidance can help too if you want to use CFG. Thanks, qdr1en and others! Hope it helps, cheers!
Thanks! With that suggestion I can finally generate with fewer than 12 steps and it looks good.
Are dpmsolver++ and dpmsolver++ sde just dpmpp and dpmpp_sde?
Well, unfortunately, on my 4060 Ti 16 GB this new method doesn't work. Generating a 5-sec, 16 fps video at 640x832 resolution still takes approximately 30 mins. Only the main method from the instructions works for me.
I can't see any improvement over the original LoRA. Same generation time, slightly worse quality, and the same movement/compatibility with other LoRAs. I don't know if I'm doing something wrong.
I have to agree; I don't see much improvement at all, other than occasionally a random motion moment may happen... which is great, but could easily be down to random seeding. As I previously said, I feel that although this LoRA is a massive improvement in terms of quality-to-speed generation, it just doesn't play well with some other LoRAs. The ones it works well with are great though.
By using "Wan21_T2V_14B_lightx2v_cfg_step_distill_lora_rank32", I get up to 4 hours of rendering (720p) using a 6 GB GTX card (it works, but at 7 fps). Then I use Topaz Video Enhancer AI (Chronos) to double or triple the FPS.
But... by using your "wan_lcm_r16_fp32_blissful" (weight 0.8), it's taking 12 hours or more.
loaded partially 128.0 127.998046875 0
Attempting to release mmap (725)
50%|██████████████████████████████████████▌ | 2/4 [6:22:08<6:22:12, 11466.34s/it]
Are you seriously waiting 12+ hours for a few seconds of video? I feel like even a tree would get impatient here. And what could possibly make that amount of time and energy worth it? Plug in the computer through a Kill A Watt meter. It has a little indicator that turns on when you've spent more on power than a decent GPU costs. Depending on where you live, you might not have to wait 12 hours before it starts blinking.
This unique Lora surprisingly works on non-standard checkpoints where neither Caus nor Light2V succeeds. I am extremely grateful to you for this Lora.
Not just this great LoRA, but Blissful Tuner! Next-level Musubi Tuner spin-off! You are some kind of mad-scientist super genius! Appreciate you, amigo! Can't wait to see what Frankengen creation you come up with next!