This is the video that I made for LTX 2.
Check it out, and if you want model links and everything else, check out my Discord: https://discord.com/channels/1350055016987365436/1457085486056214598/1458455052871143434
Comments (51)
What's your take on LTX 2, Better than Wan 2.2?
15 sec+ length, high resolution, and fast generation. Voice from text is possible.
It's kinda amazing for the reasons already mentioned. That being said - I'm having more issues with prompt adherence with it than I do with Wan 2.2. I'm early into my experimenting with it but if I can address the prompt adherence stuff (and if good Loras are possible) then I will be ditching Wan 2.2. Mostly because of the speed improvement. I don't necessarily find the video quality that much different in FP8. I also really like the audio generation being integrated into a single pass.
No...
I think this has the potential to be better than Wan 2.2. POTENTIAL :D
I think it's amazing except for the censorship, which someone will probably fix soon by releasing an NSFW checkpoint. A newer and better model could come out next week though, which seems to happen faster and faster these days. It feels hard to invest in a model when it often becomes obsolete after a week.
I run out of memory. Any way to make this work on a GPU with 16GB memory and offload to System memory?
@skeetfontenot496 check the other comments claiming that pagefile.sys will do the trick. I have no doubt it will work. I just didn't try it since I've got 192GB RAM.
@Prefo try asking ChatGPT for a motion prompt followed by the action you want. This fixed it for me.
Anybody try NSFW content generation? Asking for a friend 😉
Hi everybody, I'm the friend ...
I'll follow this model, but without the "right" kind of loras it's a non-starter for me.
Like all of these models, it is trained on poisoned data to ensure censorship. High VRAM costs will likely prevent much LoRA training (hope I am wrong). The default ComfyUI template gave an out-of-memory error on my 3090 at default settings, hence I'm looking for a new workflow.
I've just tried a simple NSFW i2v with just some skin. The first 1-2 frames are your NSFW image and it instantly fades into a simple brown background, so it's lobotomised af. Don't waste your time and just use Wan instead.
EDIT: I was so wrong, another test generated a perfect NSFW (RIP Wan2.2): https://civitai.com/images/116853218
@bnzarev821 you won't get any NSFW output without LoRAs, but LTX2 definitely is not a waste of time. See my profile for the latest uploads. Not perfect, but something you won't pull off that easily with Wan. (Go fullscreen for audio.)
@AI_man2025 works fine on a 3060 for me, so you should have absolutely no problems with a 3090. Just make sure you have set enough virtual memory (google "pagefile.sys" for more info). Mine is 200GB, but 300GB is even better if you have enough disk space.
@bnzarev821 I'm using the default ComfyUI LTX2 template plus the 40GB safetensors file (RTX 4090 + 192GB RAM). Give it a try.
It's a massive memory hog and Distorch is out of the question. Fortunately ComfyUI has recently improved offloading to the point where it's usable on Linux. The text encoder is also awful (like Qwen's), and to be usable it will need at least the fp8 abliterated version, if not a functional Q5. It's still too early to say.
I should add that the 4-step upscale pass makes it look much, much worse. I am getting much better results just by doing everything at full res in one pass.
@janssenbutton your videos are very good, the ones with sound too. BUT, can you make vids where she's sucking a wang or are you relegated to just light fondling? Asking for me. Heh
@MMOFan just like for wan back then, we need to wait for the amazing people training loras for this concept.
@bnzarev821 what did you change? It lets me get pretty close, but for anything nude at all it sucks. I'm using a custom character LoRA too.
@Artban69 I used Kijai's "distilled" model "ltx-2-19b-distilled-fp8_transformer_only.safetensors" instead of the original "dev". You could also try "ltx-2-19b-distilled_Q8_0.gguf": https://huggingface.co/Kijai/LTXV2_comfy/tree/main/diffusion_models
Thanks for the workflow, the generation speed itself is incredible!
This workflow seems to produce worse outputs than the official workflow
I could not get the text-to-video one to work. I need to look at that more, but it looks like it is set up with a specific scene in mind and thus specific LoRAs etc.
The image-to-video one goes out of memory right at the start on my 3090, on Linux... I think this workflow is for 5090 only?
You can fix this by changing the height / width, lowering the length, or lowering the fps to 16.
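To get a feel for how much each of those knobs helps, here is a rough Python sketch that compares total pixel volume (width × height × frames) between two settings. This is only a crude proxy for workload, not a VRAM estimate, and the function name and the example resolutions and frame counts are made up for illustration.

```python
def pixel_volume(width: int, height: int, frames: int) -> int:
    """Total pixels across the whole clip: a crude workload proxy, not a VRAM estimate."""
    return width * height * frames


# Hypothetical example settings; substitute the values from your own workflow.
base = pixel_volume(1280, 720, 481)
smaller = pixel_volume(960, 544, 241)

print(f"smaller settings are {smaller / base:.0%} of the base pixel volume")
```

Note that dropping from 24 fps to 16 fps for the same duration cuts the frame count by about a third, so it shrinks the volume the same way a shorter length does.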
How do you create longer videos?
Change the length by increasing it from 480 (19s video). Put a screenshot showing the length and fps in chatgpt then ask it to calculate the length for 10s, 15s, etc.
If you have enough RAM + VRAM, for 15 seconds type 15*24+1 in the frame int node.
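The seconds × fps + 1 arithmetic from the reply above can be sketched in Python, so you don't need to paste screenshots into ChatGPT. The function name `frames_for` is made up for illustration, and 24 fps is just the rate used in that example; pass whatever fps your workflow is set to.

```python
def frames_for(seconds: float, fps: int = 24) -> int:
    """Frame count for a clip, using the seconds * fps + 1 rule quoted above."""
    return round(seconds * fps) + 1


# Print the value to type into the frame int node for a few durations.
for s in (10, 15, 20):
    print(f"{s}s @ 24fps -> {frames_for(s)} frames")
```

For 16 fps, call it as `frames_for(15, fps=16)`.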
@Protagonist_NL for an RTX 5090, what is the longest video you can do at 1280x720 @ 16fps?
@grasshopper85116 I could render a 2 minute video, but I never do because in real filmmaking there are seldom scenes that last that long.
@Protagonist_NL somehow I managed to render a 20 second LTX2 2560x1408 video @ 16fps in 13 minutes with my RTX 5090.
The sharpness of the output was amazing!! But after the generation my PC became almost unresponsive until I restarted it.
@grasshopper85116 20 sec is the sweet spot. If you go past that then you'll see some weird shit.
thanks for the workflow!
[enforce fail at alloc_cpu.cpp:121] data. DefaultCPUAllocator: not enough memory: you tried to allocate 146819089374594048 bytes.
How do I fix this error?
Change the length by lowering from 480 (19s video) or changing the height / width or change to 16fps. Put a screenshot showing the length and fps in chatgpt then ask it to calculate the length for 10s, 15s, etc.
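To put the number in that error message in perspective, here is a quick unit conversion in plain Python (nothing ComfyUI-specific). The log does not say what produced a request of this size, but it is far beyond any desktop's RAM plus pagefile, so adding memory alone will not satisfy it. Lowering the length or resolution as suggested above is the thing to try.

```python
requested_bytes = 146819089374594048  # number copied from the error message above

# 1 PiB = 2**50 bytes
requested_pib = requested_bytes / 2**50

print(f"{requested_pib:.1f} PiB requested")
```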
How do I solve this error? Or can my PC not handle it? I'm on a 3060 with 12GB VRAM and 16GB RAM.
VAE load device: cuda:0, offload device: cpu, dtype: torch.bfloat16
no CLIP/text encoder weights in checkpoint, the text encoder model will not be loaded.
Requested to load VideoVAE
loaded completely; 9825.80 MB usable, 2331.69 MB loaded, full load: True
E:\Comfy>echo If you see this and ComfyUI did not start try updating your Nvidia Drivers to the latest.
If you see this and ComfyUI did not start try updating your Nvidia Drivers to the latest.
E:\Comfy>pause
Press any key to continue . . .
I'm waiting for a GGUF checkpoint and GGUF text encoder combo. Tried doing it myself, but got an error.
@lotu5 i said gguf text encoder combo...
I have a 4090 with 24GB VRAM and 32GB RAM, and this workflow keeps crashing my entire PC. How do you all manage to generate with this?!
You need at least 32GB VRAM for the full model. Just buy an RTX Pro 6000 or use RunPod or something similar.
run the ggufs
@ExcrutiatinglyExplicit Or use ggufs or other methods to reduce vram usage within reasonable limits? XD
@Latent_Dreamscape NEVER
@ExcrutiatinglyExplicit "just buy an rtx pro 6000", "if you're homeless, just buy a house"
How do you run this with a GGUF? The model loader only accepts full checkpoints; there's no UNet loader that also does the LTX audio VAE.
They always talk. It looks so stupid 😆
The stage1 sampler node has "output" and "denoised_output" and the "output" is connected to stage2. Is this intentional?