IF YOU ARE WONDERING WHAT V2 IS, OR YOU'RE HERE BECAUSE OF MISMATCH ERRORS AFTER UPDATING COMFY: THE FIX IS IN!
UPDATE KJNODES AND COMFY!!!!!
WE NOW PUT THE EMBEDDINGS INTO BOTH THE MODEL LOADER AND THE CLIP, BUT IT IS ONLY LOADED ONCE! IT DOES NOT USE MORE MEMORY!
THIS WAS A NECESSARY CHANGE AFTER COMFY REORGANIZED WHERE THE EMBEDDING MODEL LOADS. THE UPDATE IS REQUIRED!
PLEASE TAKE NOTE OF THE NEW DEV+LORA COMBO! WE NOW USE THE FP4 GEMMA TEXT ENCODER!!! WE NOW HAVE PREVIEWS USING THE TINY VAE!!! CHECK ALL YOUR MODELS!!! DO EEEEEEET!
WE ALSO HAVE THE LORAS SET UP CORRECTLY AND THERE ARE SOME FUN ONES OUT ALREADY! THE NODE IS READY TO GO FOR YOU!
5 TOTAL GGUF 12GB WORKFLOWS!
t2v, i2v, v2v extend, ta2v, ia2v!
Hello everyone! This workflow has come a long way since 1.0. It doesn't seem like it at first glance, but boy, this has been a project for me!
Here we have quite a few workflows for LTX-2 using GGUF, running on at least 12GB VRAM and 48GB system RAM.
First we have your typical t2v and i2v workflows.
Second, we now have two new audio-driven workflows! One is ta2v: supply a text prompt ONLY plus some audio and get a neat generated video set to your audio! The other is ia2v: supply an image and an audio file and it lip-syncs nicely. I tried to keep everything as simple as possible.
Then there's the one I like the most: v2v extend. Feed LTX-2 a few seconds of video, write a prompt to continue the video, and watch the magic happen!!
Now that the workflows are done I still need to get all the info out there, but I wanted to get these into the wild so everyone can start having fun with them!
I HAVE CREATED TWO ENHANCEMENT NODES FOR THE AUDIO!!
YOU WILL NOTICE 2 NEW NODES TOWARD THE END OF THE WORKFLOW FOR AUDIO ENHANCEMENT. CLICK THE BLUE LINK BELOW FOR MY GITHUB PAGE, INSTALLATION INSTRUCTIONS, AND USAGE NOTES!
URABEWE-COMFYUI-AUDIOTOOLS
Description
V2V Extend! Take your favorite movie scenes and make them how you want them!
Comments
Video extending is crazy... LTX-2 can do movement that was never in the source video. It's like real-time training from a video combined with voice cloning... magic.
It really is... I have been changing scenes in movies and making things right. Go back and save your favorite character, don't let Artax die in the swamp; go back and set things the way they used to be and make Han shoot first; or, in my case... make Uma Thurman make a joke about Quentin Tarantino's obsession with feet. I mean, the possibilities are endless!
@Urabewe I picked a video of my ex-girlfriend... a video she sent me years ago, and... yeah, I don't want to go into details... this tech is crazy... we're living in a good time... thank you so much for your workflows.
I don't have v2v_mode in the DSRE node even though I updated everything and installed the missing nodes. I don't know what I'm doing wrong. Also, taeltx_2.safetensors won't work for me; it gives 'mismatch' errors.
Other than that, those are all amazing workflows. Thanks!
Yeah, I'm not sure. Sometimes when a node gets an update you may need to delete the node out of the workflow and then put it back in. For the previews: update the GGUF nodes and the KJNodes and see if that helps. Also make sure the tae is in "vae_approx" when using it; it won't show up until you update everything, and it will give that mismatch error if GGUF and KJNodes are not updated.
Yep, I get a VAE error too. It's in the right folder and it's selected.
OK, I deleted everything to do with the previews and brought in the sampler from the default i2v workflow. It now does create an output; however, the output is all garbled.
@mikerotchburns1978885 Okay, so lots of people are having problems with previews. Gotta make sure GGUF, KJNodes, and Comfy are all up to date and on the main branch. Bringing in the sampler from another workflow probably messed things up; reload the workflow from the json and don't delete anything.
The VAE node looks in both vae and vae_approx. Support for the LTX tiny VAE is new, and if it isn't working that means your stuff is not fully updated. The workflow is fine; it is your Comfy that is keeping the node from loading the LTX tiny VAE. It is also not needed to make videos.
Bypass the VAE loader for the tiny VAE, bypass the ltxpreviewoverride node as well, and you should be able to get a video.
Also, for the audio nodes: update that node pack and you will get the v2v label. The node was fully functional, but v2v was still labeled "auto" because I uploaded the wrong file to the repo; that has been fixed, so it now shows v2v as it should.
The tiny VAE for the previews is new; support was added not long ago. You have to make sure that GGUF, KJNodes, and Comfy are all fully up to date and on the main branch to get the previews to work. And yes, the VAE goes in vae_approx.
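If you'd rather update from the command line, something like the following should work; the paths assume a default git install of ComfyUI with the usual repo folder names, so adjust to your setup:

    cd ComfyUI
    git pull                                   # update ComfyUI itself
    git -C custom_nodes/ComfyUI-KJNodes pull   # update KJNodes
    git -C custom_nodes/ComfyUI-GGUF pull      # update the GGUF nodes
    # the tiny VAE then belongs at:
    #   ComfyUI/models/vae_approx/taeltx_2.safetensors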
The Load VAE node does not see taeltx_2.safetensors in vae_approx.
@Urabewe Yes, I did bypass the preview. The video fully outputs; however, it's just the input image, still, for the whole duration. Is this just my prompt?
@mikerotchburns1978885 Nope! That's just the fun of LTX-2. I have a LoRA node hooked up with the distill LoRA. The trick with LTX-2 is to put a camera control LoRA in there and set the strength to 0.5. That will fix the static images.
https://huggingface.co/Lightricks/LTX-2-19b-LoRA-Camera-Control-Static/tree/main
@Urabewe yep that fixed it! thanks!
I might be dumb, but where's the option to change the steps? I don't see the scheduler anywhere in the T2V workflow :(
Ah yeah, those are sets of sigmas that represent the steps; they are set for 8 steps and 3 steps. Each number represents a step on the way from 1 to 0. You can change them, or you could replace that with other scheduler/sigma nodes.
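For illustration only (made-up values, not the workflow's tuned numbers): a list of N+1 sigmas descending from 1.0 to 0.0 defines N steps, because each adjacent pair of sigmas is one denoising step. A quick sketch in Python:

    import torch
    # hypothetical 8-step schedule: 9 sigmas from 1.0 down to 0.0
    # (illustrative values, NOT the numbers shipped in the workflow)
    sigmas = torch.tensor([1.0, 0.994, 0.985, 0.970, 0.932, 0.850, 0.700, 0.450, 0.0])
    print(len(sigmas) - 1)  # -> 8 denoising steps

Fewer sigmas means fewer steps; you can paste your own list or swap in a scheduler node that outputs sigmas.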
@Urabewe Came here for the same question! Thanks!
Can't load ComfyUI-AudioTools.
Installed everything (librosa is needed too).
It shows as loaded in the console, but I get "This workflow uses custom nodes you haven't installed yet." when trying to load the WF.
Thanks for the feedback, I have updated the requirements for the nodes now. Try uninstalling and reinstalling the node pack and see if that helps. If not, can you show a screenshot of the popup? Thanks!
@Urabewe Yes, it loads now.
This can help too. From the ComfyUI-Manager troubleshooting guide: go to the ComfyUI-Manager directory and execute this command:

    git update-ref refs/remotes/origin/main a361cc1 && git fetch --all && git pull
I asked Google AI Studio for help. It was a bit messy at first because I had two custom nodes that weren’t working, but after some installing, uninstalling, and reinstalling, we managed to fix everything! I can’t share the chat because it’s in Italian, and also because the solution depends a lot on your setup, Python version, and so on… but feel free to ask Google AI Studio if no one helps you!
@MaximilianPs Already fixed thanks
I don't know what's going on but I've done EVERYTHING correctly as per the install instructions and I'm STILL getting this error every single time.
I'm close to giving up altogether on this workflow unless this issue can be resolved.
@Espilonarge If it's just the AudioTools one, then delete those nodes and you can just gen without them. You will also need to update to the v2 of the workflows, as there was an update to ComfyUI that required a different node to load the model.
With the i2v workflow, I always get 3-4 seconds of the static original image, then it generates the next frames. What do I do? t2v works wonderfully.
Add a static camera LoRA from LTX at 0.5 strength. For whatever reason you sometimes get static frames, and adding a camera LoRA seems to make it work. As long as you don't use the static camera triggers or ask for a steady camera or anything, it should still just make your video fine.
@Urabewe Thanks for your hard work on this workflow. For the static LoRA, I used this one: https://huggingface.co/Lightricks/LTX-2-19b-LoRA-Camera-Control-Static It seemed to work.
@boinobin730
https://huggingface.co/MachineDelusions/LTX-2_Image2Video_Adapter_LoRa/tree/main
Hot off the presses from another user. Supposed to be better than the camera LoRA trick. This one is specifically for i2v/ia2v to fix static frames.
@Urabewe Thanks man. Doing good work.
This is fantastic! I tried the I2V and T2V (version 2) and they worked great. Great job @Urabewe !
I am getting issues with your VAE taeltx_2 for image to video. How do I fix this?
The error is a long one, so I'll post it below.
Error(s) in loading state_dict for TAEHV:
size mismatch for encoder.0.weight: copying a param with shape torch.Size([64, 48, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 3, 3, 3]).
size mismatch for encoder.12.conv.weight: copying a param with shape torch.Size([64, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 1, 1]).
size mismatch for decoder.7.conv.weight: copying a param with shape torch.Size([512, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 256, 1, 1]).
size mismatch for decoder.22.weight: copying a param with shape torch.Size([48, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([3, 64, 3, 3]).
size mismatch for decoder.22.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([3]).
A lot of people are having issues with that. You have to make sure the GGUF nodes, KJNodes, and Comfy are all up to date and on the main branches, not nightly.
That tae was recently added into Comfy, so if anything is out of date it won't work.
If updating doesn't help, you can bypass the tae VAE loader node and the ltx preview override node by the prompt boxes, and that should get you going.
That is only for the previews in the sampler nodes and isn't necessary for the actual video generation.
@Urabewe Thanks for letting me know. I did update, thinking it may have been that, so I'll try your suggestion and just try updating again in the coming days.
@Megasherru Any updates? I'm having the same issue on v2v.
@b12 Hi, yeah, Urabewe had the solution.
You bypass the VAE node and the node it's connected to.
Works from there.
Although, for image to video, the image remains static for around 40% of the video.
Sound plays, but the image remains still, then suddenly starts doing what you prompted.
@Megasherru That's a quirk with LTX. They have these camera control LoRAs; stick one of those in the workflow at 0.5 strength (doesn't matter which one), then gen your i2v as usual and it will move.
There is also a LoRA out now for this very reason, an i2v helper LoRA that I will be testing soon. It should work better than the camera control LoRAs.
The Image + Audio workflow generates a 5 second still image with audio.
LTX-2 Image Audio 2 Video GGUF 12GB UPSCALE TWO SAMPLER.json
I got the same issue... for me it is not a completely still image but a video with extremely little movement, especially in the first 4 seconds. The audio sounds as if in slow motion.
Same here. I tried increasing CFG, better prompting, increasing LoRA strength. Nothing significant.
@boinobin730 Found the answer in another comment: add a video LoRA at low strength. I used the "dolly in" one and now it works fine for me.
@futacrafts Thanks for the response, I will give it a try.
@futacrafts Thanks. It worked. I used this one. https://huggingface.co/Lightricks/LTX-2-19b-LoRA-Camera-Control-Static
Worked perfectly for me.
Also, just in case anyone else has been getting weird mismatched tensor errors, try deleting ComfyUI_smZNodes; it seems to be a pretty big culprit.
Very good workflows. Maybe the V2V workflow needs updating; there is a small color-shift problem when the extended part starts. Look at 1:47-> (and the next clip too):
https://youtu.be/vqRs6n84gW4
Yeah, it was quite a bit of testing just to get it to this point. I'm not updating any more until LTX releases a few more updates. I don't have time to constantly make new workflows, so: they just released these new nodes, and I will test those; then they have an updated model coming.
I'm also still tweaking and looking into many different settings for all these workflows. For now it is what it is. You can try tweaking the ltximgtovideoinplace nodes, as that was what got LTX to not completely change the face of the person.
@Urabewe I tested that tweak, oh yeah. I hope the new LTX model comes soon!
Nice, but please, bruh, add model sources here so we can download them in advance for your awesome workflow.
I have a problem with CLIP Text Encode and I received this message:
How can I solve this? I'm using an RTX 5090.
Not sure about that one; I would have to see more of the logs. Are you running any startup args?
@Urabewe Watch the video and the log and see if you can find the problem. I'm using the RTX 5090 32GB GPU rented in the cloud, see the video:
my video: https://gofile.io/d/m78J9I
my logs: https://gofile.io/d/4NlI8e
UPDATE: I managed to solve it by switching to an RTX 4090 24GB GPU. So the workflow does not work on cards from the RTX 5090 upwards; that's why I got the CLIPTextEncode error: "not supported because: requires device with capacity <= (9, 0), but GPU has capacity (12, 0) (too new)".
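For anyone hitting this: those numbers are the GPU's CUDA compute capability (an RTX 4090 reports 8.9, an RTX 5090 reports 12.0), and some component in the stack is refusing anything above 9.0. If you have a PyTorch environment, a quick way to check what your card reports:

    import torch
    # prints the compute capability tuple,
    # e.g. (8, 9) on an RTX 4090 or (12, 0) on an RTX 5090
    print(torch.cuda.get_device_capability(0))

Upgrading to a PyTorch build with Blackwell support may remove the cap, but that depends on which component raised the error.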
This is an amazing workflow; I've tried many workflows before, but this one works miraculously. Thank you to the developer who made this possible.
Prompt execution failed
Node 'ID #182' has no class_type. The workflow may be corrupted or a custom node is missing.: Node ID '#182'
IDK why I keep getting this. I have installed all the dependencies in the Markdown Note and the two other node packs stated in the workflow.
Look around for red boxes. Did you get the audio tools from my repo as well? If not, then it might be those; I suggest using them for sure. You can install through Comfy Manager directly, just search for Urabewe to find them.
@Urabewe I just deleted the last two nodes for the Audio. It started working after that. I did install those from the blue link you gave here but for some reason they aren't working
@Artmagnet I had another user mention that; I think maybe there is a version mismatch causing it or something.
If you installed the nodes you might still have them: if you search for dsre you may find the node. Then you can use that and the normalize one to make LTX have way better audio, though it's not magic.
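(Not the nodes' actual code, just the general idea of what a normalize pass does; the filenames below are placeholders. A simple peak-normalize in Python looks like:

    import numpy as np
    import soundfile as sf

    audio, sr = sf.read("input.wav")   # load samples as floats plus the sample rate
    peak = np.max(np.abs(audio))
    if peak > 0:
        audio = audio / peak * 0.95    # scale so the loudest sample sits just under full scale
    sf.write("normalized.wav", audio, sr)

This only covers the normalize step; the DSRE node is a separate enhancement. Bringing quiet audio up toward full scale is the part that helps LTX pick up the voice.)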
Using the suggested LoRA always produces much worse results.
There are no errors at all, but with audio-to-video the video doesn't move and only the voice plays. The only settings I changed were the resolution and length.
With image-to-video, image+audio-to-video, and text+audio-to-video, the image can stay frozen or the lip-sync can fail to match the video. For the image-to-video workflows, try the link below.
https://huggingface.co/siraxe/MachineDelusions_LTX-2_Image2Video_Adapter_LoRa/tree/main
For text+audio-to-video, the voice prompt and the audio are very important. Use a prompt that describes the speaker as well as possible. If there is a lot of background noise or the voice is too quiet, it may not be picked up properly. You can try raising the audio gain, isolating the voice and mixing it back in during post, or feeding the original audio into the save node.
Been trying out the I2V workflow a lot, and I must say it works wonderfully, excellent job! I am curious though, is there any chance of getting an FLF2V workflow? I like having the start and end frame to help ensure consistency throughout, though LTX-2 is definitely a lot better about maintaining consistency throughout than other models I've seen. I don't know enough about ComfyUI workflow engineering to be able to add a First Frame-Last Frame functionality to this.
I was working on one but got a lot going on right now. I'll have to return to the workflows soon considering all the new updates.
Doesn't run for me. I have updated ComfyUI and all dependencies. No errors or red boxes in the workflows; I used the original jsons with no changes. But I only get mismatch videos. Pity.
