Distilled 384 LoRA:
https://huggingface.co/Lightricks/LTX-2/tree/main
Detailer LoRA:
https://huggingface.co/Lightricks/LTX-2-19b-IC-LoRA-Detailer/tree/main
Text encoder:
https://huggingface.co/GitMylo/LTX-2-comfy_gemma_fp8_e4m3fn/blob/main/gemma_3_12B_it_fp8_e4m3fn.safetensors
Models:
https://huggingface.co/Kijai/LTXV2_comfy/tree/main
This workflow simply works as a detailer and preserves temporal consistency. Video resolution will not change. It works for 1280 x 720 videos of 5 and 10 seconds. There is a comparison video on Civitai.
-Tested on 16GB VRAM.
Two workflows are attached in the zip folder:
The first one is for 5 second videos.
The other is for 10 second videos.
If your RAM is good enough (32 GB+), you can make a 10 second video with the 5 second workflow too.
In that case the sampling time will decrease.
But this is a test workflow; I am not claiming it is the best, and artifacts still appear.
If your output video comes out as black or red grids (this usually happens after you make around 10-20 videos), you need to restart your PC; it is caused by VRAM throttling.
If you make a video with the 5 second workflow, then switch to the 10 second workflow and press Run, you will get a black screen or something like that. This is caused by the different VAE encoding method and the cache; you need to close and reopen ComfyUI.
Test results (16GB VRAM):
1280 x 720, 5 sec video detailing took 7 mins on 5 sec workflow.
1280 x 720, 10 sec video detailing took 23 mins on 10 sec workflow.
1280 x 720, 10 sec video detailing took 14 mins on 5 sec workflow. (You need high RAM.)
------------------
This workflow needs a different LTX installation process.
Follow the steps below:
Download the custom nodes from the workflow linked below.
There is also a video explanation for the nodes and models.
Quantized models are used.
Reference video to watch (Not mine) :
Models :
https://huggingface.co/Kijai/LTXV2_comfy/tree/main
Red nodes are from:
https://civarchive.com/models/2297090?modelVersionId=2584847
- A custom node from VantageWithAI named "Vantage GGUF Unet Loader".
This node is experimental but more efficient than the regular UNet GGUF nodes. (There is a PR missing from the GGUF nodes at the moment, and until it is fully merged, the regular node won't work unless you update it yourself from the command prompt. I don't suggest that, as it is highly unusable. Trust me, use the Vantage node.)
TO GET THE VANTAGE NODES, open a command prompt in your custom_nodes folder, paste in this command, and hit Enter:
git clone https://github.com/vantagewithai/Vantage-Nodes.git
(I removed the "cd comfyui/custom_nodes" portion from the YouTube video tutorial, as it's not needed.)
Once it's finished, use these commands:
cd Vantage-Nodes
pip install -r requirements.txt
This installs the nodes fully. Restart ComfyUI, drag in my workflow, and you have a fully functioning LTX-2 workflow!
I HIGHLY RECOMMEND UPDATING YOUR BAT FILE WITH THESE FLAGS:
--lowvram --disable-xformers --use-pytorch-cross-attention --reserve-vram 2 --disable-smart-memory
This workflow is very resource-intensive. Even with the lowest quant, my PC struggled until I lowered the resolution to 480p. I suggest editing your bat file in Notepad: add the flags to the line that contains "--windows-standalone", save the file as a copy, and use that copy for LTX-2 ONLY (rename it if that helps you remember). Instructions on how to do it are in the YouTube video if you don't know how.
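As a sketch, the edited line in your copied bat file might look like this. This assumes the standard ComfyUI portable launcher (run_nvidia_gpu.bat); your original line may differ, so only append the flags after whatever is already there:

```shell
.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build --lowvram --disable-xformers --use-pytorch-cross-attention --reserve-vram 2 --disable-smart-memory
pause
```

Save this as a separate copy (e.g. run_ltx2.bat) and launch ComfyUI with it only when running this workflow.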
Workflow made by AITold.
V1.0
Comments (3)
I think you should lead with this video in your examples, because it makes the point far better:
https://www.reddit.com/r/StableDiffusion/comments/1qdtzur/ltx_2_video_to_video_detailer/
I seem to be having some trouble with this. It really messes up all the motion, even when the images themselves look nice. It seems to be doing a crazy amount of what looks like frame interpolation, even when it's outputting the same number of frames that I input. For example, on scene cuts, it creates a slow fade instead of cutting. Actions look very warped, like I'm frame interpolating x4 or x6, but again, same frames in as out.
What settings do you suggest I try to mitigate this? Also, if my video when multiplied times 1.5 goes to 720x1308, how can I get the output video to actually be that resolution? I know your notes say it's a maximum of 1280, but what is causing that limitation? Also, it's not outputting all the frames. The input video is 303 frames (301 is the nearest x8 +1), and I set the max number of frames in the video input node to that, but it outputs a maximum of 297. Do you know why it won't go to 301 at least?
Thanks for your help.
plz plz plz plz i have been using ur wan text to image for long time , plz make something like that for qwen image or qwen image 2512