DON'T USE THIS
It was a fun run. I2V proper is released, so download that. I'll leave this up only as a record of a weird, early way of janking together an I2V process before the real thing was released.
Final version! (probably)
V4 introduces the Refinement speed hack (works great with a guiding video which depthflow uses)
Flux re-enabled
More electrolytes!
This I think is where I will stop. I have had a lot of frustrating fun playing with this and my other backend workflow for the speed hack, but I think this is finally at a place I am fairly okay with. I hope you enjoy it and post your results down below. If there are problems (always problems), post in the comments also. I or others will try to help out.
Alright Hunyuan, ball's in your court. How about the official release to make this irrelevant? We're all doing these janky workarounds, so just pop it out already. Btw, if you use this for your official workflow, cut me a check. I like eating.
Btw, check out the other workflow on here, the leapfusion thing. It actually works pretty well: less control over what you're going for, but closer to the original picture. Both are cool to have.
Final update: (HA!)
Added Hunyuan Refiner step for awesomeness
Streamlined
Minor update:
V3.1 is more about refining.
Removed Reactor (pulled from GitHub)
Removed Flux (broken)
Removed Florence (huge memory issue)
Denoodled
Added a few new options to depthflow.
V3: IT'S THE FINAL COUNTDOWN!
Alright, this is probably enough. someone else get creative and go from here, but I think I am done messing around with this overall and am happy with it...(until I am not. Come on Hunyuan...release the actual image 2 video)
Anyhow, tweaks and thangs:
Added in Florence for recommendation prompt (not attached, just giving you suggestions if you have it on for the hunyuan bit)
Added switches for turning things on and off
More logical flow (slight overhead save)
Shrink image after Depthflow for better preservation of picture elements
Made more stroking colors (Follow the black) and organization for important settings areas
Various tweaks and nudges that I didn't note.
V2:
More optimized, a few more settings added, some pointless nodes removed, and overall a better workflow. Also added in optional Flux group if you want to use that instead of XL
Added in also some help with Teacache (play around with that for speed, but don't go crazy with the thresh..small increments upwards)
Anyhow, give this a shot, it's actually pretty impressive. I am not expecting much difference between this and whatever they come out with for native I2V (hopefully theirs will be faster though, the depthflow step is a hangup).
Thanks to the person who tipped me 1k buzz btw. I am not 100% sure what to do with it, but that was cool!
Anyhow
(NOTE: I genuinely don't know what I am doing regarding HunyuanFast vs. regular and the LoRA. I wrote "don't use it", and that remains true if you leave it on the fast model, but do use it with the full model. Ask others; don't take my word as gospel. Consider me GPT2.0 making stuff up. All I know is that this process works great for a hacky image2video knockoff.)
XL HunYuan Janky I2V DepthFlow: A Slightly Polished Janky Workflow
This is real Image-to-Video. It’s also a bit of sorcery. It’s DepthFlow warlock rituals combined with HunYuan magic to create something that looks like real motion (well, it is real motion..sort of). Whether it’s practical or just wildly entertaining, you decide.
Key Notes Before You Start
Denoising freedom. Crank the denoising up if you want sweeping motion and dynamic changes. It won’t slow things down, but at higher settings (0.80+) it will alter the original image significantly. Keep that in mind. Even at 0.80+, the result will still resemble the original pic though.
Resolution matters. Keep the resolution (post XL generation) to 512 or lower in the descale step before it shoots over to DepthFlow for faster processing. Bigger resolutions = slower speeds = why did you do this to yourself?
Melty faces aren’t the problem. Higher denoising changes the face and other details. If you want to keep the exact face, turn on Reactor for face-swapping. Otherwise, turn it off, save some time, and embrace the chaos.
DepthFlow is the magic wand. The more steps you give DepthFlow, the longer the video becomes. Play with it—this is the key to unlocking wild, expressive movements.
Lora setup tips.
Don’t use the FastLoRA with the fast Hunyuan model, which is on by default; it won’t work. Use it if you switch to the full model, though.
Load any other LoRA, even if you’re not directly calling it. The models use the LoRA’s smoothness for better results.
For HunYuan, I recommend Edge_Of_Reality LoRA or similar for realism.
XL LoRAs behave normally. If you’re working in the XL phase, treat it like any other workflow. Once it moves into HunYuan, it uses the LoRA as a secondary helper. Experiment here—use realism or stylistic LoRAs depending on your vision.
WARNING: REACTOR IS TURNED OFF IN WORKFLOW!
(turn on to lose sanity or leave off and save tons of time if you're not partial to the starting face)
How It Works
Generate your starting image.
Be detailed with your prompt in the XL phase, or use an image2image process to refine an existing image.
Want Flux enhancements? Go for it, but it’s optional. The denoising in the Hunyuan step will probably alter most of the Flux magic anyhow, so I went with XL's speed over Flux's clarity, but sure, give it a shot. Enable the group, adjust the settings, and it's ready to go. Really just a flip of a switch.
DepthFlow creates movement.
Add exaggerated zooms, pans, and tilts in DepthFlow. This movement makes HunYuan interpret dynamic gestures, walking, and other actions.
Don’t make it too spazzy unless chaos is your goal.
HunYuan processes it.
This is where the magic happens. Noise, denoising, and movement interpretation turn DepthFlow output into a smooth, moving video.
Subtle denoising (0.50 or lower) keeps things close to the original image. Higher denoising (0.80+) creates pronounced motion but deviates more from the original.
Reactor (optional).
If you care about keeping the exact original face, Reactor will swap it back in, frame by frame.
If you’re okay with slight face variations, turn Reactor off and save some time.
Upscale the final result.
The final step upscales your video to 1024x1024 (or double your original resolution).
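The resolution bookkeeping across these steps can be sketched in Python. This is only an illustration, not part of the workflow itself: the helper names (`snap_down`, `plan_sizes`) are made up here, and the sketch assumes the descale step caps the longest side at 512, that dimensions are snapped down to multiples of 16, and that the upscale step is a plain 2x.

```python
def snap_down(value: int, multiple: int = 16) -> int:
    """Round value down to the nearest multiple (never below one multiple)."""
    return max(multiple, (value // multiple) * multiple)


def plan_sizes(width: int, height: int, max_side: int = 512) -> dict:
    """Hypothetical helper: sizes for the DepthFlow input and the final upscale."""
    longest = max(width, height)
    if longest > max_side:
        # Integer math keeps the aspect ratio without float rounding surprises.
        width = width * max_side // longest
        height = height * max_side // longest
    small = (snap_down(width), snap_down(height))
    return {
        "depthflow": small,
        "upscaled": (small[0] * 2, small[1] * 2),  # assumes a plain 2x upscale
    }


print(plan_sizes(960, 960))
```

With a 960x960 XL image this gives a 512x512 DepthFlow input and a 1024x1024 final video, which lines up with the numbers quoted in the steps above.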
Why This Exists
Because waiting for HunYuan’s true image-to-video feature was taking too long, and I needed something to tinker with. This (less) janky process works, and it’s a blast to experiment with.
Second warning:
You're probably gonna be asked to download a bunch of nodes you don't have installed yet (DepthFlow, Reactor, and possibly some others). Just a heads up.
Final Thoughts
This workflow is far from perfect, but it gets the job done. If you have improvements, go wild—credit is appreciated but not required. I just want to inspire people to experiment with LoRAs and workflows.
And remember, this isn’t Hollywood-grade video generation. It’s creative sorcery for those of us stuck in the "almost but not quite" phase of technology. Have fun!
Description
Added optional Flux workflow.
Streamlined various throughput
Added in Teacache for Hunyuan process
Comments (31)
Thank you! Could you maybe add some generation time estimations and GPU you use?
GPU is a 3090 TI
Gen time...lets go with a 75 frame video:
XL 960x960
XL rendering (25 steps): 5 seconds
Depthflow (75 frames): 4 seconds
Hunyuan rendering: 37 seconds
Total time: 130 seconds (84 of that going from node to node loading models)
Run 2 (with upscale turned on):
XL: 3 seconds
Depthflow: 3.5 seconds
Hunyuan: 37 seconds
Upscale: (unknown...no report)
Total time: 137 seconds
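For anyone checking the math on run 1, the stage times and the model-loading overhead add up as follows. This is just a small Python tally using the numbers reported above; the variable names are for illustration only.

```python
stage_seconds = {"xl": 5, "depthflow": 4, "hunyuan": 37}  # run 1 stage times
total_seconds = 130                                        # reported total for run 1

compute = sum(stage_seconds.values())
overhead = total_seconds - compute  # time spent going node to node loading models

print(compute)   # 46
print(overhead)  # 84
```

So only about a third of the wall-clock time in run 1 is actual rendering; the rest is model loading between nodes.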
@saturngfx Nice, thank you! That's much better than I would guess!
Pretty cool!
Nice one, thanks for sharing!
Glad you're having fun. It's something to get by on until Hunyuan releases image 2 vid for real anyhow.
RuntimeError: shape '[1, 25, 7, 9, 16, 1, 2, 2]' is invalid for input of size 112000
Having this error with my custom picture. Is there some size restrictions?
That's your image resize factor node leading up to depthflow, right after the XL image rendering.
Set it to 0.18 or 0.25, then restart ComfyUI. I run into that if I make changes. Real pain in the butt. Some math stuff going on. If you change it and don't restart, it moans about it the whole time. I am looking forward to someone taking this workflow and altering it to get past these weird errors. I have no doubt I did things ham-fisted, but this is more a proof of concept than anything, so someone who knows Comfy well can make things better.
Alright, the end result needs dimensions divisible by 16. Grab the latest version and look at the little legend area. That should keep you in check as to what sizes to stick with.
I just use i2v, so all I do is add an image resize after load image and set w1440 h1800. That fixed this error for me.
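The numbers in that error message can be checked by hand: a reshape only succeeds when the product of the target dimensions equals the number of elements in the tensor. A quick Python check follows; reading the 7 and 9 as the spatial patch counts is a guess, not something confirmed in this thread.

```python
import math

target_shape = [1, 25, 7, 9, 16, 1, 2, 2]  # shape quoted in the error message
actual_size = 112000                        # "input of size" from the error message

expected_size = math.prod(target_shape)
print(expected_size)                 # 100800
print(actual_size == expected_size)  # False, hence the RuntimeError

# Divide out every dimension except the 7 x 9 pair (assumed to be the
# spatial patch counts): the tensor holds 70 such slots, but 7 * 9 is 63.
others = math.prod([1, 25, 16, 1, 2, 2])
print(actual_size // others)  # 70
```

The mismatch (70 versus 63) is consistent with an image size that the model's patching does not divide cleanly, which fits the advice to stick to sizes divisible by 16.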
nah
Yes. This workflow is magic. It works and is good.
Cannot execute because a node is missing the class_type property.: Node ID '#39'
Node 39 (for me) is reactor...are you trying to do a face swap with the OG? turn off the group to see if it works without face swap first, then if it works fine, thats when debugging reactor starts. Just bypass the group for a test
did you figure it out? I am getting the same error message. Works... in some way, if I don't have reactor enabled.
@woodenpickle Does reactor work normally just running it on a simple workflow? like just...a regular XL image gen?
@woodenpickle Reactor I guess is discontinued. yanked from github. sad, but can't say I am overly surprised. Well, new version has it removed.
Notice and DL version 2... <update the page five mins later> Version 3 is out. Seems about right given how things are going these days, I imagine v4 will be done by the time I'm finished writing this.
lol...naa, I am done. just felt V2 was rushed.
That might be a very dumb question, because I'm new to hunyuan, but where do I get the Clip model? (Llava_llama3)
https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-transformers/tree/main
Goes in ComfyUI\models\LLM
also in comfyui, (assuming you have the manager) you can search for it in the options menu under model manager
It was working, then it stopped with this error.
"Input type (float) and bias type (struct c10::Half) should be the same"
Any ideas?
give a bit more. where does it give the error at? disable all things and start from the beginning. can it generate an image? okay, enable depthflow...can it do that? okay, next enable hunyuan, etc...find where the error occurs.
can it run on 12gb?
I don't have a 12g to test, but give it a shot. if you could report back here to let the community know if you can, that would be great...help others with a 12 know.
I would say you might start small. no upscaling, run maybe 50fps, and go a bit small, then work up from there once you find your starting...then see how much you can push it until it flops. there is nothing overly heavy in the model. but it can be slow going from area to area.
@saturngfx Can confirm this works on 12gb I have 4080 12gb vram laptop 32gb Ram, out the box it works :-) many thanks great workflow
@Aicush Thanks. common card, your testing is appreciated. :)
@Aicush super
Note - Several of the nodes used in your workflow are no longer available on GitHub due to a legal takedown. Here is the author's thread on the matter: https://www.reddit.com/r/comfyui/comments/1i3bsb8/github_killed_reactor_repo/
Short version: This workflow is a paperweight unless you can work around or replace the ReActor nodes.
grumbles Thanks for the heads up. next version might be a callback to Roop...or meh, just nothing I suppose.


