Good night everyone 🖤
I decided to share the ComfyUI workflow I use to create my videos.
The original version was taken from one of the standard SVI Starter Workflows, but over time it has changed so much that I can now confidently call it my own)
This workflow is equally well suited for both SFW and NSFW content.
Below I’ll briefly explain the main principles behind this workflow and how to use it.
Core principles of this workflow
1. No subgraphs.
I prefer a completely flat node structure. This allows me to clearly see every node, track execution time, and avoid unnecessary clicks navigating inside nested subgraphs.
2. Get/Set variables instead of spaghetti connections.
Parameters from the configuration section are passed using get/set nodes. This keeps the graph much cleaner and prevents huge tangles of wires across the workflow.
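If you haven't used Get/Set nodes before, here's a rough Python analogy of what they do (this is just an illustration, not ComfyUI code): a Set node publishes a value under a name, and any Get node with the same name reads it back anywhere in the graph, with no visible wire between them.

```python
# Rough analogy only -- this is not ComfyUI code. A Set node stores a
# value under a name; a Get node with the same name retrieves it anywhere
# in the graph, so no long wires are needed.
registry: dict[str, object] = {}

def set_node(name: str, value: object) -> None:
    """Behaves like a Set node: publish a value under a global key."""
    registry[name] = value

def get_node(name: str) -> object:
    """Behaves like a Get node: read that value back anywhere."""
    return registry[name]

# The configuration section "sets" values once...
set_node("steps", 8)
set_node("resolution", (720, 1280))

# ...and every scene "gets" what it needs without a direct connection.
print(get_node("steps"), get_node("resolution"))
```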
3. No NAG (Normalized Attention Guidance).
All scenes use a single global negative prompt taken from the default WAN workflow.
This saves valuable time by skipping a separate CLIP Text Encode for the negative prompt in every scene. Instead, I make all adjustments through the positive prompt only.
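In ComfyUI's API-format JSON the idea looks roughly like the sketch below (node IDs, the wiring targets, and the negative text itself are illustrative): a single CLIPTextEncode node is evaluated once, and every scene's sampler points its negative input at the same node output.

```python
# Sketch in ComfyUI API-format JSON (node IDs, wiring targets, and the
# negative text are illustrative). One CLIPTextEncode is evaluated once
# and shared by every scene's sampler as the negative conditioning.
import json

prompt = {
    "neg": {
        "class_type": "CLIPTextEncode",
        "inputs": {
            "text": "static, blurry, low quality, worst quality",
            "clip": ["text_encoder", 0],
        },
    },
    # Every scene's sampler reuses the same node output ["neg", 0]:
    "sampler_scene1": {"class_type": "KSamplerAdvanced",
                       "inputs": {"negative": ["neg", 0]}},  # other inputs omitted
    "sampler_scene2": {"class_type": "KSamplerAdvanced",
                       "inputs": {"negative": ["neg", 0]}},  # other inputs omitted
}
print(json.dumps(prompt, indent=2))
```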
How to use this workflow
1. Model Set
First configure your main WAN model.
In my setup you can configure two different models. This allows you to:
• test different models for different scenes
• or generate two alternative versions of the first scene using the same prompt and seed
You can then compare the results and choose the model you prefer for the rest of the animation.
You can also duplicate the Model Set group if you want to test even more models.
Inside this section you can also enable or disable LightX LoRA.
Some WAN models already include LightX internally, while others expect you to load it manually.
Be careful here.
SVI LoRAs (High + Low) are always required.
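For reference, here's a minimal API-format sketch of what one model chain in this section amounts to. All file names are placeholders; UNETLoader and LoraLoaderModelOnly are core ComfyUI nodes.

```python
# Minimal sketch of one Model Set chain in API format. File names are
# placeholders. The SVI LoRA is always applied; bypass the LightX loader
# if your checkpoint already has LightX baked in.
model_chain = {
    "unet_high": {
        "class_type": "UNETLoader",
        "inputs": {"unet_name": "wan2.2_i2v_high_noise_14B.safetensors",  # placeholder
                   "weight_dtype": "default"},
    },
    "svi_lora_high": {
        "class_type": "LoraLoaderModelOnly",
        "inputs": {"model": ["unet_high", 0],
                   "lora_name": "svi_high.safetensors",   # placeholder
                   "strength_model": 1.0},
    },
    # Only for models that do NOT already include LightX:
    "lightx_lora_high": {
        "class_type": "LoraLoaderModelOnly",
        "inputs": {"model": ["svi_lora_high", 0],
                   "lora_name": "lightx_high.safetensors",  # placeholder
                   "strength_model": 1.0},
    },
}
```

The Low model gets the same treatment with its own SVI Low LoRA.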
2. WAN Setup
In this group you configure the base parameters of the animation:
• load your input image
• set the resolution (I prefer doing this manually so I always know the exact pixel size)
• define the number of frames (Length)
• choose the text encoder
• set the number of sampling steps
• configure the split step where the sampler switches from the High model to the Low model
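The split step is the usual WAN 2.2 two-stage pattern: with two KSamplerAdvanced nodes (core ComfyUI), the High model denoises from step 0 to the split, returns the leftover noise, and the Low model finishes on that partially denoised latent. A hedged sketch, with illustrative values:

```python
# Hedged sketch of the High -> Low split with two core KSamplerAdvanced
# nodes. Values are illustrative: with steps=8 and split=4, the High
# model handles steps 0-4 and the Low model finishes steps 4-8.
steps, split, seed = 8, 4, 123456

split_samplers = {
    "ksampler_high": {
        "class_type": "KSamplerAdvanced",
        "inputs": {
            "model": ["model_high", 0],
            "add_noise": "enable", "noise_seed": seed,
            "steps": steps, "cfg": 1.0,
            "sampler_name": "euler", "scheduler": "simple",
            "positive": ["pos", 0], "negative": ["neg", 0],
            "latent_image": ["wan_i2v", 0],
            "start_at_step": 0, "end_at_step": split,
            "return_with_leftover_noise": "enable",  # hand off the noisy latent
        },
    },
    "ksampler_low": {
        "class_type": "KSamplerAdvanced",
        "inputs": {
            "model": ["model_low", 0],
            "add_noise": "disable", "noise_seed": seed,
            "steps": steps, "cfg": 1.0,
            "sampler_name": "euler", "scheduler": "simple",
            "positive": ["pos", 0], "negative": ["neg", 0],
            "latent_image": ["ksampler_high", 0],    # continue from High
            "start_at_step": split, "end_at_step": steps,
            "return_with_leftover_noise": "disable",
        },
    },
}
```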
3. Scene 1
The first scene is slightly different from the others.
Here the WanImageToVideoSVIPro node uses motion_latent_count = 0.
There is also no prev_samples input and no node responsible for merging video segments yet.
Because of this, Scene 1 uses a separate set of alternative generation groups.
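To make the difference concrete, here is a hypothetical sketch of the Scene 1 node. Only motion_latent_count = 0 and the absence of prev_samples come from this post; every other input name is an assumption modeled on similar WAN image-to-video nodes.

```python
# Hypothetical Scene 1 configuration. motion_latent_count = 0 and the
# missing prev_samples are from the post; all other input names here are
# assumptions based on similar WAN image-to-video nodes.
scene1 = {
    "wan_i2v_s1": {
        "class_type": "WanImageToVideoSVIPro",
        "inputs": {
            "motion_latent_count": 0,        # Scene 1 has no motion context yet
            # no "prev_samples" here -- there is no previous scene to chain from
            "positive": ["pos_s1", 0],       # assumed
            "negative": ["neg", 0],          # assumed
            "vae": ["vae_loader", 0],        # assumed
            "start_image": ["input_image", 0],           # assumed
            "width": 720, "height": 1280, "length": 81,  # assumed
        },
    },
}
```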
Inside each scene block you mainly configure:
• the prompt
• optional LoRAs
• seed value
You might notice that the text node is separated from the positive CLIP Text Encode node.
This was done intentionally.
Earlier I experimented with inserting reusable text fragments (for example, subject or camera descriptions) that could be prepended or appended to the main prompt. I removed that system later, but kept the text node separate in case I want to expand the workflow again in the future.
4. Scene 2 and later scenes
Starting from Scene 2, the process changes slightly.
Now the WanImageToVideoSVIPro node must receive:
• the latent output from the previous Low KSampler
• the decoded video connected to Image Batch Extend With Overlap as source_images
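A hedged sketch of that wiring (again, only prev_samples and source_images are named in this post; the rest is assumed, and VAEDecode is a core ComfyUI node):

```python
# Hedged sketch of the Scene 2 wiring. prev_samples and source_images are
# the inputs named above; the rest of the wiring is assumed.
scene2_wiring = {
    # Decode the previous scene so its frames can be extended with overlap:
    "decode_s1": {
        "class_type": "VAEDecode",
        "inputs": {"samples": ["ksampler_low_s1", 0], "vae": ["vae_loader", 0]},
    },
    "wan_i2v_s2": {
        "class_type": "WanImageToVideoSVIPro",
        "inputs": {
            "prev_samples": ["ksampler_low_s1", 0],  # latent from previous Low KSampler
            "positive": ["pos_s2", 0],               # assumed
            "negative": ["neg", 0],                  # assumed
            "vae": ["vae_loader", 0],                # assumed
        },
    },
    "extend_overlap": {
        "class_type": "Image Batch Extend With Overlap",
        "inputs": {"source_images": ["decode_s1", 0]},  # other inputs omitted
    },
}
```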
For subsequent scenes I use 4–6 alternative groups, identical to Scene 2.
When the next scene is needed, I simply:
• move those groups to the right
• reconnect them to the new latent from the previous scene
For convenience I added separate reroute nodes, because WanImageToVideoSVIPro has two latent inputs and it’s easy to accidentally connect the wrong one.
5. Post-processing
At the end of the workflow there are several groups responsible for:
• upscaling
• frame interpolation
These steps help produce the final polished result.
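As a rough idea of what this tail can look like: the two upscale nodes below are core ComfyUI, while the interpolation node comes from a custom pack, so its class name and inputs are placeholders for whichever frame-interpolation nodes you have installed.

```python
# Rough sketch of the post-processing tail. UpscaleModelLoader and
# ImageUpscaleWithModel are core nodes; the interpolation node is a
# placeholder for whichever frame-interpolation pack you use.
post = {
    "up_model": {
        "class_type": "UpscaleModelLoader",
        "inputs": {"model_name": "RealESRGAN_x2plus.pth"},  # placeholder file
    },
    "upscaled": {
        "class_type": "ImageUpscaleWithModel",
        "inputs": {"upscale_model": ["up_model", 0],
                   "image": ["final_frames", 0]},
    },
    "interpolated": {
        "class_type": "RIFE VFI",            # placeholder: depends on your pack
        "inputs": {"frames": ["upscaled", 0],
                   "multiplier": 2},          # e.g. 16 fps -> 32 fps
    },
}
```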
General workflow strategy
My process usually looks like this:
Configure the base setup.
Generate 1–6 alternative versions of Scene 1 using different:
• prompts
• LoRAs
• LoRA strengths
• seeds
Pick the best result.
Copy those parameters into the main Scene 1 group.
Yes — I am perfectly fine waiting 10 minutes for regeneration instead of constantly dragging node groups and reroute nodes around)
After that I move to Scene 2, repeat the same process with the alternative groups, choose the best result, and then copy the parameters into the main Scene 2 block.
Then I shift the alternative groups to the right for Scene 3, reconnect the latent from Scene 2 to the Alt 1 reroute node, and repeat the process again.
Step by step this builds the full animation.
Profit!
Bypass
Although the workflow contains a Fast Groups Bypasser, I don't use it; instead, I use the group bypass option from rgthree.
You can enable it in Settings -> rgthree -> rgthree-comfy settings -> Groups -> Show fast group toggles in Group Headers -> bypass switch
Mentions
I'd like to specifically mention some workflows that are truly good and worth paying attention to:
https://civarchive.com/models/1823089?modelVersionId=2580650
https://civarchive.com/models/2232205/wan-22-i2v-comfyui-workflow-svi-extend-flf2v-upscale
Also, I can't help but mention the models that I personally use in my work:
https://civarchive.com/models/1981116/dasiwa-wan-22-i2v-14b-or-lightspeed-or-safetensors
https://civarchive.com/models/2053259?modelVersionId=2609141
I mean, seriously, check them out, they're truly amazing! ❤️
This workflow is not meant to be a perfect piece of node art.
It’s simply my personal Swiss Army knife for WAN video generation.
If you have any suggestions, improvements, or questions — feel free to message me or leave a comment under this post. I’ll be happy to discuss it with you.
And of course, if you create something interesting with this workflow, I'd love to see it. ❤️
Good luck generating beautiful (and perhaps dark) content… 🖤
Description
First version of my workflow for creating 35-second WAN SVI animations in ComfyUI.
Main ideas:
• flat node structure (no subgraphs)
• clean parameter routing with get/set nodes
• shared negative prompt for faster generation
• scene-by-scene animation workflow
• equally suitable for both SFW and NSFW content