Optimal Cunnilingus WAN 2.2 Long Video I2V Workflow (SVI / LongLook / Multi-Prompt)

After months of experimentation and many successful generations, I am releasing my I2V cunnilingus workflow for WAN 2.2 to the public.

Note: This workflow is not limited to cunnilingus. I actually recommend it for all your SFW and NSFW WAN 2.2 generations!

What this workflow achieves:

Preserves the natural detail and motion characteristics of base WAN 2.2 without relying on user-trained models.
Uses minimal third-party custom nodes, relying primarily on standard built-in ComfyUI nodes.
Fully transparent, straightforward, editable workflow design with no complex nodes, with all major logic exposed for easy understanding and modification.
Integrates Lightning LoRAs with a 3-KSampler setup for major speed gains while avoiding the common slow-motion problem.
Allows flexible adjustment of steps, CFG, sampler, scheduler, and custom LoRA integration.
Allows multi-prompting for each clip, combining clip-specific actions with shared subject and setting prompts.
Includes sample pre-configured cunnilingus prompts with strong motion, detail, and prompt adherence while remaining highly adjustable.
Implements the current best and most effective LoRAs for cunnilingus and vulva appearance.
Implements female orgasm and ejaculation in the final clip.
Automatically stitches 3 clips into one longer video (roughly 18 seconds) using SVI-based overlap for smoother transitions between clips.
Integrates Wan FreeLong (LongLook) to improve long-range temporal consistency and reduce motion discontinuity between continuation clips.
Uses Wan Continuation Conditioning to pass the full previous clip sequence into the next clip through anchor images, improving motion carryover compared to single-frame continuation.
Applies automatic color correction between clips to reduce visible shifts in brightness and tone.
Automatically converts output from 16 fps to 48 fps using RIFE interpolation.
Allows generation of either the long 18-second video or a single clip when shorter output is preferred.
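
The frame-rate figures above can be checked with a little arithmetic. This is only a rough sketch: the function name is made up for illustration, and it assumes interpolation simply multiplies the frame count, whereas the real interpolation node may handle the last frame differently.

```python
def interpolated_frame_count(duration_s: float, source_fps: int = 16, target_fps: int = 48):
    """Return (interpolation factor, source frames, approximate output frames)."""
    if target_fps % source_fps != 0:
        raise ValueError("target_fps must be a whole multiple of source_fps")
    factor = target_fps // source_fps            # 48 / 16 = 3x interpolation
    source_frames = round(duration_s * source_fps)
    # Approximation: assumes every source frame yields `factor` output frames.
    return factor, source_frames, source_frames * factor


if __name__ == "__main__":
    factor, src, out = interpolated_frame_count(18)
    print(f"{factor}x interpolation: {src} frames at 16 fps -> about {out} frames at 48 fps")
```

For the roughly 18-second stitched video this gives a 3x factor and about 288 source frames.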

Current limitations:

While Wan Continuation Conditioning preserves visible motion and momentum very well, it does not maintain true 3D scene memory. If a character’s face, a specific clothing detail, or an important background or foreground object is fully obscured at the exact transition point between clips, the model may lose that visual information and generate a slightly different version when it becomes visible again in the next clip. To reduce this, prompts should describe important subjects and scene elements as clearly and consistently as possible. Simpler wardrobe, cleaner backgrounds, and less cluttered foregrounds tend to hold continuity better. It also helps to avoid aggressive camera moves such as heavy zooms, large pans, or compositions where key details disappear from view at the clip boundary. Character, clothing, and setting LoRAs can further improve consistency, though they do not eliminate the limitation entirely.
Each additional clip introduces a small amount of softness and quality loss due to repeated decoding, compression, and frame handoff. Even with sharpening, fine detail gradually degrades across later clips, which is why the workflow is currently limited to 3 total clips for best visual quality.

Design notes:

The workflow prioritizes transparency and editability over visual neatness. Nodes are intentionally left fully exposed rather than hidden or compressed, making the logic easier to understand and modify even if the graph appears dense (if the spaghetti bothers you, you can hide the node links in ComfyUI).

The following models and custom nodes are necessary for this workflow:

• WAN 2.2 I2V Base Model (Precision)

• WAN 2.2 I2V Base Model (GGUF)

• comfyUI-LongLook by shootthesound

The following LoRAs are necessary for this workflow:

• WAN 2.2 Lightning by Lightx2v

• SVI-Shot by Stable Video Infinity

• WAN 2.2 I2V Cunnilingus by K3NK

• WAN 2.2 T2V Cunnilingus by anonymist

• WAN 2.2 Pussy and Anus by HearmemanAI

• WAN 2.2 Pubic Hair by BlueB (Optional)

• WAN 2.2 I2V Orgasm by playtime_ai

• Cumshot Aesthetics by FLOW0380

Usage:

First and foremost, update your ComfyUI to the latest version.

Second, install all missing nodes via ComfyUI Manager.

Caution: If you do not have Sage Attention installed, bypass all nodes labeled "Patch Sage Attention KJ"; otherwise you will get an error. If you do have Sage Attention installed, you must remove the --use-sage-attention flag from your ComfyUI launch.bat before using this workflow.
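
If you are not sure whether Sage Attention is installed, a quick check like the one below can help. It is a minimal sketch that assumes the package is importable under the module name sageattention, and it must be run with the same Python environment that ComfyUI uses (for example the embedded Python of a portable install). The function name is only illustrative.

```python
import importlib.util


def sage_attention_installed() -> bool:
    """Return True if a module named 'sageattention' can be found in this environment."""
    # Assumption: the Sage Attention package exposes the import name 'sageattention'.
    return importlib.util.find_spec("sageattention") is not None


if __name__ == "__main__":
    if sage_attention_installed():
        print("Sage Attention found: keep the patch nodes, remove the launch flag.")
    else:
        print('Sage Attention not found: bypass the "Patch Sage Attention KJ" nodes.')
```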

User Adjustable Settings:

If a node/setting is not mentioned here, it is best to leave it alone!

Image Resize: Default Width = 720, Height = 0

To change the maximum output resolution, adjust either the width or height value while leaving the other set to 0. Only one dimension should be specified manually at a time.

For example:

To set a maximum width of 720, use Width = 720 and Height = 0

To set a maximum height of 720, use Width = 0 and Height = 720

The node will automatically calculate the remaining dimension while preserving aspect ratio and selecting values compatible with WAN 2.2.
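
To get a feel for what resolution to expect, the calculation can be approximated as follows. This is a sketch, not the node's actual code: the function name is hypothetical, and both the multiple of 16 and the rounding down are my assumptions about what "compatible with WAN 2.2" means, so the real node may pick slightly different values.

```python
def estimate_output_size(src_w: int, src_h: int, width: int = 720, height: int = 0,
                         multiple: int = 16) -> tuple[int, int]:
    """Estimate the resized (width, height) when exactly one target dimension is set."""
    if (width == 0) == (height == 0):
        raise ValueError("set exactly one of width/height and leave the other at 0")
    # Scale factor comes from whichever dimension was specified.
    scale = width / src_w if width else height / src_h

    def snap(value: float) -> int:
        # Assumption: round down to the nearest multiple so the limit is not exceeded.
        return max(multiple, int(value // multiple) * multiple)

    return snap(src_w * scale), snap(src_h * scale)


if __name__ == "__main__":
    print(estimate_output_size(1920, 1080, width=720, height=0))
    print(estimate_output_size(1080, 1920, width=0, height=720))
```

With a 1920x1080 source and Width = 720, this estimate gives 720x400. A 1080x1920 source with Height = 720 gives 400x720.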

Steps: Default = 10

You can change the total number of steps. However, if you do, you must also change the "Step End and Start 1" and "Step End and Start 2" values to match.

The recommended defaults are Steps = 10, Step End and Start 1 = 2, Step End and Start 2 = 6.

With this configuration, KSampler 1 runs steps 0-2, KSampler 2 runs steps 2-6, and KSampler 3 runs from step 6 to the end (its end step is set to 1000, so it always runs through the final step).

The share of the total steps each KSampler gets is as follows:

KSampler 1 = 20%

High-noise initial phase where Lightning LoRA is disabled.

This phase establishes the global scene structure, subject placement, and initial motion direction before acceleration begins. Disabling Lightning here helps avoid slow-motion starts, unstable early motion, and overly forced movement in the first frames.

KSampler 2 = 40%

High-noise Lightning LoRA phase at strength 1.0.

This is the main motion-building phase, where large body movement, pose transitions, and action timing are formed. Lightning is strongest here because high noise is where major temporal motion is decided.

KSampler 3 = 40%

Low-noise Lightning LoRA phase at strength 1.0.

This final phase refines facial detail, clothing texture, hand accuracy, and motion continuity while preserving the movement established earlier. Low noise no longer creates major motion, but stabilizes and sharpens the final result.

Here is a cheat sheet you can plug in without doing any math, based on the total number of steps you want:

10 steps: Step End and Start 1 = 2, Step End and Start 2 = 6

15 steps: Step End and Start 1 = 3, Step End and Start 2 = 9

20 steps: Step End and Start 1 = 4, Step End and Start 2 = 12

25 steps: Step End and Start 1 = 5, Step End and Start 2 = 15

30 steps: Step End and Start 1 = 6, Step End and Start 2 = 18
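
For step counts that are not on the cheat sheet, the same 20% / 40% / 40% split can be computed. The sketch below reproduces the table above; the function name is made up, and rounding to the nearest whole step for totals that are not multiples of 5 is my own assumption.

```python
def split_steps(total_steps: int) -> tuple[int, int]:
    """Return (Step End and Start 1, Step End and Start 2) for a 20/40/40 split."""
    if total_steps < 3:
        raise ValueError("need at least 3 steps to give each KSampler one step")
    # Integer math avoids float rounding surprises; +5 rounds halves up.
    first_boundary = (total_steps * 2 + 5) // 10    # end of the 20% phase
    second_boundary = (total_steps * 6 + 5) // 10   # end of the 20% + 40% phases
    return first_boundary, second_boundary


if __name__ == "__main__":
    for total in (10, 15, 20, 25, 30):
        first, second = split_steps(total)
        print(f"{total} steps: Step End and Start 1 = {first}, Step End and Start 2 = {second}")
```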

For standard quality and the fastest output, I recommend sticking to the defaults: Steps = 10, Step End and Start 1 = 2, Step End and Start 2 = 6.

For higher quality, Steps = 20, Step End and Start 1 = 4, Step End and Start 2 = 12 is ideal.

Going above 20 steps greatly increases generation time with sharply diminishing quality returns when Lightning LoRAs are enabled, so I generally do not recommend it unless you have a powerful GPU and lots of patience.

CFG 1: Default = 2.8

A higher CFG in the first phase helps lock in prompt intent early, giving the model stronger guidance while scene composition and subject identity are still being established. This keeps the initial structure from drifting before motion begins.

CFG 2: Default = 1.8

CFG is lowered during the main motion phase so movement can develop more naturally without becoming over-constrained by prompt pressure. This improves motion fluidity and reduces stiffness when Lightning LoRA is actively driving action.

CFG 3: Default = 1.3

A low CFG in the final refinement phase allows details to settle cleanly without forcing extra prompt corrections that can introduce flicker or instability. This helps preserve coherence while sharpening facial features, clothing, and small motion details.

Only change these if you know what you are doing and trying to achieve.
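
If you do change these values, it helps to keep the three phases side by side and confirm that they still hand off to each other without gaps. The sketch below records the defaults described above and checks that handoff. The class and function names are hypothetical and are not part of the workflow itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    start_step: int
    end_step: int     # 1000 means "run to the end", as in the defaults
    cfg: float
    lightning: bool   # whether the Lightning LoRA is active in this phase
    noise: str        # "high" or "low" noise phase


DEFAULT_PHASES = [
    Phase("KSampler 1", 0, 2, 2.8, False, "high"),
    Phase("KSampler 2", 2, 6, 1.8, True, "high"),
    Phase("KSampler 3", 6, 1000, 1.3, True, "low"),
]


def check_phases(phases: list[Phase], total_steps: int) -> bool:
    """Raise ValueError if the phases leave a gap, overlap, or stop early."""
    if phases[0].start_step != 0:
        raise ValueError("the first phase must start at step 0")
    for previous, current in zip(phases, phases[1:]):
        if previous.end_step != current.start_step:
            raise ValueError(f"{previous.name} and {current.name} do not line up")
    if phases[-1].end_step < total_steps:
        raise ValueError("the last phase ends before the final step")
    return True


if __name__ == "__main__":
    check_phases(DEFAULT_PHASES, total_steps=10)
    for phase in DEFAULT_PHASES:
        print(phase)
```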

Sampler: Default = uni_pc

Scheduler: Default = simple

In my testing, this sampler and scheduler combination gives the best results with WAN 2.2. As with CFG, only change these if you know what you are doing and what you are trying to achieve.

FreeLong:

You can enable or disable FreeLong. Default is FreeLong disabled to save generation time.

FreeLong helps with temporal stability and motion consistency between clips; however, it is VERY SLOW (on average 2X-3X longer than generating without it). Therefore, it is only recommended to enable FreeLong if absolutely necessary. With Wan Continuation Conditioning and SVI-Shot in place, FreeLong should not be necessary most of the time.

Clip 1, 2 and 3 Initial Prompt/Camera Prompt:

This prompt defines the overall shot composition and camera behavior for each clip. Briefly describe the desired camera framing, angle, movement, lighting, and color tone.

If the source image is not photographic, clearly specify the visual style (for example: anime, cartoon, or CGI).

Clip 1 Female/Male (or Female/Female) Subject and Setting Prompt:

This prompt describes the subject or subjects already present in the starting image, along with the environment.

Describe the appearance of each subject as thoroughly as possible, including hairstyle, hair color, facial features, makeup, body build, clothing, clothing colors, and other defining details. Also describe the starting pose, such as whether the subject is standing, sitting, or lying down, and mention any object they are interacting with (for example, a chair, bed, floor, or vehicle).

The setting should also be described in detail, including location, furniture, background elements, lighting sources, and overall atmosphere.

Note: This prompt only needs to be written ONCE. It is automatically reused for Clips 2 and 3, since subject appearance and setting usually remain consistent across all clips.

Clip 1, 2 and 3 Action Prompt:

This prompt controls the motion and actions performed in each clip.

Describe what the subject or subjects should do during each clip, keeping actions clear and sequential. For best results, it is recommended to keep the default wording mostly intact and change only the specific actions needed, as excessive prompt changes can reduce cunnilingus motion reliability.

Clip 1, 2 and 3 Negative Prompt:

This prompt defines what should be avoided in each clip.

The provided defaults are recommended, as they are tuned for stable output. Only add extra negatives if there are specific unwanted elements or behaviors you want to suppress in a particular clip.
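
To illustrate how the prompt fields fit together, here is a minimal sketch of combining a per-clip camera prompt, the shared subject and setting prompt, and a per-clip action prompt into one positive prompt per clip. The function name, the joining order, and the separator are my assumptions for illustration, and the example prompts are placeholders. The workflow's own nodes may combine the text differently.

```python
def build_clip_prompts(camera_prompts: list[str], subject_setting_prompt: str,
                       action_prompts: list[str], separator: str = " ") -> list[str]:
    """Return one combined positive prompt per clip."""
    if len(camera_prompts) != len(action_prompts):
        raise ValueError("each clip needs one camera prompt and one action prompt")
    combined = []
    for camera, action in zip(camera_prompts, action_prompts):
        # The subject/setting prompt is written once and reused for every clip.
        parts = [camera, subject_setting_prompt, action]
        combined.append(separator.join(p.strip() for p in parts if p.strip()))
    return combined


if __name__ == "__main__":
    prompts = build_clip_prompts(
        camera_prompts=["Static medium shot, soft daylight."] * 3,
        subject_setting_prompt="A woman with short black hair sits on a grey sofa in a small living room.",
        action_prompts=[
            "She turns her head toward the window.",
            "She stands up slowly.",
            "She walks toward the door.",
        ],
    )
    for number, prompt in enumerate(prompts, start=1):
        print(f"Clip {number}: {prompt}")
```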
