Click here to try online first:
Workflow: Lip-Sync Speaking/Singing – LTX2.3 Image-to-Digital Human – Auto Expansion – Module Optimization – No Subtitles
Experience link: https://www.runninghub.ai/post/2038618856104665090/?inviteCode=rh-v1401
Workflow: Text-to-Lip-Sync Video – Speaking/Singing – LTX2.3 Text-to-Digital Human – No Subtitles – Module Optimization
Experience link: https://www.runninghub.ai/post/2038618886479814658/?inviteCode=rh-v1401
Workflow: LTX2.3 – Fully Automated Prompt – Text-to-Video
Experience link: https://www.runninghub.ai/post/2031218445026594817/?inviteCode=rh-v1401
Workflow: LTX2.3 – Fully Automated Prompt – Image-to-Video – Modular Tuned Edition
Experience link: https://www.runninghub.ai/post/2031218459471777794/?inviteCode=rh-v1401
Workflow: LTX2.3 – Fully Automated Prompt – First/Middle/Last Frame Three-Image-to-Video
Experience link: https://www.runninghub.ai/post/2035325465820405761/?inviteCode=rh-v1401
Name: LTX 2.3 Image-to-Lip-Sync Meme Workflow (Modular / Ultra-Fast / Action-Supported)
Introduction:
Built on the open-source LTX 2.3 model and optimized for image-to-lip-sync video. It lets any image (people or animals; medium or close-up shots) sing or speak in accurate sync with the uploaded audio, while actions (walking, waving, jumping, etc.) are controlled via prompts.
Core Advantages:
- Extremely fast: a 10-second video at 1280 resolution takes only 3-6 minutes, and the workflow's second run is even faster
- Batch of 5: tested running 5 workflows simultaneously, producing over a dozen finished videos per day
- Modular grouping: upload → dimension setting → audio → Latent creation → upscale; clear at a glance and easy to modify
- With a fixed shot, the generated clip is almost indistinguishable from the original; ideal for memes, entertainment, and virtual streamers
- Supports MP3 audio (if an error occurs, re-export the file once from CapCut)
- Avoid prompts such as "look down" or "turn around"; they break character consistency
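When CapCut is not at hand, a plain ffmpeg re-encode usually clears the same kind of MP3 upload error. This is our suggestion rather than part of the original workflow; the helper below only builds the command string, leaving it to the user to run ffmpeg:

```python
def reencode_cmd(src: str, dst: str, bitrate: str = "192k") -> list[str]:
    """Build an ffmpeg command that re-encodes an MP3 into a clean
    constant-rate stream, which typically fixes upload errors caused
    by unusual encoder metadata. Assumes ffmpeg is installed."""
    return [
        "ffmpeg", "-y",           # overwrite output without asking
        "-i", src,                # input audio file
        "-vn",                    # drop any embedded cover art
        "-acodec", "libmp3lame",  # standard MP3 encoder
        "-b:a", bitrate,          # constant audio bitrate
        "-ar", "44100",           # common sample rate
        dst,
    ]

# Print the command to paste into a terminal
print(" ".join(reencode_cmd("song.mp3", "song_clean.mp3")))
```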
Workflow Structure:
1. Upload image (medium/close-up, clear lip movements)
2. Set dimensions (longest side 1280)
3. Upload audio (10-15 seconds recommended)
4. The Latent module references both the image and the audio, scaling them at the same time
5. Final upscale and output
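Step 2 above (longest side 1280) can be sketched as a small helper. The snap-to-a-multiple-of-32 rounding is our assumption, since video-latent models typically require dimensions divisible by a fixed block size; the exact divisor may differ for LTX 2.3:

```python
def compute_dims(width: int, height: int,
                 long_side: int = 1280, multiple: int = 32):
    """Scale (width, height) so the longer side equals long_side,
    keeping aspect ratio and snapping both sides to a multiple
    (32 here is an assumption; check your model's requirement)."""
    scale = long_side / max(width, height)

    def snap(x: float) -> int:
        # Round to the nearest allowed size, never below one block
        return max(multiple, round(x * scale / multiple) * multiple)

    return snap(width), snap(height)

print(compute_dims(1024, 768))  # landscape source → (1280, 960)
print(compute_dims(768, 1024))  # portrait source  → (960, 1280)
```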
Results Showcase:
This workflow was used to create the "round-headed elderly" viral singing meme video (see example). Speaking lip-sync is equally strong; paired with Qwen (Qianwen) voice design, it can also drive digital humans.
Note:
LTX2.3 is the open-source model closest to cinema-grade in texture and color control.