This workflow is fairly complex, but I still want beginners to be able to use it smoothly, so I have built in many explanatory notes.
This is a prompt-based I2I (image-to-image) workflow with very good output quality that is VRAM-friendly: it runs perfectly on an RTX 3060 8GB and can also run on lower-end machines. It has several built-in adjustment switches and can generate five kinds of ControlNet guidance images from the uploaded image, including pose maps, depth maps, and Canny edge maps. IPAdapter keeps the generated person's appearance and clothing accurate. Combining the five preprocessing modes with the IPAdapter and ControlNet switches lets you:
generate based only on the person's pose,
reference the person and the background,
strongly reference both the person and the background,
reference only the person's appearance and clothing,
or reference only the background.
It is best to use this workflow once you understand how IPAdapter and ControlNet work. The workflow uses a fairly large number of third-party nodes, which are not listed here; all of the third-party nodes and the models I use can be found on GitHub and Hugging Face.
1️⃣ IP-Adapter (person appearance / clothing control)
Download
https://huggingface.co/h94/IP-Adapter
Folder
ComfyUI/models/ipadapter/
2️⃣ ControlNet models (pose / depth / edges)
Download (ControlNet models)
https://huggingface.co/lllyasviel/ControlNet
(choose the pose / depth / canny models on that page)
Folder
ComfyUI/models/controlnet/
3️⃣ SAM3 (person segmentation / outline / cutout)
Download
https://huggingface.co/1038lab/sam3
Folder
ComfyUI/models/sam/ (or sam3/, depending on the instructions of the node you use)
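Before loading the workflow, it can help to confirm that each model folder listed above actually contains files. The following is a minimal sketch, not part of the workflow itself: the function name missing_model_dirs is hypothetical, the ComfyUI root path is whatever your installation uses, and the SAM3 folder is assumed to be models/sam (adjust it to sam3 if your node expects that).

```python
from pathlib import Path

# Folder names come from the download list above.
# "models/sam" is an assumption; some nodes may expect "models/sam3" instead.
EXPECTED_DIRS = {
    "IP-Adapter": "models/ipadapter",
    "ControlNet": "models/controlnet",
    "SAM3": "models/sam",
}


def missing_model_dirs(comfy_root):
    """Return the labels whose folder is absent or contains no files."""
    root = Path(comfy_root)
    missing = []
    for label, rel in EXPECTED_DIRS.items():
        folder = root / rel
        # A folder counts as populated if any file exists anywhere under it.
        has_files = folder.is_dir() and any(p.is_file() for p in folder.rglob("*"))
        if not has_files:
            missing.append(label)
    return missing
```

Calling missing_model_dirs("ComfyUI") from the directory that contains your ComfyUI folder returns a list of the model groups that still need to be downloaded; an empty list means every folder has at least one file in it.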
This workflow can be used with either SD1.5 or SDXL, but make sure the IP-Adapter and ControlNet model versions match your base checkpoint:
For SD1.5, download the IP-Adapter and ControlNet models labeled sd15
For SDXL, download the IP-Adapter and ControlNet models labeled sdxl
Mismatched model versions will cause abnormal outputs or outright errors.
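The version rule above can be checked roughly from file names alone. This is only a sketch under the assumption that the files keep the sd15 / sdxl label in their names, as the upstream downloads generally do; the helper names base_family and mismatched_models are hypothetical, and a file without either label is simply skipped rather than judged.

```python
def base_family(filename):
    """Guess 'sd15' or 'sdxl' from a model file name, or None if unlabeled."""
    name = filename.lower()
    if "sdxl" in name:
        return "sdxl"
    if "sd15" in name:
        return "sd15"
    return None


def mismatched_models(base, filenames):
    """Return the file names whose label disagrees with the chosen base ('sd15' or 'sdxl')."""
    bad = []
    for fname in filenames:
        family = base_family(fname)
        # Unlabeled files are ignored: the name alone cannot tell us their family.
        if family is not None and family != base:
            bad.append(fname)
    return bad
```

For example, with an SD1.5 checkpoint, passing a list that contains ip-adapter_sdxl.safetensors would flag that file, while control_sd15_canny.pth would pass. A name-based check cannot catch a mislabeled or renamed file, so treat it as a quick sanity check only.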
The design goal of this workflow is to lower the barrier to entry and the hardware load as far as possible without sacrificing output quality. Its modular, switch-based structure lets users freely combine or disable control modules as needed without modifying the core pipeline. The default configuration is already stable, so reliable results can be obtained through prompts and a few parameter adjustments, even without a deep understanding of every node. At the same time, all key control logic remains visible and traceable, so users can gradually learn how it works once they are familiar with it, and then extend or customize it.
Description
Version 1.0 is the first official stable release of this workflow, marking that its overall structure and core design are complete and have been validated in practice. This release prioritizes stability, reusability, and low VRAM usage, so that it runs smoothly on mid- to low-end GPUs while maintaining fairly high generation quality.
In this version, the workflow's module layout and control logic are fixed, and all key features are integrated as switchable components, which avoids frequent changes to the core pipeline. The workflow is intended as a usable tool rather than an experimental example, making it suitable as a base template for everyday I2I generation.
Future versions will keep the current structure and gradually refine parameter presets, node organization, and documentation, without making breaking changes to the core generation logic.






