I originally tried to whip up a quick LoRA with just 100 ingredients, but the flavour never came through. So I cleared the counter, started from scratch, and slow-cooked a full fine-tune instead. I gathered over 8,000 diverse visual ingredients and let them simmer at a whisper-low learning rate for over 648,000 steps. Low heat. Long time. No shortcuts.
This model started life as Blackhole, a full fine-tune of Qwen-Image-2512. At epoch 81 the flavour got too strong and signs of overtraining appeared. So I pulled two batches, epoch 71 (balanced) and epoch 81 (intense), and blended them 70/30.
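For anyone curious how a blend like that is done, here is a minimal sketch of a 70/30 weighted average of two checkpoints, assuming both epochs were saved as safetensors files with identical keys. The file names are placeholders, not the actual release files.

```python
# A minimal sketch of the 70/30 blend described above, assuming both
# epochs were saved as .safetensors with identical keys.
# File names are placeholders, not the actual release files.
from safetensors.torch import load_file, save_file

W_BALANCED, W_INTENSE = 0.7, 0.3  # epoch 71 weight, epoch 81 weight

balanced = load_file("blackhole_epoch71.safetensors")
intense = load_file("blackhole_epoch81.safetensors")

merged = {}
for key, t71 in balanced.items():
    t81 = intense[key]
    # Blend in fp32 to avoid bf16 rounding drift, then cast back.
    merged[key] = (t71.float() * W_BALANCED + t81.float() * W_INTENSE).to(t71.dtype)

save_file(merged, "blackhole_70_30_blend.safetensors")
```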
📋 The Pantry (Training Data)
• Yoga & movement references
• Martial arts & combat sports photography
• Candid mobile shots & everyday selfies
• Animation & illustration style references
• Traditional art & painting studies
• Human skin & anatomy close-ups
• Over 6,000 assorted lifestyle & action captures
👨‍🍳 Recipe Cards (Prompts & Workflows)
Every sample image on this page includes the exact prompt and ComfyUI workflow I used. Think of them as recipe cards: download the workflow, drop in the prompt, and you’ll get the same dish. Tweak the seasoning to your taste.
⚖️ Simple Ethical Note
This model is a full fine-tune of Qwen-Image-2512 (Apache 2.0). Use it responsibly for creative, personal, or educational projects. Respect likeness and copyright when crafting your own scenes. Commercial use is at your own discretion.
Comments (9)
Looks amazing but the size of the file, oof
Thanks, I uploaded two versions: fp8mixed (about half the size with almost the same quality) and nvfp4mixed (the quality drops a lot).
What tools did you use to fine-tune it? :) I would welcome any advice I can get :D
Kohya_ss https://github.com/bmaltais/kohya_ss
@eagle1980 much appreciated!
Thank you for this. Qwen is my go-to platform for generating, and it's nice to see other people working with it. Downloading now, will give it a try. Btw, how long did it take to train, and what are the VRAM requirements?
Preparing the dataset took a couple of weeks. I started training on my RTX 3090, and it took about one and a half months to finish 21 epochs. After that, I continued training on an RTX 6000 Pro using SimplePod. As for VRAM requirements, I haven't really measured them myself yet.
@eagle1980 I had no idea it was possible to do a full fine-tune of Qwen on 24 GB. I see you did it with kohya_ss. Do you mind sharing your training settings?
@TimmyHodor Yes, you can train with 24 GB of VRAM and 128 GB of system RAM, but it is very slow. At resolution 1328, with 35 blocks to swap, using full bf16 and no FP8 flags, I was getting about 20 to 24 seconds per iteration. I will share the training command and a dataset.toml example when I get some free time, maybe by the end of the week.
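In the meantime, here is a minimal sketch of what a kohya-style dataset config can look like. It uses the standard sd-scripts TOML schema that kohya_ss wraps; the directory path, repeat count, and batch size are placeholder assumptions, not the settings used for this model.

```toml
# Hypothetical example only - placeholder paths and values.
[general]
caption_extension = ".txt"   # sidecar .txt captions next to each image
shuffle_caption = false

[[datasets]]
resolution = 1328            # matches the training resolution quoted above
batch_size = 1

  [[datasets.subsets]]
  image_dir = "/data/blackhole/lifestyle_action"
  num_repeats = 1
```

As a sanity check on the timeline above: assuming batch size 1, the roughly 8,000 images work out to about 8,000 steps per epoch, so 81 epochs matches the 648,000 total steps from the post, and at 20 to 24 seconds per iteration one epoch on the RTX 3090 takes roughly 44 to 53 hours, which puts 21 epochs right around the month and a half quoted earlier.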