Update v2: tripled the dataset and redid the captioning completely.
It should be working OK now (that doesn't mean it won't produce bad anatomy some of the time, but for me at least it's usable now).
The model knows some gags and has a lot of variety in perspectives.
It often doesn't play well with other (concept) LoRAs, so mix them in with care.
Tested with flux1dev + acornisspinning.
I recommend a wide or square aspect ratio.
V1:
I'll just leave this here. It's highly inconsistent and I highly recommend using the pony version instead.
If you insist on using this one, write an excruciatingly detailed prompt and be prepared to waste time on failed generations.
Training notes v1:
Captions: cogvlm, one sentence + danbooru tags
Steps: 3000
Learning rate: 0.0001
Dim/alpha: 16/16
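For anyone who wants to reproduce or compare runs, the v1 notes above can be written down as a small record. This is only a sketch: the class and field names are made up for illustration and are not the flags of any particular trainer. The `alpha_scale` property assumes the usual LoRA convention of scaling the learned update by alpha / dim.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraTrainingNotes:
    """Hypothetical container for the settings listed in the notes."""
    captioner: str
    caption_style: str
    max_train_steps: int
    learning_rate: float
    network_dim: int
    network_alpha: int

    @property
    def alpha_scale(self) -> float:
        # Assumed convention: the LoRA update is multiplied by alpha / dim.
        return self.network_alpha / self.network_dim


# Values copied from the v1 training notes; nothing else is implied.
V1_NOTES = LoraTrainingNotes(
    captioner="cogvlm",
    caption_style="one sentence + danbooru tags",
    max_train_steps=3000,
    learning_rate=1e-4,
    network_dim=16,
    network_alpha=16,
)

print(V1_NOTES.alpha_scale)
```

With dim and alpha both at 16 the scale factor is 1, so the update is applied at full strength before any per-prompt LoRA weight.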
Comments (7)
Tip of the day: try the LoRA at ~2.0
Ha, that does seem to make the pose more consistent. Seems like the way to go.
Have you tried training with a really low dim count? I was really surprised to find that training my most recent model with dim 2 worked way better than 16/16. It took ~1600 steps to converge, but afterwards it had way fewer artifacts.
Haven't had a chance to test more coherently (training takes for fucking ever), but hope it might be a good lead.
No, I didn't. The lowest I did was 8, I think. What learning rate did you train with?
@Joschek I cargo-culted a bunch of hyperparameters from someone, and they were super different from what I had been doing before. I think the final rate was 0.0005 on Cosine w/ Restarts (I had been going back and forth between Constant @ 0.0002 and 0.0004 on other attempts). They also had custom params for the AdamW8bit optimizer: weight_decay=0.01, eps=1e-08, betas=(0.9, 0.999). I'm not sure where this approach came from, but I think it might actually be what Civitai's internal trainer uses?
The final result wasn't perfect, but it was light-years better than what I had before. It's especially surprising to me because it's not a small push, and it's not just a matter of recombining the latents. Flux can't draw dicks at all, so in theory the LoRA needs to teach it how to draw ~most of the frame. I do think it came out much less creative, though, so it might be carbon-copying the concepts more than generalizing them.
I haven't tested to confirm the impact of all of the params, but I tried reverting to 16/16 again to isolate that one factor, and confirmed that dim 16 had a bunch more body horror, weird anatomy, and other random details transferred. I'm going to try the same approach on some other LoRAs and see.
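For reference, the schedule and optimizer settings mentioned in this comment could be sketched roughly as below. This is not the commenter's actual config: the function name, the absence of warmup, and the `num_cycles` value are assumptions, the 0.0005 peak rate is the commenter's own "I think", and 1600 steps is only the approximate figure from earlier in the thread. The formula is the common hard-restart cosine decay, written in plain Python so it runs without any training library.

```python
import math

# Optimizer values quoted in the comment (reported for AdamW8bit).
OPTIMIZER_ARGS = {"weight_decay": 0.01, "eps": 1e-08, "betas": (0.9, 0.999)}


def cosine_with_restarts_lr(step, total_steps, base_lr=5e-4, num_cycles=1):
    """Learning rate at `step` for cosine decay with hard restarts.

    Assumes no warmup; `num_cycles` is a guess, the thread does not state it.
    """
    if step >= total_steps:
        return 0.0
    progress = step / total_steps
    # Position inside the current cycle, in [0, 1).
    cycle_progress = (progress * num_cycles) % 1.0
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * cycle_progress))


# Sample the schedule over ~1600 steps with two cycles (illustrative only).
for step in (0, 400, 800, 1200):
    print(step, cosine_with_restarts_lr(step, 1600, num_cycles=2))
```

Each cycle starts at the peak rate and decays toward zero before jumping back up, which is the main difference from the constant 0.0002 and 0.0004 runs mentioned above.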
@thegipper So I retrained with dim 2 at lr 0.0001 and 0.0005... it didn't yield better results from what I can tell (doing a proper XY plot probably takes me a day on Flux), but it's also probably not worse. Thanks for the suggestion, though!
@Joschek Thanks for checking back in on your results.
I also ended up having some time to train and test another LoRA with that approach, and mine also didn't turn out better (probably slightly worse), so I guess no silver bullet there.