Description
(for best results, read the full description - usage guide below)
This is a merge of several anime-based and cartoon-based models that aims for a somewhat cartoony anime style, closer to what you would actually see in anime than the more common hyper-detailed anime models.
Versions 3, 4, and 4.5 include some custom training to further enhance the style. More details available in "About this Version" on the sidebar. Most positive prompts for the v3 sample images were randomly generated.
Usage Guide
(highly recommended) Use a negative embedding for best results
I use verybadimagenegative_v1.3 (all examples use this)
Place the downloaded file into the "embeddings" folder of the SD WebUI
In the negative prompt, paste "verybadimagenegative_v1.3"
(highly recommended) Upscaling at 2x (or more) is important to getting a good result. I would recommend the following settings:
Denoising strength of 0.45
Use the "R-ESRGAN 4x+ Anime6B" upscaler for a flatter look, or use "Latent" for a bit more detail
Leave hires steps at default of 0 (equal to your generation steps)
(highly recommended) Use DPM++ 2M Karras as the sampler. Other samplers can yield odd artifacts, though your mileage may vary depending on your specific setup.
(only for version 3 and below) Use the dynamic thresholding plugin (all example images use it, with CFG scale 10 and mimic CFG scale 7): https://github.com/mcmonkeyprojects/sd-dynamic-thresholding
Set the CFG scale to 10.0
Click the checkbox "Enable Dynamic Thresholding (CFG Scale Fix)"
Set the Mimic CFG Scale to 7
If you don't want to use this plugin, then set the CFG scale to 5 or 6
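If you drive the WebUI through its HTTP API instead of the interface, the settings above might translate into a txt2img request body roughly like the one below. This is a minimal sketch, assuming an AUTOMATIC1111-style WebUI started with the --api flag. The function name build_payload, the 512x512 base size, and the step count of 25 are assumptions for illustration and do not come from this page. The sampler string assumes a WebUI version that lists "DPM++ 2M Karras" as a single sampler name. The sketch does not enable the dynamic thresholding plugin; that has to be configured separately.

```python
import json

def build_payload(prompt, negative_prompt, cfg_scale=6.0):
    """Build a txt2img request body mirroring the recommended settings.

    cfg_scale: 5 or 6 without the dynamic thresholding plugin; the guide
    uses 10 (mimic 7) with the plugin on version 3 and below.
    """
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "sampler_name": "DPM++ 2M Karras",
        "steps": 25,                # assumption: not specified in the guide
        "width": 512,               # assumption: typical SD1.5 base size
        "height": 512,              # assumption: typical SD1.5 base size
        "cfg_scale": cfg_scale,
        "enable_hr": True,          # hires fix on
        "hr_scale": 2,              # upscale at 2x (or more)
        "hr_upscaler": "R-ESRGAN 4x+ Anime6B",  # or "Latent" for more detail
        "denoising_strength": 0.45,
        "hr_second_pass_steps": 0,  # 0 = same as generation steps
    }

payload = build_payload("a flaming meteor", "verybadimagenegative_v1.3")
print(json.dumps(payload, indent=2))
```

The resulting JSON would be sent as a POST body to the WebUI's /sdapi/v1/txt2img endpoint (by default on http://127.0.0.1:7860).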
This model is very easy to prompt, and does not require a ton of prompt engineering to get good results. The following format will yield decent results:
Prompt:
(best-quality:0.8), perfect anime illustration, <normal description of the image, e.g. a woman running in tokyo at night, a flaming meteor, etc.>
Negative:
(worst quality:0.8), verybadimagenegative_v1.3, (surreal:0.8), (modernism:0.8), (art deco:0.8), (art nouveau:0.8)
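The prompt format above is easy to template if you generate many images. The helper below is a small sketch of that idea; the names build_prompt and NEGATIVE_PROMPT are hypothetical, and only the tag strings themselves come from the guide.

```python
# Negative prompt copied from the recommended format above.
NEGATIVE_PROMPT = (
    "(worst quality:0.8), verybadimagenegative_v1.3, (surreal:0.8), "
    "(modernism:0.8), (art deco:0.8), (art nouveau:0.8)"
)

def build_prompt(description):
    """Prefix a plain image description with the recommended quality tags."""
    # Trim stray whitespace and a trailing comma so the tags join cleanly.
    description = description.strip().rstrip(",")
    return f"(best-quality:0.8), perfect anime illustration, {description}"

print(build_prompt("a woman running in tokyo at night"))
print(NEGATIVE_PROMPT)
```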
The model is capable of NSFW
Description
The new version was trained against a photorealistic dataset, then this trained version was subtracted against v2 using block merging to effectively invert the training - amplifying the cartoony parts and de-emphasizing the photoreal parts. The result is better performance at lower CFG scales, and a style that corrects the slight over-correction towards realistic in the previous version. The results are much more like what I had originally intended v2 to be - cleaner lines than v1, better colors at lower CFG scales (CFG of 6 will work quite well now if you don't have the dynamic threshold scaling plugin), and a more consistently hand-drawn look.
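One way to picture the "subtract the trained version from v2" step is as an extrapolation away from the photoreal fine-tune: for each weight, start from v2 and push it in the direction opposite to where the training moved it, with a strength chosen per block. The sketch below only illustrates that reading and is not the actual recipe used for this model. The function name, the per-block strength table, and the toy weights are all hypothetical, and real checkpoints hold tensors rather than short lists of floats.

```python
def invert_finetune(base, tuned, block_alpha, default_alpha=0.0):
    """Return base + alpha * (base - tuned), with alpha looked up per block.

    base, tuned: dicts mapping weight names to lists of floats.
    block_alpha: dict mapping a block prefix (text before the first '.')
                 to the strength of the inversion for that block.
    """
    merged = {}
    for name, base_w in base.items():
        block = name.split(".")[0]
        alpha = block_alpha.get(block, default_alpha)
        # (b - t) points away from the fine-tuned weights, so adding it
        # amplifies whatever the fine-tune moved away from.
        merged[name] = [b + alpha * (b - t) for b, t in zip(base_w, tuned[name])]
    return merged

# Toy stand-ins for v2 and the photoreal-trained copy (hypothetical values).
v2 = {"input.0": [1.0, 2.0], "output.0": [0.5, -1.0]}
photo = {"input.0": [1.5, 2.0], "output.0": [0.5, -2.0]}

result = invert_finetune(v2, photo, {"input": 1.0, "output": 0.5})
print(result)
```

Weights the fine-tune left untouched stay exactly as they were in v2, while weights it changed move the other way by an amount scaled per block.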
Comments (29)
This model is amazing for creating the base composition. I will check out v3 for sure
VERSATILE ASF !!!!!!!!!!!!! thank you!!!
Can we get a pruned version of V3?....
Yup! Just uploaded it, should be 1.99GB
@bigbeanboiler Thanks!
great! love this kind of style
Does this only do females? No males or animals?
I like this style a lot. Is it okay to ask what this was trained on?
Also is there a specific vae this can be used with?
The training process for the model was a bit unconventional - I trained the model against a collection of ~400 random photos and then merged the model against itself (version 2, specifically), subtracting the photoreal weights out and amplifying the non-photoreal weights. The results are a more toon-like style than v2 while not actually needing any hand-illustrated training data. Version 1 and Version 2 of this model were conventional merges, so I'm unsure what they would've been trained against.
As for VAEs, I'd recommend using the standard vae-ft-mse-84000 as that's what the model was trained with.
Seems like V3 is more likely to produce humans, while V2 works with non-human subjects
Hi! I just wanted to thank you for publishing this model, it is sooo pretty!! Just one question, do you recommend clip skip 1 or 2? I'm seeing both in the community examples.
I use clip skip 1 myself, though I haven't messed around at all with clip skip 2 so it's possible it could work well
Excellent model. Myself, I still prefer v2; something got lost in the lighting in V3. It has more accuracy but less dramatic images
Will you make an XL version in the future?
I'd like to, but the architecture is quite a bit different than SD1.5 so it'll take a while to get the tooling in place and learn the quirks of the system enough to train a new model
What comprehensively complex concoction of kooky narcotics did you consume in order to construct some of those prompts in the preview images? Sharing is caring, just saiyan.
This is my favorite model by far, 11/10. But it seems to produce more clones/twins/floating heads in the strangest places than most models. I get that the higher resolutions cause that, but this seems to work best on those higher res. Any tips to negate it as best you can?
Oh lol I used a random prompt generator so that I wouldn't have to come up with a wide variety of sample images myself. As for the floating head and twin issue, unfortunately I'm not really sure what the reason for it is. Best I can offer would be to reduce the hires fix denoising strength. I'd like to resolve these issues in v4, but I'm still working on getting a good training dataset together for it.
Do you have any plans for SDXL?
This is my favorite model, I absolutely love it. <3
I wanted to ask if you can kindly release an inpainting-aware model of this - they're specifically trained with additional unet channels and an understanding of masking?
For sure, I'll work on getting out an inpainting version somewhat soon
@bigbeanboiler Thanks for the reply, this model is really the best 2D model I've come across. An inpainting version would just complete the package for the compositing workflow.
Great work! Thank you for sharing.
Hi, I saw three models when downloading. What are the differences between the three models of different sizes?
the 1.99gb is the pruned version of the model
Hello! I was just curious if there's a way to get this model working correctly on ComfyUI? Thanks!
Hey, I really like what you have done with this model and I would like to discuss a bit about it on Discord, would you be down for that?
Sure, my discord username is the same as my civit username
Looks great!
diffusers weights if anyone needs them: NickKolok/flat2DanimergeV20



















