I. Introduction
AnimaYume is a text-to-image model fine-tuned from Anima, a high-quality anime-style image generation model developed by CircleStone Labs. Anima itself builds upon Cosmos 2, a model developed by NVIDIA’s research team.
II. Information
For version 0.1:
This model is a preview version fine-tuned from the Anima base model using a custom dataset. Training was conducted across multiple resolutions ranging from 768 to 1280 pixels, with a primary focus around 1024. The goal of this release is to improve stability and minimize unwanted artifacts when producing high-resolution images.
Note: All example images for this version were generated at 1024x1536 or 1536x1024.
For version 0.2:
This model is a continuation of AnimaYume v0.1. In this version, I improved the quality of my dataset and used several techniques to prevent oversaturation and low-quality outputs. In my testing, prompt coherence was better than in v0.1, and the model remained very stable when generating images at a resolution of 1536.
Note: I am still waiting for the final version of Anima and testing some methods to make my training process faster. I know the license might make the model less popular, but I only care about whether the model is good or not. I’m aware that many others use better licenses, but I’m too lazy to spend a bunch of money training a model from scratch.
For version 0.25:
This version was trained on Anima Preview 2. Due to several issues with the base model, such as overfitting, black/white borders, quality inconsistencies, and problems with artist tags, I decided to focus primarily on improving the model’s knowledge, reducing these issues, and making it as stable as possible.
Note: In this version, I did not attempt to improve the model’s style. I tried doing so, but it caused the model to forget some of its existing knowledge. The training process is similar to v0.2, but the dataset has been adjusted to better address the issues present in Anima Preview 2.
For version 0.3:
This version was trained using Anima Preview 2. It is an experiment with a new training method for the model. You can consider it another branch of AnimaYume 0.25, developed in parallel. However, this version uses new techniques and a larger dataset compared to v0.25.
Note: In this version, I experimented with a new training approach, so the model is slightly different from v0.25. Additionally, all example images were generated using prompts shared by users on CivitAI, in order to evaluate this new method.
For version 0.4:
This version was trained on Anima Preview 3 using a custom dataset. In this release, I improved prompt understanding and artist style. Based on my testing, some artist styles match my expectations, although I haven’t tested everything in detail since I’m currently quite busy :<. Additionally, I fixed several issues from Anima Preview 3 that also appeared in Preview 2.
Note: I’ve only tested with simple test cases, not comprehensively, so if you encounter any issues, feel free to let me know. I also used a larger AI computing cluster to speed up the training process :D.
All example images were generated using prompts shared by users on CivitAI, as I wanted to evaluate the model’s performance.
For version 0.5:
This version was trained on Anima Base v1.0 using my custom dataset (a mix of a small e621 dataset and Danbooru). In this release, I added many new characters and improved the existing ones. I also enhanced support for various artist styles, allowing the model to generate results that are much closer to the original styles. In addition, the model now understands some concepts and knowledge from e621, although the support is still limited.
Notes: I’ve only tested the model with a few simple test cases so far, so if you encounter any issues, feel free to let me know. This release can be considered a demo version showcasing my new training method, which focuses on preserving existing knowledge while adding new knowledge at the same time. The release also came sooner because I was finally able to use all the resources I had available :D
All example images were generated using prompts shared by users on CivitAI, as I wanted to evaluate the model’s performance using real user prompts.
III. File Information
This file contains only the diffusion model and does not include a VAE or text encoder. To use it properly, you will need to download those components from the link here
IV. Notes & Feedback
This is an experimental fine-tuned release, and I am waiting for the final version of Anima to tune it :D
Your feedback, suggestions, and creative prompt ideas are always welcome. Every contribution helps make this model even better!
V. Acknowledgments
Big thanks to narugo1992 for the dataset contributions.
Credit to CircleStone Labs and NVIDIA for the fantastic base model architecture.
If you'd like to support my work, you can do so through Ko-fi!
Comments (77)
Hi, I’d like to clarify that I trained two versions for Anima Preview 2 in parallel. This version is quite different from v0.25, as it uses an experimental training method aimed at making the model more stable and improving its ability to understand a wider range of concepts. I also published a post comparing these models; here is the link: https://civitai.com/posts/27467169
I would really appreciate any feedback on this version. Thanks!
The ice skating image looks quite promising. Euler A didn't work so well for it in preview2 because the icy dust would have a very grainy texture, with the grains showing as clearly visible, detached dots. AnimaYume 0.3 seems to make a nice dusty texture even with Euler A (in preview2 I had to use an SDE sampler to make it pop), so lots of improved detail while understanding the concept with 0.3. Nice!
Wow I just looked at this like an hour ago, and now suddenly v3 is out. Awesome!
V3 is definitely a lot better at concept understanding I think, far from where I'd like it with some things still
needs more testing but I think edges for v3 are looking kind of pixelated with some artists compared to v0.25 and preview2
Try removing all references to resolution from your prompt, e.g. highres, absurdres, lowres, etc.
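As a rough illustration of the suggestion above, here is a minimal Python sketch that drops resolution tags from a comma-separated tag prompt. The function name `strip_resolution_tags` and the tag set are hypothetical and contain only the three tags named in the comment. Weighted syntax such as `(highres:1.2)` is not handled.

```python
# Resolution-related tags named in the comment above; extend as needed.
RESOLUTION_TAGS = {"highres", "absurdres", "lowres"}


def strip_resolution_tags(prompt: str) -> str:
    """Drop resolution tags from a comma-separated tag prompt."""
    tags = [t.strip() for t in prompt.split(",")]
    # Keep non-empty tags whose lowercase form is not a resolution tag.
    kept = [t for t in tags if t and t.lower() not in RESOLUTION_TAGS]
    return ", ".join(kept)


print(strip_resolution_tags("1girl, highres, ice skating, absurdres, smile"))
# prints: 1girl, ice skating, smile
```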
@Gyer I wasn't using those to begin with. I noticed that it's not as prevalent if you do natural prompting instead though
Is there any way to minimize the white/black backgrounds that get generated when prompting?
All my v0.3 testing was done with Preview2 trained Loras (examples in gallery):
General thoughts - v0.3 has better details (hands/eyes) and slightly better prompt adherence compared to v0.25
v0.3 is AMAZING with artstyle Loras. I trained a western and anime artstyle Lora on Preview2 and they both look great on v0.3 and significantly better than v0.25. To the point that I will only use v0.3 with my artstyles Loras from now on.
V0.3 also works great with character Loras, however, it can sometimes apply "too much" detail, which can affect the original style of a character. Using prompts like amazing quality, newest, absurdres, score_9, I noticed it caused the model to add/hallucinate unnecessary details. This is not a bad thing per se, just something to be mindful of. I would suggest using v0.3 without quality prompts first.
tl;dr v0.3 is another great checkpoint. Works especially great with artstyle Loras.
I'm using ForgeNeo. I had to disable the --xformers flag with v0.3; I didn't have that problem with v0.25.
Error I got:
AcceleratorError: CUDA error: invalid argument
Search for `cudaErrorInvalidValue' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
With 0.3, even when I wrote the prompts in plain natural language without specifying anything, the camera angles, head orientations, and poses all ended up rigidly fixed, resulting in almost identical images every single time. 0.25 is much better in this regard and actually gives me the nice random variations I was looking for.
The dataset for these is a lot bigger than that of the Illustrious models, or am I wrong?
v0.3 has better aesthetics and overall better details (a smartphone has an actual rear camera instead of nonsense blobs), but it lost diversity in body types. It is much harder to prompt away from the boring "default" body toward something like plump or mature, especially for known characters. v0.25 is better in that regard, almost as good at variety as base.
did it work on reforge? i get TypeError: 'NoneType' object is not iterable help me pls :(
Which schedule type are you using?
I used normal
@duongve13112002 thanks!
eulerA+beta57, 30 steps, CFG 2.0-5.5 (higher CFG is more cartoonish)
@nortv88855 are you using hires fix/upscale?
What's the best upscaler for this model? I tried some, but they always give some artifacting.
SeedVR
Any upscaler is good, but if you're talking about hires fix, you don't need to upscale. Pass the latent from the 1st KSampler to the 2nd KSampler and set steps: 15-20, denoising: 0.3-0.4, sampler: euler_a/er_sde, scheduler: sgm_uniform/simple.
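To keep the second-pass numbers above in one place, here is a minimal Python sketch that records the suggested ranges and flags settings that fall outside them. `SecondPass` and `check_second_pass` are hypothetical helpers, not part of ComfyUI. The sampler and scheduler strings are copied from the comment and may be spelled differently in a given UI.

```python
from dataclasses import dataclass

# Ranges and names copied from the comment above. The identifiers a given UI
# exposes may be spelled differently; treat these strings as assumptions.
STEP_RANGE = (15, 20)
DENOISE_RANGE = (0.3, 0.4)
SAMPLERS = {"euler_a", "er_sde"}
SCHEDULERS = {"sgm_uniform", "simple"}


@dataclass(frozen=True)
class SecondPass:
    """Hypothetical container for second-sampler settings (not a ComfyUI type)."""
    steps: int
    denoise: float
    sampler: str
    scheduler: str


def check_second_pass(p: SecondPass) -> list:
    """Return human-readable deviations from the suggested ranges."""
    problems = []
    if not STEP_RANGE[0] <= p.steps <= STEP_RANGE[1]:
        problems.append(f"steps={p.steps} outside {STEP_RANGE}")
    if not DENOISE_RANGE[0] <= p.denoise <= DENOISE_RANGE[1]:
        problems.append(f"denoise={p.denoise} outside {DENOISE_RANGE}")
    if p.sampler not in SAMPLERS:
        problems.append(f"sampler={p.sampler!r} not in suggested set")
    if p.scheduler not in SCHEDULERS:
        problems.append(f"scheduler={p.scheduler!r} not in suggested set")
    return problems


print(check_second_pass(SecondPass(18, 0.35, "er_sde", "sgm_uniform")))
# prints: []
```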
topaz gigapixel's cgi model
@NanahiraFan how do you upscale but not use hires fix im confused what you mean exactly
@skindentation321 they are talking about comfy. What I can recommend for forge is either keeping it x1.5 or lower and 0.3 or less denoise, or using tiled upscale like Ultimate Upscaler.
@NanahiraFan ah didnt think of that, will try it. I usually use ultimate upscaler with various model like remacri, esrgan etc but the result isn't really good. so I'm looking for one that is great with anima model.
@Leonmitchelly you can upscale with remacri/esrgan and use USDU too. But the tile size must match the base image dimensions. If the initial image is 1024*1536 then the tile size must be 1024*1536. You can use the same sampler and scheduler as I suggested in my previous comment. Denoising 0.2-0.25, tile padding 256/512/768, and mask blur 32.
@skindentation321 you can't do hires with Anima using the same method that you did with Illustrious/SDXL models. The models have different archs and different VAEs, so they process latents differently. You'll get grain and leftover noise if you use the Illustrious/SDXL hires method. USDU is the best method to refine upscaled images generated by Anima, but the upscaled image must be 2x or 4x the base image size and the tile size must match the base image dimensions.
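The tile rule described above can be worked through with a little arithmetic. The sketch below is a hypothetical helper (`usdu_plan` is not part of Ultimate SD Upscale) that derives the upscaled size and tile size from a base image, and carries along the denoise, padding, and mask blur values quoted in the thread. The tile count assumes tiles are laid out on a simple grid of ceil(width / tile width) by ceil(height / tile height).

```python
import math


def usdu_plan(base_w: int, base_h: int, scale: int) -> dict:
    """Derive tiled-upscale settings from the base image size (illustrative only)."""
    if scale not in (2, 4):
        raise ValueError("the comments above suggest 2x or 4x only")
    up_w, up_h = base_w * scale, base_h * scale
    # Per the comments: tile size equals the base image dimensions.
    tile_w, tile_h = base_w, base_h
    cols = math.ceil(up_w / tile_w)
    rows = math.ceil(up_h / tile_h)
    return {
        "upscaled": (up_w, up_h),
        "tile": (tile_w, tile_h),
        "tile_count": cols * rows,           # assumed simple grid layout
        "denoise_range": (0.2, 0.25),        # values quoted in the thread
        "tile_padding_options": (256, 512, 768),
        "mask_blur": 32,
    }


plan = usdu_plan(1024, 1536, 2)
print(plan["upscaled"], plan["tile"], plan["tile_count"])
# prints: (2048, 3072) (1024, 1536) 4
```

Under these assumptions, a 4x upscale of the same image would be refined in 16 tiles, which gives a rough idea of how much longer the pass takes.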
@InvictusAI I haven't tried SeedVR yet but does it work well on anime picture? I've only seen people use it on realistic picture.
@Kaze111 Yes it does quite alright with anime. It's miles better than GAN upscale (like ultrasharp), but not as good as Ultimate SD upscale in SDXL.
So far it's a bit unstable but usable. Some very well-known artist styles or characters the checkpoint doesn't understand, while other less-known ones it does; a very "gacha" checkpoint. But I like the idea of texts, and I hope this becomes better than illu/noobai...
Hi everyone, I’ve got some good news and some not-so-good news to share.
Good news: I’ve figured out how to fix the jagged (aliasing) issue in the images generated by the Anima model.
Bad news: The fix will be included in the next version of Anima, not the current one. I’d prefer to save my resources and focus on improving the upcoming release.
The issue isn’t related to the training data, so no worries there; it mostly comes from some suboptimal settings on my side 😄
A bit more about me: I’m not part of any organization or team, just an individual exploring and learning how diffusion models work. I’ve trained various models, including my own custom setups (for example, combining Qwen 3.5 with a modified Flux Klein architecture as an image encoder, along with Flux VAE .2). Most of what I do is driven by personal interest and experimentation.
My model is free to use, and you’re welcome to do anything with it as long as it follows the original license. I don’t expect support honestly, just knowing that you like my model already means a lot to me 😄 For me, spending money on this is just part of learning and going deeper into these models.
I love you duongve
You're the reason why I moved on from Illustrious to Anima. I appreciate your work.
W
thank you
Was about to ask about this. Love your model ❤️
I log in several times every day to check if there's a new version of this wonderful model that fixes the problem with the jagged issue, may God grant us patience and the author the strength until new version release :3
@duongve13112002 Anima Preview 3 just released, trained more at 1024x1024. Maybe consider trying it to see if it's worth using as a new base?
Version 0.3 is absolutely amazing! It has saved my LoRa. Fully exploited the potential of this clip
A lot of artists' tags can be recognized; there's no need to use Lora. The quality is very good!
It's a great model; v0.3 is probably my favourite Anima model right now. The only issue is those weird jagged lines around the edges that make the images look like a game without antialiasing, but in terms of prompt adherence, understanding, style and composition the model does an amazing job.
Seems super cool. A LOT better than v0.2 for prompt adhering. My only issue is it loves to censor the penis/pussy a lot of the time. Even when adding nsfw to the prompt or (censor, mosaic censoring) to the negative.
Just realised this may be more due to the fact I train on anima official and it's that causing the issue, even though I generate on anima yume
I had the same issue but it was easily solved by adding "uncensored" in the positive prompt
@NanahiraFan OMG actual life saver. Seems the NSFW tag has absolutely no effect but uncensored works in its place. Cheers
Was testing and I really recommend trying this model with Res Multistep (sampler) if you have the option available. It generates very detailed images when compared to the same seed on Euler.
EDIT: Don't use Res Multistep Ancestral, it makes a mess
v03 of anima was released and this works on it too, thanks! My prompt quality improved a lot. Just a question: which schedule type should I combine it with?
@zakotsuko sgm_uniform
Thanks for the tips! v0.3 jagged lines can be neutralized with Res Multistep (sampler) + sgm uniform
I really recommend RES2M+bong_tangent
20step is better Res Multistep 30 step
@zakotsuko Schedule on simple, normal or beta. I honestly don't have a preference.
@NTR_BLACK I installed it and will give it an honest review. Comparing it with ResMultistep+normal @20steps with 5CFG, I found that RES-2m+bong_tangent creates a completely different style (same character, same setting, just different style). When comparing Resmultistep with res2m and Euler, I found that increased detail only happens with multistep as Res2m and euler are closer to each other in detail. Furthermore comparing Multistep to Euler is like adding an extra detail layer on top, whereas res2m changed style, pose, direction and (in this example) food on the table. Summarized as (checked ~20sets of 3 images on 20 steps 5CFG and bong_tangent):
ResMultistep: +detail, -prompt adherence (slightly)
res2m: +image variation, -small detail
Try a prompt with wrinkled clothing and compare both.
Best detail+prompt: base Res-2M+Bong_tangent, then Refine with Res Multistep (same schedule).
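A same-seed comparison like the one summarized above can be planned with a few lines of Python. This is only a sketch of the bookkeeping: `comparison_jobs` is a hypothetical helper, the sampler and scheduler strings are taken from the thread and may be spelled differently in a given UI, and the defaults (20 steps, CFG 5, bong_tangent) reflect one reading of the comment. Nothing here calls a real image generator.

```python
from itertools import product

# Sampler names as written in the thread; exact identifiers are assumptions.
SAMPLERS = ["euler", "res_multistep", "res_2m"]


def comparison_jobs(prompt, seeds, scheduler="bong_tangent", steps=20, cfg=5.0):
    """Build one job per (seed, sampler) pair so images differ only by sampler."""
    return [
        {
            "prompt": prompt,
            "seed": seed,
            "sampler": sampler,
            "scheduler": scheduler,
            "steps": steps,
            "cfg": cfg,
        }
        for seed, sampler in product(seeds, SAMPLERS)
    ]


jobs = comparison_jobs("1girl, wrinkled clothing", seeds=range(20))
print(len(jobs))
# prints: 60
```

Grouping the results by seed then gives sets of three images that can be compared side by side, as the commenter did.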
@danque this is a little hard, would you be able to share a general workflow for anima with upscale and common things like facedetailer? would mean a lot, thank you!
@danque I also used rescale CFG; it may have an impact as well.
@msiaigens I have no workflow since I switched over to SwarmUI. But you can use the regular SDXL templates from comfyui itself and change the model requirements to match Anima's, there are some with face detail. My comment is only focused on changing the Sampler to Res Multistep ;), but I found Res2M recommendation from @NTR_BLACK Also quite good.
@danque alright, thanks for the info. First time I've heard about SwarmUI, might have to check it out. If it's not a bother, what do res multistep and res2m do, and how do I get them? I can't seem to find them. Thanks!
@NTR_BLACK if you check my results on anima official,you will see that my results are very good with res multistep sampler+ sgm uniform
@zakotsuko Hey there, would you mind telling me how to use res multistep? Right now I select res_2m and sgm uniform; do I have to change sampler and scheduler? Which sampler is res multistep? Thanks!
@msiaigens First, I'm using Forge Neo. Second, if you save one of my images, load it in PNG Info, and send it to txt2img, it will copy my settings ^^
@zakotsuko yeah, probably different since I'm on Comfy? I did drop the image directly into Comfy, which loaded its entire workflow, and it's different, but that's probably just the difference between Comfy and Forge Neo.
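The PNG Info route works because generation settings are usually embedded in the image file as PNG text chunks. Below is a minimal standard-library sketch that lists those chunks. `read_png_text` is a hypothetical helper. The key names mentioned afterwards (`parameters` for Forge-style UIs, `prompt` and `workflow` for ComfyUI) are assumptions about what the tools write, and an image that has been re-encoded may no longer carry them.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text(data: bytes) -> dict:
    """Return keyword -> text for the tEXt and iTXt chunks of a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    out = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        pos += 12 + length
        if ctype == b"tEXt":
            key, _, text = body.partition(b"\x00")
            out[key.decode("latin-1")] = text.decode("latin-1")
        elif ctype == b"iTXt":
            key, _, rest = body.partition(b"\x00")
            compressed = rest[:1] == b"\x01"
            # Skip compression flag and method, then language tag and translated keyword.
            _lang, _, rest = rest[2:].partition(b"\x00")
            _tkey, _, text = rest.partition(b"\x00")
            if compressed:
                text = zlib.decompress(text)
            out[key.decode("latin-1")] = text.decode("utf-8")
        elif ctype == b"IEND":
            break
    return out
```

For example, `read_png_text(open("image.png", "rb").read()).get("parameters")` would return the settings string if the image carries one under that key, and `None` otherwise.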
@msiaigens Honestly I think an AI might explain it better than I can. However, if you want the technical part: https://www.preprints.org/manuscript/202503.1432
As for where to get it: it should already be in your ComfyUI sampler list (update if not); in SwarmUI it is under Sampling.
@danque it's weird, I select the res_2m sampler (which is the multistep, correct me if I'm wrong) but I don't see a notable increase in quality or adherence. In fact, for me it is slightly worse or straight up worse than euler a or euler a cfg++. Don't know if this is normal.
@msiaigens res_2m is not the same as Res Multistep with Normal as scheduler. Also Res_2m needs to be used with Bong_tangent as scheduler to work properly.
Patiently waiting for Preview 3
I'm impatiently waiting >:O
*waits faster*
WHERE IS HE? WHERE IS OMNIMAN
I am training it :<. Preview 3 was released 2 days ago; that is not enough time for me to tune the model :<
Does anyone have a workflow for Anima with high-res fix and face detailer? I would appreciate it a lot!
@msiaigens I'm using this: https://civitai.com/models/2426853?modelVersionId=2789988
For upscale I'd rather use this: https://civitai.com/models/2478484/anima-tiled-segs-upscale?modelVersionId=2786588 - only need to add detailers, which is easy to do.
I use that because Anima still has some issues with a higher res fix without a thing like this
Personally, I am using Ultimate SD Upscaler and it works wonderfully. Just use a 0.1 denoise and 10 steps. It fixes most things automatically.



















