Aesthetic Quality Modifiers - Masterpiece
Training data is a subset of my manually rated quality/aesthetic modifier datasets, including only the masterpiece-tagged images.
ℹ️ LoRAs work best when applied to the base models on which they were trained. Please read the About This Version section for the appropriate base models, trigger usage, and workflow/training information.
Version 5.0 [anima-preview-3] (Latest)
(Temporarily including here as the "About This Version" section is having issues)
Trained on Anima Preview-3-base
Assume that any LoRA trained on the preview version won't work well on the final version.
Recommended prompt structure:
Positive prompt (quality tags at the start of prompt):
masterpiece, best quality, very aesthetic, {{tags}}, {{natural language}}

Updated dataset of 386 images: all masterpiece-tagged images from the Kirazuri (Anima) model version 2 dataset.
Trained at 1024 x 1024, 1280 x 1280, and 1536 x 1024 resolutions.
Previews are mostly generated at 1536 x 1024 or 1024 x 1536.
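As a quick sketch, the recommended structure above is just quality tags first, then booru tags, then natural language; the tag and NL strings below are illustrative placeholders, not from the card:

```python
# Assemble a positive prompt in the recommended order:
# quality tags first, then booru tags, then natural language.
quality = "masterpiece, best quality, very aesthetic"
tags = "1girl, solo, outdoors"  # illustrative placeholder tags
natural = "A girl stands on a hill at sunset."  # illustrative placeholder NL
positive = f"{quality}, {tags}, {natural}"
print(positive)
```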
Training config:
diffusion-pipe commit b0aa4f1e03169f3280c8518d37570a448420f8be
# dataset-anima.toml
resolutions = [1024, 1280, 1536]
enable_ar_bucket = true
min_ar = 0.5
max_ar = 2.0
num_ar_buckets = 9
# Totals
# 386 images
# 15504 samples/epoch
# 153 images
# 48 samples/image - 7344 samples/epoch
[[directory]]
path = '/mnt/d/training_data/0_masterpieces_kirazuri/1536x1536'
repeats = 16
resolutions = [1024, 1280, 1536]
# 44 images
# 48 samples/image - 2112 samples/epoch
[[directory]]
path = '/mnt/d/training_data/0_masterpieces_kirazuri/1280x1280'
repeats = 24
resolutions = [1024, 1280]
# 189 images
# 32 samples/image - 6048 samples/epoch
[[directory]]
path = '/mnt/d/training_data/0_masterpieces_kirazuri/1024x1024'
repeats = 32
resolutions = [1024]
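The per-directory comments above follow a simple rule: samples per image = repeats × number of resolutions listed for that directory. A quick sanity check of the totals (the numbers are copied from the config; the script itself is just illustrative):

```python
# Verify the samples/epoch totals from dataset-anima.toml:
# samples per image = repeats * len(resolutions) for each directory.
dirs = [
    (153, 16, 3),  # 1536x1536 dir: 153 images, repeats 16, 3 resolutions
    (44, 24, 2),   # 1280x1280 dir: 44 images, repeats 24, 2 resolutions
    (189, 32, 1),  # 1024x1024 dir: 189 images, repeats 32, 1 resolution
]
total_images = sum(n for n, _, _ in dirs)
total_samples = sum(n * r * k for n, r, k in dirs)
print(total_images, total_samples)  # 386 15504
```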
# anima-lora.toml
output_dir = '/mnt/d/anima/training_output/masterpieces-v5'
dataset = 'dataset-anima.toml'
# training settings
epochs = 5
# Per-resolution batch sizes
micro_batch_size_per_gpu = [[1024, 32], [1280, 24], [1536, 16]]
pipeline_stages = 1
gradient_accumulation_steps = 1
gradient_clipping = 1
warmup_steps = 100
lr_scheduler = 'cosine'
# misc settings
save_every_n_epochs = 1
activation_checkpointing = true
partition_method = 'parameters'
save_dtype = 'bfloat16'
caching_batch_size = 1
map_num_proc = 8
steps_per_print = 1
compile = true
[model]
type = 'anima'
transformer_path = '/mnt/c/workspace/models/diffusion_models/anima-preview3-base.safetensors'
vae_path = '/mnt/c/workspace/models/vae/qwen_image_vae.safetensors'
llm_path = '/mnt/c/workspace/models/text_encoders/qwen_3_06b_base.safetensors'
dtype = 'bfloat16'
llm_adapter_lr = 1e-6
flux_shift = true
multiscale_loss_weight = 0.5
sigmoid_scale = 1.3
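The flux_shift and sigmoid_scale settings control how training timesteps are sampled. A hedged sketch of sigmoid timestep sampling as commonly implemented in flow-matching trainers (whether diffusion-pipe applies the scale exactly this way is an assumption):

```python
import math
import random

def sample_timestep(sigmoid_scale: float = 1.3) -> float:
    """Draw t in (0, 1) as sigmoid(scale * n) with n ~ N(0, 1).

    Larger scales push more probability mass toward the extremes
    (t near 0 and t near 1); scale 1.0 is the unshifted baseline.
    """
    n = random.gauss(0.0, 1.0)
    return 1.0 / (1.0 + math.exp(-sigmoid_scale * n))

samples = [sample_timestep() for _ in range(1000)]
```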
[adapter]
type = 'lora'
rank = 32
dtype = 'bfloat16'
[optimizer]
type = 'adamw_optimi'
lr = 4e-5
betas = [0.9, 0.99]
weight_decay = 0.01
eps = 1e-8
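With the per-resolution batch sizes above, a rough steps-per-epoch estimate can be derived from the dataset totals. This ignores aspect-ratio bucketing, which splits each resolution into smaller buckets, so the real count will be somewhat higher; the arithmetic is illustrative:

```python
import math

# Samples per resolution, derived from the dataset config:
# 1024 draws from all three dirs, 1280 from the first two, 1536 from the first.
samples = {
    1024: 153 * 16 + 44 * 24 + 189 * 32,  # 9552
    1280: 153 * 16 + 44 * 24,             # 3504
    1536: 153 * 16,                       # 2448
}
batch = {1024: 32, 1280: 24, 1536: 16}  # micro_batch_size_per_gpu
steps_per_epoch = sum(math.ceil(samples[r] / batch[r]) for r in samples)
print(steps_per_epoch)  # 598
```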
Comments (22)
does it work like a regular lora? (anima version)
I would love to see a Klein 9b version of this one <3 !
Which quality modifiers in your collection is most recommended? masterpiece, best quality or complete?
For narrow applications, like simply trying to create a more aesthetically pleasing image, this "masterpieces" version would probably be preferable due to the smaller dataset size.
Excellent model! I've also tried training some style LoRAs on anima and experimented with mixing NL and Booru tags, but in my tests, I haven't seen any significant improvement compared to using pure Booru tags for labeling. Could you please share what aspects of the image you primarily use NL to supplement annotations for, in order to achieve better results? My current strategy is to use NL to describe the overall composition and lighting effects, leaving the subject description to Booru tags.
Hello, thank you!
NL captioning workflows can be quite complicated, I'm learning, but my attempts so far involve having a VLM use the tags as grounding.
I try to have it name a character, then describe their copyright, artist and basic appearance - roughly similar to how inference prompting in NL is recommended for the model.
I also use a different system prompt for single- or multiple-character scenes to try to prevent it from misattributing names.
I've shared my small script for reference here:
https://github.com/motimalu/diffusion-workflows/blob/main/nl-captioning/label-large.py
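As a sketch of the tag-grounded approach described above (the function name, prompt wording, and structure are illustrative assumptions, not the actual label-large.py implementation):

```python
def build_caption_request(tags: list[str], multi_character: bool) -> str:
    """Build a VLM instruction that grounds NL captioning in booru tags."""
    grounding = ", ".join(tags)
    if multi_character:
        # Separate guidance for multi-character scenes to avoid
        # misattributing names between characters.
        guidance = ("Describe each character separately and attribute names, "
                    "copyrights, and artists only to the character they belong to.")
    else:
        guidance = ("Name the character, then describe their copyright, artist, "
                    "and basic appearance.")
    return (f"Using these booru tags as grounding: {grounding}. "
            f"{guidance} Write one natural-language paragraph.")

request = build_caption_request(["1girl", "hatsune miku", "vocaloid"], False)
print(request)
```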
pls upload it on seaart
I've noticed that the Aesthetic Quality Lora (Anima ver.) sometimes seems to override other prompts. It looks like the model prioritizes the Lora's aesthetic goals, leading to less accurate following of specific instructions.
Following up on my previous message about the Aesthetic Quality Lora—I wanted to provide a concrete example.
When using this Lora, the model sometimes modifies specific actions on its own. For instance, if the prompt specifies "hands on ground," the output might show a pose that's close but not exactly what was requested (like one hand on ground, the other relaxed), as if the Lora's aesthetic preferences override the exact instructions.
Without the Aesthetic Lora, the base Anima model follows prompts very accurately. So I think the issue definitely seems Lora-related rather than model-related.
Prompt adherence was likely mostly affected here by training of the LLM adapter, which can now be disabled in the training script I used.
You may be interested in this thread on the matter for Anima: https://huggingface.co/circlestone-labs/Anima/discussions/60#6998b3f592a56ca2caee79fd
@motimalu Imo the prompt adherence is better when I use your lora.
The description for Anima might be misleading. After all, quality tags after the main ones are relevant for ILXL and Noob, but for Anima they should be at the beginning.
Hello, thanks for pointing that out; yes, optimal settings will differ by base model.
I've moved these recommendations to the "About this version" section of the model card.
u got an anima training guide ?
Hello, I'm not sure it's worth investing time in writing a guide for the preview version at this point (hopefully the full release is soon), but you can reference the training software and configuration used in the "About this version" section of the model card.
Did you train the anima lora at 1mp or would you say it would work great together with the Anima Yume finetuned checkpoint that aims for 1536 x 1024 / 1024 x 1536?
Hi, yes, trained at 1024 x 1024 on Anima Preview.
So you might expect a weaker effect with Anima Yume, because it both diverges from the base and targets different resolutions.
@motimalu Thanks, and will you train the lora on anima preview 2 too?
@deitychaser Yes testing a few different datasets on anima preview 2 now
@motimalu Very cool, it's very useful. The main reason I use it isn't even necessarily the quality and detail boost (though that's also a factor), but especially that it lets me get my character back into a bigger frame: if I use a lot of natural language for effects, styles, and specifics about the scenery, the character tends to get lost in a sort of wide-shot scenery, and anima tends to ignore any close-up prompts. With the aesthetic quality modifier this gets resolved, and I can keep my natural language descriptions and get the image composition I want. So I assume your previous dataset has lots of closer, character-focused shots. If possible, keep this bias!
@deitychaser Thank you, I've released a version for anima preview 2 that uses an only slightly updated dataset now