CivArchive
    Akanezora - Akanezora v0.5:A
    NSFW

    Akanezora-Anima


    Akanezora is a full Anima DiT fine-tune trained entirely on a single RTX 3060 with 12GB of VRAM using Aozora. It is released both as a usable model and as proof that full Anima DiT fine-tuning can be done on consumer hardware.

    If you're interested in fine-tuning your own model, the training code is available here: [Aozora_SDXL_Trainer]


    Branch A vs B

    Branch A: follows the recommended Anima tuning setup with bell-weighted loss and non-uniform timestep sampling.
    Branch B: uses my experimental setup with uniform timestep sampling and uniform loss weighting, giving the model more even exposure across the full noise range. In my testing, this improves visual style and composition, but is slower and harder to tune.
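
    As a rough illustration of the difference between the two branches, the sketch below contrasts uniform timestep sampling with a non-uniform, centre-heavy sampler, and a flat loss weight with a bell-shaped one. The logit-normal sampler, the Gaussian bell, and every function name and parameter here are assumptions chosen for illustration; they are not taken from the Aozora trainer or from the Anima tuning recommendations.

```python
import math
import random

def sample_uniform(rng):
    """Branch B style: every noise level in [0, 1) is equally likely."""
    return rng.random()

def sample_logit_normal(rng, mean=0.0, std=1.0):
    """Non-uniform sampling (assumed form): sigmoid of a Gaussian draw,
    which concentrates samples around the middle of the noise range."""
    return 1.0 / (1.0 + math.exp(-rng.gauss(mean, std)))

def bell_weight(t, center=0.5, width=0.25):
    """Bell-shaped loss weight (assumed Gaussian form), peaking at `center`."""
    return math.exp(-0.5 * ((t - center) / width) ** 2)

def uniform_weight(t):
    """Branch B style: every timestep contributes equally to the loss."""
    return 1.0

def fraction_in_middle(sampler, rng, n=20000, lo=0.25, hi=0.75):
    """Share of sampled timesteps that land in the middle half of the range."""
    hits = sum(1 for _ in range(n) if lo <= sampler(rng) <= hi)
    return hits / n

rng = random.Random(0)
mid_uniform = fraction_in_middle(sample_uniform, rng)
mid_logit = fraction_in_middle(sample_logit_normal, rng)
print(f"uniform sampler:      {mid_uniform:.2f} of draws in the middle half")
print(f"logit-normal sampler: {mid_logit:.2f} of draws in the middle half")
```

    Under these assumptions, the uniform sampler spends about as much time at the noisy and clean ends of the range as in the middle, which is the "more even exposure" described for Branch B.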

    Version 0.55b Preview

    This is a 55% training checkpoint continued from the 0.5a checkpoint, trained on 15k images using experimental Branch B settings. The dataset consists of 50% Danbooru-tagged images with generated natural language and 50% hand-tagged / natural language-captioned images. During training, text conditioning was split into 90% tags and 10% natural language prompts.
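
    The 90% / 10% conditioning split can be pictured as a per-step choice between two captions stored for each image. This is a minimal sketch, assuming each sample carries both caption types; the field names `tags` and `natural_language` and the helper `pick_caption` are hypothetical and do not come from the Aozora trainer.

```python
import random

TAG_PROBABILITY = 0.9  # from the model card: 90% tags, 10% natural language

def pick_caption(sample, rng):
    """Return the tag caption with probability 0.9, else the natural-language one."""
    if rng.random() < TAG_PROBABILITY:
        return sample["tags"]
    return sample["natural_language"]

rng = random.Random(0)
sample = {
    "tags": "1girl, solo, sunset, outdoors",                    # illustrative only
    "natural_language": "A girl standing outdoors at sunset.",  # illustrative only
}
draws = [pick_caption(sample, rng) for _ in range(10000)]
tag_share = draws.count(sample["tags"]) / len(draws)
print(f"tag captions used: {tag_share:.1%}")
```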

    While this release is functional, it should be considered a work in progress rather than a finished product. It is being shared early to demonstrate the viability of the training method and to showcase the Aozora trainer’s ability to fine-tune Anima DiT models on low-VRAM hardware.

    Pros:

    • Cuts unwanted text generation by around 70%, so heavy negative prompts are needed less often.

    • More responsive to Danbooru-style tags.

    • Slightly more dynamic seed variation due to soft conditioning.

    • More SDXL-inclined generation style.

    • Better overall composition and prompt feel across varied prompts.

    • Improved NSFW output quality.

    Cons:

    • Some seeds may still closely resemble the base model.

    • Lighting effects often need to be prompted directly.

    • Still early in training and may hallucinate content.

    0.55b Training Settings

    Base Model: Akanezora V0.5a
    Training Hours: Unknown (a power outage made the run take about 2x longer; estimated at around 50 hours)
    GPU Used: NVIDIA GeForce RTX 3060 (12 GB) | Driver version: 32.0.15.9636

    VRAM Usage: ~11.4GB

    Mixed Precision: bfloat16

    Batch Size: 1

    Gradient Accumulation: 4

    Learning Rate: 6e-6
    Timestep: Uniform
    Loss: Uniform

    Optimizer: Raven [AdamW float32 variant with offloading] | (betas: 0.9, 0.999 | eps: 1e-08 | Weight Decay: 0.01 | Debias: 1.0)

    Max Train Steps: 201010 (Completed: 115282)

    Current Checkpoint: ~55% through planned training

    Trainable Parameters: - (P: 1,956,405,248 | P Frozen: 6.44% [llm_adapter.*])

    Soft text cond: (0.75 - 1.25)

    Dataset Size: 15164
    Training Resolution: 1152x1152 (Aspect Ratio Bucketed: 864x1536 to 1536x864)

    VRAM saving techniques: (Momentum offloading, bf16 mixed precision, pre-caching VAE and text encoders, Gradient Checkpointing)
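
    The bucket range above is consistent with a fixed pixel budget: 864 x 1536 and 1536 x 864 both have the same area as 1152 x 1152. The sketch below builds such buckets and assigns an image to the one with the closest aspect ratio. The 32-pixel step and the helper names are assumptions for illustration; the trainer's actual bucket list is not given in this card.

```python
import math

TARGET_AREA = 1152 * 1152        # from the model card
MIN_SIDE, MAX_SIDE = 864, 1536   # from the model card
STEP = 32                        # assumed bucket granularity, not stated in the card

def make_buckets():
    """List (width, height) buckets whose area stays at or below TARGET_AREA."""
    buckets = []
    for width in range(MIN_SIDE, MAX_SIDE + 1, STEP):
        # Largest height on the STEP grid that keeps the area within budget.
        height = (TARGET_AREA // width) // STEP * STEP
        if MIN_SIDE <= height <= MAX_SIDE:
            buckets.append((width, height))
    return buckets

def nearest_bucket(width, height, buckets):
    """Pick the bucket whose aspect ratio is closest on a log scale."""
    target = math.log(width / height)
    return min(buckets, key=lambda b: abs(math.log(b[0] / b[1]) - target))

buckets = make_buckets()
print(nearest_bucket(1920, 1080, buckets))  # landscape 16:9 source image
print(nearest_bucket(1080, 1920, buckets))  # portrait 9:16 source image
print(nearest_bucket(1000, 1000, buckets))  # square source image
```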


    v0.50 settings: bf16 mixed precision, batch 1, grad accum 4, LR 5e-6, Raven AdamW offload optimizer, wave timestep schedule, soft text conditioning 0.75–1.25, 1152 bucketed training from 864x1536 to 1536x864, VAE/text encoder pre-cache, gradient checkpointing, and momentum offloading.
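
    The card does not define "soft text conditioning", so the following is only one plausible reading: the text embedding is multiplied by a random factor drawn from the stated 0.75 to 1.25 range at each training step. The function name, the uniform distribution, and the per-sample scalar scale are all assumptions and may not match what the Aozora trainer does.

```python
import random

SOFT_COND_MIN, SOFT_COND_MAX = 0.75, 1.25  # range from the model card

def soften_text_embedding(embedding, rng):
    """Scale a text embedding by a random factor (assumed interpretation)."""
    scale = rng.uniform(SOFT_COND_MIN, SOFT_COND_MAX)
    return [value * scale for value in embedding], scale

rng = random.Random(0)
embedding = [0.2, -0.5, 1.0]  # stand-in for a real text-encoder output
softened, scale = soften_text_embedding(embedding, rng)
print(f"scale={scale:.3f}", softened)
```

    If the technique works roughly like this, the model sees the same prompt at slightly different conditioning strengths, which would fit the card's note about more dynamic seed variation.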

    Generation Settings

    Sampler: ER_SDE
    Scheduler: Beta

    Steps: 15-50

    CFG: 3-5

    Negative Prompt: worst quality, low quality, lowres, score_1, score_2, score_3, blurry, jpeg artifacts
    Note: Use qwen_3_06b_base.safetensors as the text encoder and qwen_image_vae.safetensors as the VAE.


    Model Transparency Notice

    For transparency, this release includes the training setup and links back to the open-source trainer/code used to create it. This is a full fine-tune checkpoint with no LoRA, LoKR, LyCORIS, or model merge operations applied.

    This model:

    • Started training from unmodified base weights

    • Had zero merge operations applied

    • Never used LoRA adapters at any stage

    • Was trained end to end, with no sublayer freezing besides the required llm_adapter

    Notes

    Feedback is welcome, especially on prompt following, anatomy, hands, style consistency, repeated patterns, overfitting, and behavior without heavy negative prompts.

    License

    This model follows the license of its base model, Anima. Review and comply with the base model terms before using or redistributing.

    Comments (2)

    mac2492
    Jun 3, 2026· 1 reaction
    CivitAI

    I finally made a workflow to compare models and this was one of the few that stood out after testing all merges by top creators + full finetunes for Anima base. While the results weren't mindblowing, it hit all my test cases (base) without overfitting which is what I personally look for in a finetune. Will definitely be keeping an eye on the full version!

    Hysocs
    Author
    Jun 3, 2026· 1 reaction

    Thanks! I try to keep my fine-tunes on the lighter side. I personally prefer to avoid the overfitting that can occur when trying to force a model to constantly output a specific style.

    Training is halted, so instead of 1.0 I will likely release a 0.65 mid version, as I'm cutting the 36k-something images down to 20k. I'm hoping this removes some of the bleed I'm having on some NSFW outputs.

    Checkpoint
    Anima

    Details

    Downloads
    245
    Platform
    CivitAI
    Platform Status
    Available
    Created
    5/30/2026
    Updated
    6/11/2026
    Deleted
    -

    Files

    qwen_image_vae.safetensors

    Mirrors

    HuggingFace (126 mirrors)
    ModelScope (1 mirror)

    akanezora_V05A.safetensors

    Mirrors

    akanezora_V05A_txt.safetensors

    Mirrors

    HuggingFace (58 mirrors)