Follow me on Patreon!
SoReal! - POV
Overview
Reach your hands out for the stars! This model is the first in a series of Z-Image LORAs - aiming to bring diversity in both concepts and humanity itself to Z-Image.
Compatibility & Usage
Due to its small size and rank, the model should have minimal influence on the base model, which improves compatibility with other LORAs across Base/Turbo and other checkpoints.
'Trigger words' aren't real - don't ask for one, just prompt normally. If you want a hand (literally), use 'a man's hand' or 'a woman's hand', which should normally get you what you want.
I'll upload a full concept list soon to show the range of concepts the model has been trained on - not confirming that the model is able to reproduce them, though.
When using Z-Image Turbo, strengths between 0.95 and 1.5 work best in my experience for V1, and 0.9 - 1.2 for V2.
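If you set strengths from a script, a tiny helper can keep them inside the ranges quoted above. This is only a sketch: the `RECOMMENDED_STRENGTH` table and `clamp_strength` function are hypothetical names of my own, not part of any Z-Image or LORA loader API, and the ranges are just the ones from my own experience with Z-Image Turbo.

```python
# Recommended strength ranges for Z-Image Turbo, copied from the text above.
# The dictionary and function names are illustrative, not a real loader API.
RECOMMENDED_STRENGTH = {
    "V1": (0.95, 1.5),
    "V2": (0.9, 1.2),
}


def clamp_strength(version: str, strength: float) -> float:
    """Clamp a requested LORA strength into the recommended range for a version."""
    low, high = RECOMMENDED_STRENGTH[version]
    return max(low, min(high, strength))
```

For example, `clamp_strength("V2", 2.0)` would pull the value back down to the top of the V2 range.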
Limitations
Anatomy is still rough - I'm planning one further training run to try to address this for NSFW concepts, but it may mean a split between a generalisation model (V2) and an NSFW-specialised model (V2-NSFW).
Future
Future iterations of this model will see improved prompt adherence, anatomy, and general composition and quality through +/- reinforcement learning.
I am planning on finetuning Z-Image considerably with a model called 'SoReal!' (or, alternatively, ZoReal!). However, I want it to be the best amateur finetune possible. To achieve this, I have:
1. Trained a custom quality model.
2. Trained a custom one-shot demographic model (height, weight, skin tone, ethnicity, age in years, body shape) with an average accuracy of 89% for top-confidence prediction using ConvNeXt-XL.
3. Finetuned wd-tagger-large-v3 on a large sample dataset of 50k hand-tagged images with human-assisted active learning.
4. Fed those tagged images (with quality, demographics and general labels) with the image metadata (incl. EXIF & Camera Metadata) to Gemini 3 Flash for generating captions.
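To give a rough idea of step 4, here is a minimal sketch of how the labels and metadata for one image could be folded into a text prompt for a captioning model. Everything in it is an assumption for illustration: the `ImageRecord` fields, the prompt wording and the `build_caption_prompt` name are hypothetical, and the actual call to Gemini is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    """Hypothetical container for one image's labels and metadata."""
    quality: float                 # score from the custom quality model (scale assumed)
    demographics: dict             # e.g. {"age": 30, "body shape": "athletic"}
    tags: list                     # general tags from the finetuned tagger
    exif: dict = field(default_factory=dict)  # EXIF / camera metadata, may be empty


def build_caption_prompt(record: ImageRecord) -> str:
    """Assemble a plain-text prompt to send alongside the image for captioning."""
    lines = [
        "Write a natural-language caption for the attached image.",
        f"Quality score: {record.quality:.2f}",
        "Demographics: " + ", ".join(f"{k}={v}" for k, v in record.demographics.items()),
        "Tags: " + ", ".join(record.tags),
    ]
    if record.exif:
        lines.append("Camera metadata: " + ", ".join(f"{k}={v}" for k, v in record.exif.items()))
    return "\n".join(lines)
```

The point of passing structured labels in is that the captioning model describes what the taggers already agree on, rather than guessing from pixels alone.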
No over-trained LORAs baked in, no dramatic loss of generalisation, just a good, all-round, NSFW-ready, finetuned model.
I am now severely limited, however, by my compute and financial situation, so if you'd like to help make SoReal!, well, so real, then you can follow me on Patreon!
Dataset & Training
A dataset of 2500 images was gathered from a variety of sources. Deduplication and quality scoring (through MANIQA) lowered the dataset to around 1400.
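The deduplication and quality filtering can be sketched roughly like this. It is a simplified illustration, not the exact pipeline: it only catches byte-identical duplicates via SHA-256 (near-duplicates would need something like perceptual hashing), it assumes the MANIQA scores have already been computed and stored per file, and the `threshold` value is a hypothetical parameter.

```python
import hashlib


def deduplicate(images: dict) -> dict:
    """Keep the first file seen for each distinct byte content (exact duplicates only)."""
    seen = set()
    unique = {}
    for name, data in images.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique[name] = data
    return unique


def filter_by_quality(scores: dict, threshold: float) -> list:
    """Return the names whose precomputed quality score meets the threshold."""
    return [name for name, score in scores.items() if score >= threshold]
```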
This model was trained on a dataset of 1500 images at a batch size minimum of 10. Masked loss was implemented after roughly 40,000 samples (not steps) to improve anatomy & concept adherence.
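Two things in that paragraph are easy to mix up, so here is a small sketch. First, samples are not steps: at a batch size of 10, 40,000 samples is 4,000 optimizer steps, and fewer if the batch size goes above the minimum. Second, a masked loss only counts error where the mask is switched on. The functions below are illustrative assumptions in plain Python (what the mask covers and how the loss is normalised in the real training run are not shown here).

```python
import math


def samples_to_steps(samples: int, batch_size: int) -> int:
    """Number of optimizer steps needed to see `samples` examples."""
    return math.ceil(samples / batch_size)


def masked_mse(pred, target, mask):
    """Mean squared error over the masked-in elements only (mask weights in [0, 1])."""
    weighted = sum(m * (p - t) ** 2 for p, t, m in zip(pred, target, mask))
    total = sum(mask)
    return weighted / total if total else 0.0
```

With a mask of `[1, 1, 0]`, the third element contributes nothing to the loss, which is how masking lets training focus on the regions that matter.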
Validation loss was tracked on a held-out 10% of the dataset to prevent overfitting while still maintaining strong concept adherence and generalisation.
The model was trained with AdamW through the Python adv-optm package.
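As a rough sketch of the idea, the snippet below holds out 10% of a dataset and flags when validation loss has stopped improving. The patience-based stopping rule is an assumption on my part for illustration (one common way to act on validation loss), and the names `split_dataset` and `should_stop` are hypothetical.

```python
import random


def split_dataset(items, val_fraction=0.1, seed=0):
    """Shuffle deterministically and return (train, validation) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def should_stop(val_losses, patience=3):
    """True when the best validation loss is at least `patience` evaluations old."""
    if len(val_losses) <= patience:
        return False
    best_index = val_losses.index(min(val_losses))
    return len(val_losses) - 1 - best_index >= patience
```

For a 1500-image dataset this gives 150 validation images and 1350 training images.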
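For anyone curious what AdamW does differently from plain Adam, here is a single update step for one scalar parameter in plain Python. This is a textbook-style sketch, not the adv-optm implementation, and the default hyperparameters are common values rather than the ones used for this model. The key detail is that weight decay is applied directly to the parameter instead of being mixed into the gradient.

```python
import math


def adamw_step(param, grad, m, v, step,
               lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=0.01):
    """One AdamW update for a scalar parameter; `step` counts from 1."""
    # Decoupled weight decay: shrink the parameter directly.
    param = param * (1.0 - lr * weight_decay)
    # Exponential moving averages of the gradient and squared gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction for the zero-initialised averages.
    m_hat = m / (1 - beta1 ** step)
    v_hat = v / (1 - beta2 ** step)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v
```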
Licensing
If you'd like to release a merge of this model, please contact me.
Made with <3 By BitcrushedHeart
Description
SoReal! POV - V3
Active reinforcement training to improve the general quality of output.
Improved anatomy for males and females, including genitalia and breasts.
Increase in prompt adherence.