Intro
I made it just for fun and as an experiment in building a model suited to augmenting professional photographs. I shoot with a Nikon camera and a bunch of vintage lenses. I hope to build an SD model that can produce moody, cinematic pictures with nice smooth bokeh and an "analog style". Please note that I don't plan to train this model on any hardcore NSFW content. Don't expect or request it from "cinero" models ;) My preference is art, beauty and emotions.
Some tips on Prompting
A few examples:
"[grayscale : [dimmed colors : vibrant color splashes : 16] : 8]" - I call it the "temporal trick". It makes your prompt depend on the current sampling step. With this prompt SD uses "grayscale" for roughly the first 8 steps, "dimmed colors" until about step 16, and "vibrant color splashes" on all later steps. I believe there is no strict limit on the nesting level. What can you do with it? You can effectively reduce the number of tokens SD processes on each step (that is, shorten the active prompt). On the first steps there is no sense in specifying fine details; you only need to describe the scenery roughly. On the later steps there is (I suspect) no sense in spending tokens on composition and lighting. So, in theory, with this trick and a large number of steps you can keep the active prompt short and still build a very rich prompt overall.
PS: the prompt above forces SD to draw the scene with very little color plus a few super vibrant segments (I have shown grayscale images where the subject has a few vibrant hair curls or pieces of clothing). You can probably reverse the effect by making the whole picture colored with some parts in grayscale.
"[Audrey Hepburn : Milla Jovovich : 16]" - you can have fun with a smooth transition from one face to another using the XYZ plot script in Automatic1111. This particular temporal trick with face / body also helps my model render the most realistic and correct anatomy. I suspect you can also implement dynamic LoRA weighting with this trick. If a LoRA doesn't have a trigger word, you can just put the LoRA token in, like "[ <lora: ...:0.42> : <lora: ...:0.99> : 16]", or you can use multiple levels of nested trigger words from different LoRAs.
"shot on %Brand Name% %Lens Mark Name% vintage lens" - if you find vintage lens names that SD has in its memory, you have a chance to improve the "analog style" of your picture. I have used "Carl Zeiss Sonar", "Nokton" and "Helios 44-2", but I cannot confirm that each particular lens model gives a unique effect. If you have your own list of confirmed lens models, please consider sharing it with the community in the comments to this model [%PICTURE OF LEELOO saying HELP%]
In the near future I plan to build a training dataset with many images shot on beautiful vintage lenses to bring the soul of old-school photography into this model. I will use either some unique trigger word for that or simply "vintage lens" (not sure yet).
Use "perfect anatomy", "anatomically correct body", "anatomically correct hand", "perfect hands", "anatomically correct fingers", "perfect limbs anatomy" and similar anatomical phrases to increase the chance of getting correct anatomy.
Use the words "smooth bokeh", "swirly bokeh", "depth of field" and "smooth background" to increase the separation between the main subject and the scenery.
Use "turbulent fog", "mist" and "haze" with "mystical lighting" to get a nice atmospheric picture with a very noticeable sense of depth. Also use the phrases "early morning" and "blue hour" if you want cold morning vibes.
Use "scary face expression", "surprised expression", "inviting expression", "lustful face", etc. to increase the chance of getting noticeable emotions on the face and visible "body language". It works, but the effect is not very noticeable yet.
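As a rough illustration of the "temporal trick" described above, the sketch below resolves a nested "[from : to : when]" prompt into the text that would be active at a given step. It is a simplified model and not the web UI's actual parser: the function names are made up for this example, only integer "when" values are handled, and it assumes the switch happens after step "when" (the real boundary may differ by one step).

```python
def split_top_level(body):
    """Split on ':' characters that are not inside nested [...] brackets."""
    parts, depth, current = [], 0, []
    for ch in body:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == ":" and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def active_prompt(prompt, step):
    """Return the prompt text active at `step` (1-based, assumed convention)."""
    out, i = [], 0
    while i < len(prompt):
        if prompt[i] != "[":
            out.append(prompt[i])
            i += 1
            continue
        # Find the bracket that closes the one opened at position i.
        depth, j = 0, i
        while j < len(prompt):
            if prompt[j] == "[":
                depth += 1
            elif prompt[j] == "]":
                depth -= 1
                if depth == 0:
                    break
            j += 1
        parts = split_top_level(prompt[i + 1:j])
        if len(parts) == 3 and parts[2].strip().isdigit():
            before, after, when = parts
            chosen = before if step <= int(when) else after
            # Recurse so that nested switches are resolved as well.
            out.append(active_prompt(chosen.strip(), step))
        else:
            # Not a [from : to : when] group: keep it unchanged.
            out.append(prompt[i:j + 1])
        i = j + 1
    return "".join(out)


PROMPT = "[grayscale : [dimmed colors : vibrant color splashes : 16] : 8]"
for step in (5, 12, 20):
    print(step, active_prompt(PROMPT, step))
```

Printing the active text for a few steps like this is a quick way to check that a deeply nested prompt switches where you expect before spending time on a full render.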
Priorities of this model
Cinematic photo-realistic pictures of female characters (sfw, softcore nsfw)
Natural body, skin texture, [to be improved] environment (dirt, dust, stuff on floor, retro furniture and devices)
Realistic optical / photo effects (smooth swirly bokeh, analog film grain, aberrations [in progress]) of vintage lenses (Carl Zeiss Sonar, Jupiter 37a, Helios 44-2)
[To be improved] Urbex, abandoned, decaying interiors, depressive vibes, dimmed colors, fog, mist, vapor
How it was created
It is based on a few merges of Analog Madness, URPM, Cyber Realistic, epiCRealism, ICBINP and Cine Diffusion with coefficients in the range 0.18 to 0.35.
It was trained on two datasets of carefully selected art photos with similar features (cinematic mood, atmospheric, charming anatomy, soft core / ero, retro interiors, morning outdoors, etc.). Total number of images in the datasets: 600-700.
It was trained as a LoRA with 20 steps per image using Kohya_SS, then merged with a coefficient of ~0.3 into the merge of the checkpoints mentioned above. It works better together with my LoRA of the same name, which amplifies the effect.
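To make the two kinds of merging mentioned above more concrete, here is a toy sketch in pure Python. It assumes the checkpoint merges were plain weighted sums and that the LoRA was folded in as a scaled low-rank update; the source does not state the exact merge method, so treat both as assumptions. The function names and the tiny matrices are invented for illustration, and the alpha / rank scaling stored in real LoRA files is ignored.

```python
def weighted_sum(a, b, alpha):
    """Blend two toy checkpoints key by key: (1 - alpha) * A + alpha * B."""
    return {
        key: [(1 - alpha) * x + alpha * y for x, y in zip(a[key], b[key])]
        for key in a
    }


def matmul(x, y):
    """Multiply two small matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


def merge_lora(weight, up, down, alpha):
    """Fold a low-rank update into a weight matrix: W + alpha * (up @ down)."""
    delta = matmul(up, down)
    return [
        [w + alpha * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy "checkpoints": one flat weight list per key.
model_a = {"w": [1.0, 2.0]}
model_b = {"w": [3.0, 6.0]}
blended = weighted_sum(model_a, model_b, alpha=0.25)

# Toy LoRA: a rank-1 update merged at strength 0.3.
base = [[1.0, 0.0], [0.0, 1.0]]
lora_up = [[1.0], [2.0]]
lora_down = [[0.5, 0.5]]
merged = merge_lora(base, lora_up, lora_down, alpha=0.3)
print(blended, merged)
```

With small coefficients such as 0.18 to 0.35, each merge only nudges the base weights, which is one reason several merge passes can be stacked without any single source model dominating.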
Further improvements
By priority:
[done] Fix / Improve hand and fingers generation
[in progress] Improve gloom, bokeh, chromatic aberrations, spherical aberrations, light leaks and old analog film features
Fix / Improve feet and toes generation
[in progress] Add more urbex, abandoned, vandalized interiors and lost / forgotten outdoor scenery (please suggest good datasets ;)
Fine-tuning / improvement of eyes and anatomy
Feedback appreciated...
Description
A version focused on hand improvements.
Many stages of training and merging were used to make hands and fingers more stable.
Tested with CFG 3.3 or less. Higher CFG values can produce excessive contrast and exposure. High CFG can also produce anatomy mutations and other structural instabilities (if you don't use ControlNet).
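One plausible way to read the CFG advice above is through the standard classifier-free guidance formula, where the final noise prediction is the unconditional prediction plus the CFG scale times the difference between the conditional and unconditional predictions. The sketch below applies that formula to made-up numbers (the lists stand in for real noise tensors, and the function name is invented) to show how quickly the result moves away from both predictions as the scale grows.

```python
def guided_noise(uncond, cond, cfg_scale):
    """Classifier-free guidance: uncond + cfg_scale * (cond - uncond)."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


# Made-up stand-ins for the two noise predictions at one sampling step.
uncond = [0.0, 1.0]
cond = [1.0, 1.5]

for scale in (1.0, 3.0, 7.0):
    print(scale, guided_noise(uncond, cond, scale))
```

At scale 1 the result equals the conditional prediction; larger scales extrapolate past it, which fits the observation that high CFG values tend to overshoot in contrast and exposure.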
Comments (13)
A Stable Diffusion v1.5 model from someone who understands photography. No, I can't wait, I will download it now.
BOKEH, photolover!!! DO U LIKE IT as I do???
[by the voice of Samuel L. Jackson] :D
Unfortunately, this version didn't convince me. Apparently it only works with ComfyUI. I used A1111.
Owww... I didn't test it with A1111, but I used it with A1111 conditioning mode. I thought it would work OK in the A1111 webui.
Can you tell me what goes wrong with this model in A1111?
I will try to fix it (fix the model or suggest some prompt changes).
Could you please add more sci-fi and fantasy?
In general, I can... if I find the resources (time) for that.
It would help if you gave me links to some models posted on CivitAI or some example images (links to Pinterest or CivitAI) so I can estimate the effort required.
@homoludens I mean something like this: https://ru.pinterest.com/pin/167618417376088800/ It's obviously an unreal scene, but it looks like a photo. And let this be a starting point for magic: https://ru.pinterest.com/pin/7459155627771903/
@homoludens if I can add more wishes, I'd like to ask you to teach the model "a European woman wearing a Cheongsam" https://ru.pinterest.com/search/pins/?q=cheongsam and, close to it, "a European woman wearing an AoDai" https://ru.pinterest.com/search/pins/?q=aodai Obviously they aren't European women :-(
@Olbanets Lovely examples! Will think about it.
One more question... Would you consider an SDXL version with these additions, or SD1.5 only?
I'm asking because I invested a lot of effort into fixing the hands and anatomy in CinEro XL, and it will take more time to improve the 1.5 version before adding these dress options.
PS: in case you don't know, it is possible to use SDXL with an NVidia 1050 Ti (4G VRAM) in ComfyUI; additionally, the latest NVidia drivers for Windows introduced the "meta-device", which automatically uses system RAM if VRAM is too small.
@homoludens XL is too slow for my 3060. It works but I don't feel OK :-(
@Olbanets noted. I'm also using it with a 3060 12G. I will try to get myself into SD1.5 after this LoRA is completed.
@homoludens I collected a small set for the dresses mentioned above when I tried to train a model on my own. I can share it with you if you tell me how.
@Olbanets Now I can think about sci-fi and dresses... Maybe a LoRA would be more practical than altering the whole CKPT?
SD1.5 has low capacity, and I don't want the sci-fi to reduce the quality of other aspects.