Zonkey V7 - Download at your own risk edition.
Trained another slider on base Pony with 1.5MP buckets instead of 1.0MP buckets and applied it to Zonkey 6. Also applied NTC's Not Simple Background LoRA.
Model likes low CFG.
I found that using score tags in an initial prompt, and performing a restart of most steps using a prompt without them, can produce some nice results, in exchange for slightly less sharpness of detail. Examples in the images with embedded ComfyUI workflows.
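The two-prompt restart described above needs a second copy of the prompt with the score tags removed. A minimal sketch of that step, assuming the score tags appear as plain comma-separated entries such as score_9 or score_8_up (the helper name and the example prompt are hypothetical, and weighted forms like (score_9:1.2) are not handled):

```python
import re

# Matches Pony-style quality tags such as "score_9" or "score_8_up".
SCORE_TAG = re.compile(r"score_\d+(?:_up)?")


def strip_score_tags(prompt: str) -> str:
    """Return the prompt with plain score tags removed (hypothetical helper)."""
    parts = [p.strip() for p in prompt.split(",")]
    kept = [p for p in parts if p and not SCORE_TAG.fullmatch(p)]
    return ", ".join(kept)


# Example prompt is made up for illustration.
first_pass = "score_8_up, score_7_up, photo, woman on a beach, golden hour"
restart_pass = strip_score_tags(first_pass)
print(restart_pass)  # photo, woman on a beach, golden hour
```

The first string would go to the initial sampler pass and the second to the restart pass; how the two passes are wired together depends on the workflow embedded in the example images.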
Zonkey V6.2 - Hyper 8 step edition
Incorporated the ByteDance Hyper 8 Step LoRA, and NTC's Not Simple Background LoRA, for better backgrounds. I recommend Euler a for first pass, with a CFG of 1.25-1.5, and DPM2++ SDE, or similar for upscaling. Upscaling is recommended. For 2 character scenes, 12 steps can help sort out character details but 8 steps is generally fine for 1 character/pov scenes.
Zonkey V6 - Save the day edition - Now with 50% less sameface
Decided to be less lazy and do some training with v6. I injected 25% of Pony back into Zonkey v5. I inpainted the faces on 400 photos, 200 close and 200 far, with this model, and used the resulting pairs to make 2 slider LoRAs to make things look less Pony-ish. The LoRAs were then merged in at different layers.
Zonkey V5 - Baked from scratch edition
Started with all the ingredients of Zonkey V3 and DARE merged them all in random order at 6% each for 50 iterations. Added 12.5% halcyonSDXL_v17 and 5% bemypony_real. Back to original Pony CLIP, so it should train LoRAs better than v4. Brought the brightness back up, for a more neutral style, and reduced the noise from 4.2.
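For readers unfamiliar with DARE (drop and rescale), here is a rough sketch of the idea on flat lists of numbers. It is not the actual recipe: it assumes "6%" is the merge ratio, takes the difference against the current merge, and uses a made-up drop rate of 0.5 with toy weights. The real merge was done with ComfyUI nodes on full checkpoints.

```python
import random


def dare_merge(current, donor, weight, drop_rate, rng):
    """Move `current` toward `donor` using a randomly sparsified difference.

    Each difference is dropped with probability `drop_rate`; the survivors
    are rescaled by 1 / (1 - drop_rate) so the expected change is unchanged.
    """
    keep = 1.0 - drop_rate
    merged = []
    for c, d in zip(current, donor):
        if rng.random() < keep:
            merged.append(c + weight * (d - c) / keep)
        else:
            merged.append(c)
    return merged


rng = random.Random(0)
base = [0.0, 1.0, -1.0, 0.5]  # toy stand-in for the starting weights
donors = {                     # toy stand-ins for the ingredient models
    "ingredient_a": [0.2, 0.8, -1.1, 0.7],
    "ingredient_b": [-0.1, 1.3, -0.9, 0.4],
}

merged = list(base)
for _ in range(50):            # 50 iterations, as in the description
    names = list(donors)
    rng.shuffle(names)         # random order on every iteration
    for name in names:
        merged = dare_merge(merged, donors[name], 0.06, 0.5, rng)
```

Because each pass only touches a random subset of weights, repeating it many times at a small ratio spreads every ingredient's influence across the whole model.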
I recommend either not using score tags with this version, running them for an early fraction of steps, or placing them near the end of the prompt. They can help with composition, but they nudge the image towards anime style. 1girl has a similar effect, so leave it out if you don't need it. The hashed token negative I used with previous versions tends to break things now, so I've stopped using it as well.
Zonkey V4.2 - Double Check Your Work edition
The extra noisiness of 4.0 was due to an incorrect setting on one of my CLIP DARE merge nodes. Fixed that, and snuck in a little extra oneFORALLPonyFantasy_v20DPO into the CLIP. 4.2 improves clarity, and a standard number of steps can be used again. All example images(except for the first one) use the same generation data as the 4.0 images, except using 30 steps for both first pass and Hires fix, instead of 40-100. They aren't cherry picked, so there are a few glitches I normally would have left out, but I wanted to show a direct comparison.
Zonkey V4 - Beat to the Punch edition.
Was gonna make a model called Godiva, but the name got scooped. Oh well, this model kicks ass, whatever it's called. Very little character accuracy was sacrificed for the photorealism. The CLIP is now modified. I replaced ~50% of the original Pony CLIP with oneFORALLPonyFantasy_v20DPO's CLIP and another 25% with proteusRundiffusionDPO_truereversecubich. Added ~40% proteusRundiffusionDPO_truereversecubich, ~10% datassRev3Pony_rev3, and ~5% damnPonyxlRealistic_damnV20EXTREME to the UNET, V Gradient merged it back with Zonkey V3, and put ~25% enjoyXLSuperRealistic_v30ModifiedVersion in the out layer.
This model is stuffed with creativity. It can be a little noisy; if it's too much, use Euler a for the Hi-Res fix instead of a DPM++ sampler. I think I found a good balance between noise and creativity. It likes lots of steps: I'd suggest at least 30-40, plus Hi-Res fix. Overall, the anatomy is more reliable than in any previous version, so you shouldn't have as many wasted generations if you give it those extra steps.
Enjoy and get weird with it. I like seeing all the things that get made with it, whatever it is, so don't be shy about posting.
Zonkey V3 - Mad Surgery edition.
Zonkey V2 got a DARE injection of a CosXL merge of Copax Timeless, Rundiffusion Proteus, Art Universe, Realistic Stock Photo, and People Photography, add difference, paired with DARE removal of Jibmix, AnimeBoys, ToonSphere3D and Animagine. Then it was all DARE merged back into original PonyDiffusion V6 XL.
V3 is more versatile than previous versions, capable of achieving both higher photorealism, and more toon-like styles, but it has to be prompted for. Using the terms "real", or "real life" near the beginning of the prompt, without score tags, can help achieve high levels of photorealism, with better faces, especially when weighted fairly heavily. Using score tags can improve overall aesthetics, but will reduce photorealism, and give faces more anime/cartoon-like shape.
Along with improvements to contrast and color range, Zonkey V3 includes the Blessed VAE to amplify the effect.
Zonkey V2 - Big Hairy Juggs edition.
New Masked DARE injection of Juggernaut X, SDXXXL, Bordello and Ratatoskr. U-Net renormalization relative to PonyDiffusionV6 XL.
Improved photorealism
Better lighting
Sharper detail
Higher background variety and detail
More vivid colors
Fewer artifacts
Better anatomy
Higher character accuracy
Furrier furries
Zonkey was created with the goal of bringing as much photorealism as possible to a Pony model, while attempting to retain its flexibility and prompting power. Like other Pony realism models, it can have problems with eyes when faces are relatively small, but Hires fix often improves them.
All posted images are DPM++ SDE Exponential, 30 steps, CFG 3.5-5.0, Hires Fix Latent(bicubic antialiased), 1.5 or 2x. It can be easier to get better poses with Euler a and DPM++ 2S a, but the detail isn't as high. No ADetailer was used, but it may be helpful in some cases. See posted images for suggested prompting style.
The following checkpoints were used:
Animagine XL v3.1
Art Universe SDXL v2.0
Bordello v1.6
CinematicRedmond v1.0
ChacolRealPonyMixXL (Asian Version) v3.0a
Copax TimeLessXL v11
FULLY_REAL_XL v9.0
Juggernaut XL v8
Pony Diffusion V6 XL
Ratatoskr v3.8
Realistic Freedom Wonderland
RealVisXL v4.0
RunBull_XL v0.4
RsmPornXL v0.81
SDXXXL v3.0
Virile XL v1.0
yudas_woman v3
ZavyChromaXL v6.0
Along with the following Loras:
BoringReality_faces v4
Porn Productivity(multi-concepts) PP-21 v1
RMSDXL Photo XL v1.0
SDXL Offset Example Lora v1.0
Styles for Pony Diffusion V6 XL Photo v2
The Handsomizer v1.0(Pony Diffusion V6)
Comments (23)
It looks great! Can't wait to try it out. I am noticing your prompts, especially the neg prompts. I have no idea what those mean and when do they need to be modified/changed. Could you go over what those are, and what should be the optimal prompt for this model?
Same question here.
Those three-letter tags are hashed tokens that produce nonsense images and bad CGI. In the positive prompt of Zonkey, in aggregate, they will make a flat-shaded anthro lizard on a simple background, so don't use them there unless that's what you're going for. https://rentry.org/ponyxl_loras_n_stuff#reverse-engineered-hashed-token
I didn't put together that set of them myself, I copied that part of the negative from some images in one of the other Pony realism mixes/loras, though I'm having trouble finding the source again now. It's subtle, but it makes the images look less artificial to me. Each one alone doesn't do much so I wouldn't worry about removing some, or all of them, if you have other important negatives.
As far as optimal model prompts go. I often emphasize photographic and lighting terms first. I weight them pretty heavy.
I use score_8_up instead of score_9 because, in base Pony, score_9 produces more paint-like/masterpiece results, whereas score_8_up tends to produce more high-quality 3D results, which are the closest to photorealism.
I use danbooru tags with underscores wherever they exist(except for cases where a multi-word concept that has a danbooru tag has a lot of context that could be found in base sdxl 1.0) and spaces otherwise.
You may need to weight some concepts heavier in this model, than in base Pony. This model only modifies the U-Net, and the CLIP is unaltered from base Pony, so I didn't introduce any totally new concepts from models I merged with.
I have a bunch of 3D danbooru tags in the negative because, with my choice of positive score tag, the model can drift towards bad CGI if they aren't there.
Having hands heavily weighted in the negative tends to put them behind objects, out of frame, or into simpler poses that are less likely to be mutated.
@bot_i_celli Amazing, thank you so much!
Pony has a lot of artists/characters built in, but the creator hashed a lot of them. Some people found out that they can be invoked with combinations of random letters. You can find more about this here: https://github.com/6DammK9/nai-anime-pure-negative-prompt/blob/4ad12234426f310c6d267c3a419691c55466dad6/ch02/pony_sd.md
@bot_i_celli You said "I use danbooru tags with underscores wherever they exist(except for cases where a multi-word concept that has a danbooru tag has a lot of context that could be found in base sdxl 1.0)" in a comment in this thread - can you expand on where to find the tags and concepts that are in a/your model? I tried briefly to find such a list and came up with nothing. :(
Thanks, and thanks for your model - it's been pretty dang awesome with my attempts on it so far!
@novakard I don't know all the boorus that PonyDiffusionV6 XL drew upon for training data, but my go to search is https://rule34.xxx/index.php?page=tags&s=list
Type in a word with asterisks around it to act as wildcards, to find all tags containing it, then sort by descending and total count. Generally, Pony models will know a tag if it has more than ~1000 posts there, relative to how much they've been diluted from the base model.
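The same rule of thumb can be applied offline to a tag list that has already been saved. This is only a sketch: the tag counts below are invented for illustration, the helper name is hypothetical, and the ~1000 post cutoff is the approximate figure from the comment above, not a hard limit.

```python
from fnmatch import fnmatchcase

# Invented tag counts; real numbers would come from a booru tag listing.
TAG_COUNTS = {
    "looking_at_viewer": 250000,
    "looking_back": 90000,
    "looking_through_window": 400,
    "overlooking_city": 120,
}


def likely_known_tags(pattern, counts, min_posts=1000):
    """Tags matching a *wildcard* pattern with more than min_posts, most used first."""
    hits = [
        (tag, n)
        for tag, n in counts.items()
        if fnmatchcase(tag, pattern) and n > min_posts
    ]
    return sorted(hits, key=lambda item: item[1], reverse=True)


print(likely_known_tags("*looking*", TAG_COUNTS))
# [('looking_at_viewer', 250000), ('looking_back', 90000)]
```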
I was told you couldn't mix SDXL and pony models... so was that not true? Or do you need to do something special in order for it to not become terrible?
Are you taking the Pony CLIP, or merging the CLIPs as well?
CLIP is completely unaltered. I started with a semi-photoreal mix that was an add-difference merge of Pony+(SDXXXL-SDXLbase), applied the Pony PhotoStyle Lora, RMSDXL, and a 256 dim Lora extracted from RealVis 4.0. I subtracted 25% of Animagine to reduce the anime style, and replaced it with ArtUniverse, but this created horrid faces that needed negative Handsomizer to fix. From there, I did several rounds of masked DARE merging.
https://github.com/54rt1n/ComfyUI-DareMerge
With each merge, I masked the 50% of weights with the greatest distance from the original Pony to keep them. This is a strange process, as it looks like it destroys the model, but you just have to keep going. Then I added back the 50% of weights from my original semi-photoreal mix that were the closest to Pony, and it fixed it. I did that twice, each with a different subset of the checkpoints listed, and merged the results back together 67/33 to get the final mix.
Those ComfyUI nodes have a bug where you need to update the model in memory every time you run it, or it will start producing unpredictable results. I forced this by simple merging every model with itself, by a random ratio each time.
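As a rough illustration of the steps described in this comment, here is a sketch on toy lists of numbers. It is not the ComfyUI-DareMerge implementation. The function names and values are made up, and the "add back" step is simplified to filling every position the mask did not keep, which only approximates "the 50% closest to Pony".

```python
def add_difference(a, b, c, alpha=1.0):
    """Add-difference merge: A + alpha * (B - C), element by element."""
    return [x + alpha * (y - z) for x, y, z in zip(a, b, c)]


def farthest_half_mask(model, base):
    """True for the 50% of weights whose absolute distance from base is largest."""
    dist = [abs(m - b) for m, b in zip(model, base)]
    order = sorted(range(len(dist)), key=lambda i: dist[i], reverse=True)
    keep = set(order[: len(dist) // 2])
    return [i in keep for i in range(len(dist))]


def masked_combine(primary, fallback, mask):
    """Take primary where the mask is True and fallback elsewhere."""
    return [p if m else f for p, f, m in zip(primary, fallback, mask)]


# Toy stand-ins: base Pony, a merge in progress, and the semi-photoreal mix.
pony = [0.0, 0.0, 0.0, 0.0]
in_progress = [0.9, 0.1, -0.5, 0.05]
semi_photoreal = [0.2, 0.2, 0.2, 0.2]

mask = farthest_half_mask(in_progress, pony)
result = masked_combine(in_progress, semi_photoreal, mask)
print(mask)    # [True, False, True, False]
print(result)  # [0.9, 0.2, -0.5, 0.2]
```

Here add_difference corresponds to the Pony+(SDXXXL-SDXLbase) starting point, with a as Pony, b as SDXXXL and c as SDXL base.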
@bot_i_celli I think you've created an amazing base. I need to do further testing, but this could leap us forward with SDXL/Pony merges. It's really good so far.
My friend, you deserve a medal! I really didn't think a Pony-based merge could actually generate good realistic results. You really achieved a great model. The model also has good adaptability, for those who know how to play with the prompts and the CFG scale.
That wall hindering Pony-based realism just got conquered here.
wth?
(gpo, aca, aer, api, fla, gcx, hll, hnj, gpc, fii, fey, fbv, evg, iew, ifl, igh, iwj, iwp, ixb, ixe, ixz, jaf, jbm, jfb, jsf, jyk, kmz, ksh, kxg, kzg, lbv, zac, yle, zmj, szw, uiw, vfe, par, pdl, qdl, mbo, mtd, gor, bhz, dit, frw, fnaf, bmo, zbi:0.5)
Another comment already asked: they're hashed tokens. https://rentry.org/ponyxl_loras_n_stuff#reverse-engineered-hashed-tokens Some produce somewhat artistic styles, but those ones in particular produce nonsense, so in the negative they subtly improve the quality of the image. They aren't essential. You can take them or leave them, but I think they help.
@bot_i_celli Ah i made a te-lora for that but you already merged that too now that I checked your merge list.
What is with the underscores? Pony was trained with filtering out of underscores. The special tokens are an exception cause they are special tokens but you don't even use the explicit one correctly. It is rating_explicit and even if it was rating:explicit you would have to escape the ':' with backslash '\:' for it to count as a character in prompt and not as a weights/timestep modifier.
Not according to the details from the Pony page: Pony Diffusion V6 XL - V6 (start with this one) | Stable Diffusion Checkpoint | Civitai
@galaxytimemachine
I can assure you the source and rating tags are the only things meant to use underscores. The preview prompts provided by the author don't use underscores at any point aside from the source and rating tags.
Searching on their own server they even state spaces should be used.
AstraliteHeart — 03/07/2023 9:58 PM
you should put spaces instead of underscores
AstraliteHeart — 03/07/2023 9:58 PM
but underscores generally work
Truly an incredible model. Makes any lora character realistic.
Every time I try to use the recommended hires fix settings, it gives me a blurred image that I can't decipher.
This model is amazing!
This is incredible! I do wish that it made better locations though in future updates.