Proteus v0.6
I'm excited to introduce Proteus v0.6, a complete rebuild of my AI image generation model. This is the first version of the rework, and it focuses entirely on photorealism. It isn't aiming to be state-of-the-art, but I believe it's a good step forward in producing high-quality images. Please note that this is a preliminary version, not the final, fully featured checkpoint; more improvements and features will come in future updates.
Overview
Proteus v0.6 is a total rework from the ground up. In previous versions, combining different training methods and learning rates caused the model to become unstable during large-scale training. Learning from those experiences, I've retrained the model using only the photorealism aspects of the Proteus dataset.
For now, I'm calling this new training technique Multi-Perspective Fusion.
Multi-Perspective Fusion
This approach involves:
Training Multiple LoRAs and Full-Parameter Checkpoints: I trained several Low-Rank Adaptation (LoRA) modules and full-parameter checkpoints on the same dataset multiple times to capture different "perspectives" of the data.
Integrating into an Overarching Framework: These varied models are then combined within a larger framework to enhance overall performance.
I'm hoping this method will be interesting to data scientists exploring advanced training techniques.
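The post does not say how the separately trained models are combined, so the following is only a minimal sketch of one common option: a weighted average of parameters across checkpoints that share an architecture. The function name fuse_checkpoints, the parameter names, and the weights are all hypothetical and are not taken from the Proteus training code.

```python
from typing import Dict, List, Sequence

# Assumption: each checkpoint is a mapping from parameter name to a
# flattened list of weights. Real checkpoints would hold tensors instead.
Checkpoint = Dict[str, List[float]]


def fuse_checkpoints(checkpoints: Sequence[Checkpoint],
                     weights: Sequence[float]) -> Checkpoint:
    """Blend several same-architecture checkpoints by weighted averaging."""
    if not checkpoints or len(checkpoints) != len(weights):
        raise ValueError("need exactly one weight per checkpoint")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm = [w / total for w in weights]  # normalise so the weights sum to 1

    fused: Checkpoint = {}
    for key in checkpoints[0]:
        tensors = [ckpt[key] for ckpt in checkpoints]
        size = len(tensors[0])
        if any(len(t) != size for t in tensors):
            raise ValueError(f"shape mismatch for parameter {key!r}")
        fused[key] = [
            sum(w * t[i] for w, t in zip(norm, tensors)) for i in range(size)
        ]
    return fused


# Three hypothetical training runs ("perspectives") on the same dataset.
run_a = {"unet.w": [1.0, 2.0], "unet.b": [0.0]}
run_b = {"unet.w": [3.0, 4.0], "unet.b": [1.0]}
run_c = {"unet.w": [5.0, 0.0], "unet.b": [2.0]}

fused = fuse_checkpoints([run_a, run_b, run_c], weights=[1, 1, 2])
print(fused)
```

Averaging is only one way to combine runs. If the actual framework merges LoRA deltas or uses a different interpolation, the loop body would change but the overall shape (many runs in, one set of weights out) would stay the same.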
Key Improvements in v0.6
Total Rebuild: Constructed entirely from scratch to address previous issues.
Enhanced Photorealism: Focused on producing good-quality photorealistic images.
Stable Training Process: Refined training methods to prevent the model from falling apart during large-scale training.
Preliminary Version: This is the first version of the rework; expect more features and improvements in future releases.
Limitations
No Illustrations or Anime: Currently, the model can't generate illustrations or anime-style images because it's only been trained on photorealistic data.
Not State-of-the-Art: While the model performs well, I'm not claiming it's state-of-the-art, only that it's a good starting point.
Work in Progress: This is not the final, fully featured checkpoint. More updates are planned.
Usage
Recommended Settings
Clip Skip: 1
CFG Scale: 7
Steps: 25 - 50
Sampler: DPM++ 2M SDE
Scheduler: Karras
Resolution: 1024x1024
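For anyone running the model from a script rather than a UI, here is a rough sketch of how these settings might map onto the Hugging Face diffusers library. It assumes the checkpoint is SDXL-based and available as a local .safetensors file; the file path in the usage comment is a placeholder, and the helper names are made up for illustration. Clip Skip 1 in A1111-style UIs means no extra layers are skipped, so the sketch leaves the diffusers default untouched.

```python
def generation_kwargs(prompt: str, steps: int = 30) -> dict:
    """Build pipeline call arguments from the recommended settings."""
    if not 25 <= steps <= 50:
        raise ValueError("recommended step range is 25-50")
    return {
        "prompt": prompt,
        "num_inference_steps": steps,  # Steps: 25-50
        "guidance_scale": 7.0,         # CFG Scale: 7
        "width": 1024,                 # Resolution: 1024x1024
        "height": 1024,
    }


def load_pipeline(checkpoint_path: str):
    """Load an SDXL checkpoint with a DPM++ 2M SDE Karras sampler.

    Assumption: the checkpoint is SDXL-compatible. torch and diffusers are
    imported here so the settings above can be inspected without them.
    """
    import torch
    from diffusers import DPMSolverMultistepScheduler, StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_single_file(
        checkpoint_path, torch_dtype=torch.float16
    )
    # Sampler: DPM++ 2M SDE, Scheduler: Karras
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(
        pipe.scheduler.config,
        algorithm_type="sde-dpmsolver++",
        use_karras_sigmas=True,
    )
    return pipe.to("cuda")


# Usage (needs a GPU and a local checkpoint; the path is a placeholder):
# pipe = load_pipeline("path/to/proteus_v06.safetensors")
# image = pipe(**generation_kwargs("portrait photo of an old fisherman")).images[0]
# image.save("out.png")
```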
Versions before v0.6
Proteus's Background
Proteus is an enhancement of OpenDalleV1.1 that builds on its core functionality, with the main gains in prompt responsiveness and creative range. It was fine-tuned on approximately 220,000 GPTV-captioned images drawn from copyright-free stock images (with some anime included), which were then normalized. In addition, DPO (Direct Preference Optimization) was applied using a collection of 10,000 carefully selected, high-quality, AI-generated image pairs. To get the best performance, numerous LoRA (Low-Rank Adaptation) models were trained independently and then selectively incorporated into the principal model via dynamic application methods. These methods target particular segments of the model while avoiding interference with other areas during learning. As a result, Proteus shows marked improvements in rendering intricate facial characteristics and lifelike skin textures, while remaining proficient across a range of aesthetic domains, notably surrealism, anime, and cartoon-style images.
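The "dynamic application methods" are not spelled out here, so the following is just an illustrative sketch of the general idea of merging a LoRA into only some parts of a model. It uses the standard LoRA update (weight plus scale times up-matrix times down-matrix) and restricts it to layers whose names start with chosen prefixes. The layer names, the scale, and the merge_lora helper are hypothetical.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product, enough for a tiny illustration."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(base: Dict[str, Matrix],
               lora: Dict[str, Tuple[Matrix, Matrix]],
               scale: float,
               target_prefixes: Tuple[str, ...]) -> Dict[str, Matrix]:
    """Add scale * (up @ down) to targeted layers; leave the rest untouched."""
    merged = {name: [row[:] for row in w] for name, w in base.items()}
    for name, (down, up) in lora.items():
        if name not in merged or not name.startswith(target_prefixes):
            continue  # this layer is outside the targeted segment
        delta = matmul(up, down)  # (out x rank) @ (rank x in)
        merged[name] = [
            [w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(merged[name], delta)
        ]
    return merged


# Hypothetical two-layer model and a rank-1 LoRA covering both layers.
base = {
    "attn.q": [[1.0, 0.0], [0.0, 1.0]],
    "mlp.fc": [[1.0, 1.0], [1.0, 1.0]],
}
down = [[1.0, 2.0]]    # rank x in
up = [[1.0], [0.5]]    # out x rank
lora = {"attn.q": (down, up), "mlp.fc": (down, up)}

# Only attention layers are targeted, so "mlp.fc" stays as it was.
merged = merge_lora(base, lora, scale=0.5, target_prefixes=("attn.",))
print(merged)
```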
Description
Merged with RealCartoonXL at a weight of just 0.5%, using custom scripts with slerp-like methods, to fix the model's inability to understand tags related to anime or cartoon styles.
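The custom scripts themselves aren't published in this description, but textbook slerp (spherical linear interpolation) between two weight vectors looks roughly like the sketch below. The vectors are toy values standing in for flattened model weights, and t = 0.005 corresponds to the 0.5% weight mentioned above. The fallback to plain linear interpolation for near-parallel or zero vectors is an assumption about how such scripts usually handle edge cases.

```python
import math
from typing import List


def slerp(a: List[float], b: List[float], t: float, eps: float = 1e-8) -> List[float]:
    """Spherically interpolate from a (t=0) to b (t=1)."""
    def lerp() -> List[float]:
        return [(1.0 - t) * x + t * y for x, y in zip(a, b)]

    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a < eps or norm_b < eps:
        return lerp()  # angle undefined for a zero vector

    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard against rounding
    omega = math.acos(cos_omega)                # angle between the vectors
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        return lerp()  # vectors are (anti)parallel; slerp is ill-conditioned

    weight_a = math.sin((1.0 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]


# Toy stand-ins for one flattened layer from each model.
proteus_layer = [1.0, 0.0]
cartoon_layer = [0.0, 1.0]

halfway = slerp(proteus_layer, cartoon_layer, 0.5)
merged_layer = slerp(proteus_layer, cartoon_layer, 0.005)  # 0.5% weight
print(halfway, merged_layer)
```

At such a small t the merged weights stay very close to the original model, which fits the description of the merge as a light touch that restores tag understanding rather than a change of overall style.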
Version 0.2 shows subtle yet significant improvements over Version 0.1. It demonstrates enhanced prompt understanding that surpasses MJ6, while also approaching its stylistic capabilities.
Comments (19)
Do you use clip skip 1 or 2? I have found that with 2 the images always turn out better on version 0.1.
I'm having better luck with DPM++ 2M SDE Karras at 70 steps than the 3M version. Even with the same seed, fingers will be correct whereas they're not quite with 3M. Compare these 2, with his left hand. https://imgur.com/8Ka2NLZ and https://imgur.com/N6vhDGT
70 steps sounds like a massive overkill, especially with SDE. Run a grid to see if you're actually adding any relevant details to the image. Also, try the Restart sampler with 10-15 steps.
I think you did a good job with this one. I've created an automated pipeline for myself that generates prompts through LLMs and if it's a photo, it pushes it through opendalle and if it's art, it pushes it through azazeal's voodoo which is opendalle but with a bunch of art models merged in. I feel like this 0.2 proteus adds back in the artwork again and softens up the severe photorealistic bias that was happening in opendalle probably from the JuggernautXL's influence. Good stuff.
i love your SatanXL ... i mean OpenDalle ... I mean Proteus checkpoint bro.
Absolutely the best of all models
Fantastic thank you! Do you plan to do something similar with SDXL Turbo model?
Just merge it with 30-50 percent Turbo and you are done! :)
Do I have to use the trigger word ~*~aesthetic~*~ with this model? The images on civitai don't seem to use it.
I want to know this too!
I haven't noticed a huge difference one way or the other. I did a test you can see here: https://civitai.com/posts/1518594
(Please be sure to hit like :-) )
Hello there, thank you for this wonderful model! I really like it. But I have the feeling that OpenDalle is quite a bit "stronger", I mean, perhaps a little better than Proteus when it comes to realism? Proteus seems to keep the images stylized when I compare them to OpenDalle results, or perhaps my prompts are bad? Should I add some negatives? I use the trigger as well.
I agree, I think Proteus is overfitting, but that's only my feeling.
100% agree. Even compared with Proteus v3, OpenDalle just creates better results with fewer errors and interprets the prompts way better.
@thetibolus233 do you know if OpenDalle is better for text? I grabbed this model for that reason after seeing people say it was good for it, but I'm finding it no better than other models.
@skullzy77 I have never used it with text. SD XL 1.0 is in general not that good in terms of text. The new Stable Cascade does a way better job here, just wait for it. But OpenDalle is still far ahead in all my projects with automated prompts. Insane...
OpenDalle is still, for me, the best model over here! Stable Cascade though... hmm, I'm waiting for it haha
@mythosttart445 is this OpenDalle the same OpenDalle found on civitai? I see a lot of raving about it, but the generation outputs shown on the OD page don't seem so great. I'm wondering if it is better in actual use, or if there is another OpenDalle checkpoint somewhere else?
Which OpenDalle version do you feel is better? Also, I got good results with Proteus v2. Do you think Proteus v3 is better than the v2 model?