This is a test of training a Flux model, and all licensing conditions for the FLUX.1-dev model apply. You can find the details here: https://huggingface.co/black-forest-labs/FLUX.1-dev. It's a great model!
A huge thank you to Datacrunch.io for generously providing their resources for training and support. I highly recommend them.
This is Checkpoint. CLIP and VAE is already on
This is my model trained on Flux, and it aims to create more diverse and realistic faces. The results show improvements in anatomy, facial features, and overall realism. However, a side effect is that noise appears in the images. This can be corrected with any upscaler - just find one that works for you. The model was trained on approximately 7,000 images over the course of several hours using powerful H100 cards. If the test results are mostly positive, I will train it on a larger dataset.
The number of training steps was limited due to budget constraints. Additionally, I used only a small portion of my dataset, 7,000 images, to reduce training time. When I get the opportunity, I plan to train a more refined model. You can use this model as you like, but please remember to adhere to the Black Forest Labs license terms.
I welcome any feedback in the comments - feel free to share any errors or shortcomings of the model. This is just an experiment.
Description
I am excited to announce that a GGUF version of the Flux_realistic_SaMay_v2 model is now available. It has lost some details compared to the full version.
Comments (84)
Very nice model, thank you.
Can you please post FP8 UNET only?
Having FP8 would be nice.
But GGUF Q8_0, would be better.
It's much closer in quality to the full size model, and about the same size as FP8.
@jr81 I know, but FP8 can do much faster generations with 4000 series cards (fast Flux optimisations)
Full Precision UNET only would be great :-) thank you for your work
It's already there ! It's the 22 GB one.
What we're missing here is a GGUF Q8_0. The one we have is Q4_0, I think.
Can you please add a GGUF Q8_0 ?
Your GGUF model is Q4_0, judging by the size of it.
But Q8_0 is much closer in quality to the full size model, and still about half its size.
Hey Q4 is incredibly useful too, so best to have it too
Yeah, Q4_0 is good quality for its size.
I also use it, along with Q8_0 :)
This Model is Bananas , B-A-N-A-N-A-S
Without B it would be Pineapple.
Thanks, I hope you can make continuous improvements
Compared to Acorn is Spinning Flux, this one shines in diversity of composition, and is on par in realism. And what's most important, it really respects the prompt! The downside would be rare insignificant artifacts and messed up logic sometimes (got a boat swimming crosswise its own trail).
In case someone else runs into the same issue I had in SwarmUI, where the GGUF model wouldn't load properly (even though several other GGUF models did):
I needed to update the ComfyUI-GGUF custom node. Then it loaded correctly.
What quant is that GGUF? 4? Can we get an 8?
The realism of the images that can be produced by your Flux Model is extraordinary, but there are shortcomings in terms of creating the anatomy of the hands and feet
how come the fluxdevs can steal data from civit AI but nobody is allowed to steal from their model. How can this be legal?
Screw the flux license.
Agree. Corporations get away with far too much.. They always steal from everyone, customers, people working for them. It's a big systemic scam, and everyone below the top is gettin played.
All AI images are stolen from Artists. Fuck AI
What evidence do you have??? You realize right all AI images are stolen from real Artists
@KingArthur88 Technically not true. I mean, in terms of art content, yea. But there's also a lot of RL images. I get where you're coming from, cuz I had a bad attitude towards this stuff before I got into it. And I agree that it's had a bad effect on art overall. It's gonna destroy artistic ambitions in the future. Hell, I was just getting halfway decent myself on my graphic tablet, but there's not much sense in putting in all that effort when it can be done with a generator.
I suppose if anything, real art will eventually become, real artists training their own models/loras and selling those with special license to reproduce their styles.
Well, if you really want to be mad at anyone, get mad at those responsible for AI being used to replace artists and musicians instead of CEO's and greedy/corrupt politicians.
@KingArthur88 well try rendering the "Lays Chips" . You can see that the results are almost 99.99% identical from the images rendered with "Lays Chips SDXL Lora" using SDXL. They have obviously stolen the data from this and merged it into their own FluxDEV model. How is it legal for them to steal this LORA Data and then make a "You cant use it hehe"-license on it?
@cathyleverman One word, Lobbying. The wealthy make the rules. It's like letting a five year old make the rules when playing monopoly. No one is gonna win/have fun but the kid. Same concept for the wealthy elite under capitalism(AKA; socialism for the wealthy). Have you actually read legal paperwork? It's practically in Chinese. Cuz if any normal person could understand it, a lot of people would stand up and go 'wtf is this shit..?' The legal system is built in such a way to not be understood by the common man.
@Lazman its quantum law. You are both allowed and not allowed to do anything, depending on what whims the judge will sentence you on :D
@cathyleverman Ain't that Schrödinger's cat?
@Lazman it would be fun to find what other LORAS they have stolen and then rebranded in a restrictive license that benefits only them.
This type of license fraud and license terror seems to be quite a common new behavior in the Generative AI world.
@cathyleverman Corporations and the wealthy stealing the work of those with no money and/or those that work for them, and branding it as their own, is nothing new. Steve Jobs didn't invent the iPhone, he had no damn part in it, and when the idea was first brought to him, he rejected it. But the prick acted like he made it himself. It wasn't even Apple that made the thing, not entirely. It was a company that they bought up, which probably only went under in the first place cuz of corporations like Apple controlling the market and pricing them outta business. Jobs didn't even invent the Macintosh, that was Wozniak. Jobs just took all the credit, that guy was just a sleazeball in a turtleneck.
@Lazman pretty much
@cathyleverman That isn't to say we should give up on raising awareness of such practices though. People have grown unbelievably numb to corporate control. In 1984, people in America literally rioted in the streets, I mean like, filling the streets with people, over their distaste for the flavor of 'new coke'. These days, the richies double our rents in just a few years, and the most I've even seen is a parade of like, 30 people bashing Trudoh.
Corps and the gov have learned to master a mix of psychology and diplomacy, and it's almost impossible to fight against the passive aggressive nature of it all..
By any chance, would you mind sharing the config you used for this? Or some simple stats like step count, learning rate, learning scheduler, optimizer used, etc? Thanks!
Thanks for this. Any chance of a UNET only version of V2 ?
The 22 GB model is UNET only (in FP16).
Sorry, I should have mentioned that FP8 would be good. It means your model will work well in 16 GB, possibly 12 GB, of VRAM at a decent quality and speed. GGUF is OK, but not as fast as FP8 and possibly lower quality. Your V1 UNET-only model is 11 GB and works really well.
I've just managed to convert your FP16 V2 to an FP8 version using a python script ..so far it seems to work well.
@jr81 weird because forge asks for T5/clip etc when i use the 22gb one
@Nmdrcn The UNET model requires T5/clip and vae.
A "checkpoint" has all included ( at least that's the term used in ComfyUI ).
But I think there are only Flux checkpoints in FP8 ( about 17GB )
@jr81 forge is different. only |unet only| models don't seem to ask for t5/clip and vae. This 22gb model asks for them regardless
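One way to settle whether a given .safetensors file is UNET only or an all-in-one checkpoint is to look at the tensor names in its header. The sketch below uses only the standard library and relies on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). The helper names are made up for this example. Which prefixes to expect is an assumption: ComfyUI-style all-in-one Flux checkpoints are commonly laid out with names starting with model, text_encoders and vae, but other tools may name things differently.

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def count_key_prefixes(header, depth=1):
    """Count tensors by the first `depth` dot-separated parts of their names."""
    counts = Counter()
    for name in header:
        if name == "__metadata__":  # optional metadata entry, not a tensor
            continue
        counts[".".join(name.split(".")[:depth])] += 1
    return dict(counts)


def summarize(path, depth=1):
    """Hypothetical helper: prefix counts for one .safetensors file."""
    return count_key_prefixes(read_safetensors_header(path), depth)
```

Calling summarize on the downloaded file and seeing only transformer-style names would suggest that T5/CLIP and the VAE have to be loaded separately, while text encoder and VAE prefixes alongside the model prefix would suggest they are baked in.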
When I run it on Forgeui, it is extremely slow. It shows like 1hr for a generation, even though I have a 4090. What settings do you suggest I use for it to run?
Any chance you can upload this on tensor as well?
The compact model v2 causes this error message: 'NoneType' object has no attribute 'tokenize'
same happens to me
+1
Kohya scripts failed to train fluxRealistic_fluxRealisticSamayV2
how did you train it?
Couldn't get the V2 22 GB model to work in my Comfy Flux workflow. Results looked really promising but not ready for prime time.
It worked in ForgeUI for me, I use a 12GB GPU (RTX 4070 Ti)
Take your GPU's VRAM in MB, subtract 4096, and enter the result as GPU Weights. In my case, I took 4096 out of 12288 for the GPU Weights. Then set Swap Method to Queue and Swap Location to CPU.
My computer also has 32 GB of RAM, and I added another 32 GB as virtual memory. Sometimes it crashes, but it works most of the time.
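The arithmetic behind that tip can be written down as a tiny helper. This is only a sketch of the commenter's rule of thumb: the 4096 MB reserve is their number, not an official Forge recommendation, and the function name is made up for this example.

```python
def forge_gpu_weights_mb(total_vram_mb, reserve_mb=4096):
    """Value to try for Forge's GPU Weights setting: total VRAM minus a reserve."""
    if reserve_mb >= total_vram_mb:
        raise ValueError("reserve must be smaller than total VRAM")
    return total_vram_mb - reserve_mb


# A 12 GB card (12288 MB) with the 4096 MB reserve from the comment above.
print(forge_gpu_weights_mb(12288))  # 8192
```

On a card with more or less VRAM, only the first argument changes; whether 4096 MB is the right reserve for a given setup is something to find by trial.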
Can you tell us what tools were used for training?
what is the learning rate and train step for the 7000 images in your training
What are the recommended generation settings?
I would also like to know the official settings for this model.
But also, what custom settings do the users use.
Can you also add a Q8_0 version ?
It would look closer to the FP16 version.
What quantization is the current GGUF? I usually use a Q5_K_S or Q5_K_M
This is probably Q4
You can tell by the file size.
Here are all Flux dev quants, by the ComfyUI-GGUF extension developer ( take a look at the file sizes ):
https://huggingface.co/city96/FLUX.1-dev-gguf/tree/main
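The "tell by the file size" trick can be turned into a rough calculation. Q4_0 and Q8_0 store weights in blocks of 32 with one 16-bit scale per block, which works out to 4.5 and 8.5 bits per weight. The sketch below assumes a transformer of roughly 12 billion parameters, which is an approximation for Flux dev and not a number from this page, and the function names are made up for the example. Real GGUF files keep some tensors at higher precision, so actual sizes will be a little off.

```python
# Bits per weight: Q4_0 = 18 bytes per 32 weights, Q8_0 = 34 bytes per 32 weights.
BITS_PER_WEIGHT = {"F16": 16.0, "Q8_0": 8.5, "Q4_0": 4.5}

# Assumption: roughly 12 billion parameters in the Flux dev transformer.
ASSUMED_PARAMS = 12e9


def estimate_size_gb(quant, n_params=ASSUMED_PARAMS):
    """Approximate file size in GB (10^9 bytes) for a given quantization."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9


def guess_quant(file_size_gb, n_params=ASSUMED_PARAMS):
    """Pick the quantization whose estimated size is closest to the file size."""
    return min(
        BITS_PER_WEIGHT,
        key=lambda q: abs(estimate_size_gb(q, n_params) - file_size_gb),
    )


for quant in BITS_PER_WEIGHT:
    print(quant, round(estimate_size_gb(quant), 2), "GB")
```

Under these assumptions a GGUF file of around 7 GB points to Q4_0 and one of around 13 GB to Q8_0, which matches the observation in the comments that Q8_0 is about half the size of the FP16 model.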
@jr81 Thank you!
Should we expect version 3?
Version 3.2 is almost done, it's trained on 20,000 images, but I'm not sure I'll be publishing it yet
@Sa_May Please publish it :-) :-(
@Sa_May ...oh! yes please, publish it. v1 and v2 are showing a great trend :-)
Would really appreciate a Q8_0 quant of this. Right now there's the less accurate quants for low VRAM users, and the fp16 non quantized version for people with extreme amounts of VRAM, but none for us in the middle :(
Is it possible to quantize models like this one using llama.cpp? It has a mini program that quantizes models to q_whatever, but I've only tested it with LLM so I'm not sure if there's a different process for diffusion models.
@jtabox From what I've read, it's similar or identical to quantizing LLMs, which doesn't take long. I can definitely do it, it's just that the base models are so big that they're annoying for me to download, but overall yeah, I'm just lazy. (But the average Civitai user for sure won't do it themselves lol.) I bet this site will soon have its own version of "TheBloke" or "bartowski".
I'm a noob when it comes to stuff like this, and I managed to do it for another checkpoint using this guide.
depending on your level of expertise. Note that you can also import directly from Civitai using the download URL.
You can quantize this or any other model yourself using the optimum.quanto library or bitsandbytes. Here's an example:
import json
from pathlib import Path

import torch
from diffusers import FluxTransformer2DModel
from optimum.quanto import freeze, qint8, quantization_map, quantize

# Path to the downloaded single-file checkpoint (adjust to your own location).
base_model = 'C:/Users/xxxxx/.cache/huggingface/hub/Civitai Models/fluxRealistic_fluxRealisticSamayV2.safetensors'
dtype = torch.bfloat16

# Load only the Flux transformer from the single-file checkpoint.
transformer = FluxTransformer2DModel.from_single_file(base_model, subfolder="transformer", torch_dtype=dtype)

# Quantize the weights to int8, then freeze so they are stored in quantized form.
quantize(transformer, weights=qint8)
freeze(transformer)

save_directory = "./flux-dev/fluxRealistic/fluxtransformer2dmodel_qint8"
transformer.save_pretrained(save_directory)

# Save the quantization map next to the weights.
qmap_name = Path(save_directory, "quanto_qmap.json")
qmap = quantization_map(transformer)
with open(qmap_name, "w", encoding="utf8") as f:
    json.dump(qmap, f, indent=4)
print('Transformer done')
Okay, what difference will Q8_0 make? I checked the Flux GGUF models. Q8_0 is still big and needs a higher amount of VRAM. I don't understand how you can be in the middle. I have a 12 GB card and the Flux GGUF Q8 slows down my PC like hell. I have to just shut it down. For 3 days I haven't had any luck generating anything good or sexy.
@NOOBDA The next logical step is to scale down to something like NF4 to reduce the size (GGUF, bitsandbytes or quanto, your choice).
Great model. Hope you publish v3 version of it
what do you mean by "This is Checkpoint. CLIP and VAE is already on"? Also, can you share your workflow for the gguf one?
bro, you need to download Flux's VAE and the ComfyUI-GGUF node
CLIP and VAE are not included
Import does not work for Mac OS. Suggestions? Tried via downloading file and pasting URL.
What is the going Vid2Vid workflow for this model?
Can this checkpoint be used on webUI?
If you get the SafeTensor run it by itself since it's baked in. If you get the GGUF then make sure to have your vae and clip models (T5 & ClipL) loaded in the text encoder box
***Edit: if you are running Forge***
bro, you need Flux's VAE and CLIP
@a114514lh929 "This is Checkpoint. CLIP and VAE is already on" so we still need one? im using the 10 g version
you should change the sampler to euler + simple
@redforce770387 "AssertionError: You do not have CLIP state dict!" XwX
What sampler do we use? I can't find any info on what steps, cfg or samplers to use
ddim + beta, 18-26 steps
@Sa_May is it applicable for the nf4 model also?
@torc007689 better euler + simple, 18-22
i can't find a version of this that includes the text encoder in the checkpoint, anyone know any models like that? colosus project doesn't do well with faces
fp8?
Does this model not support inpainting? Why are the parts I repaint all gray?