[Note: Unzip the download to get the GGUF. Civit doesn't support it natively, hence this workaround]
GGUF version of FluxUnchained by socalguitarist. Credit goes to him for tuning this model. I converted it to GGUF using a modified version of this script.
It can be used in ComfyUI with this custom node or with Forge UI.
See https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/1050 to learn more about Forge UI GGUF support and also where to download the VAE, clip_l and t5xxl models.
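If you would rather script the unzip step from the note at the top, here is a minimal Python sketch. It only assumes the download is a regular zip with a .gguf file inside; the function name and any paths you pass in are placeholders, not the actual file names on this page.

```python
import zipfile
from pathlib import Path


def extract_gguf(zip_path, dest_dir):
    """Extract every .gguf member of zip_path into dest_dir and return the paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            # Skip anything in the archive that is not the model file itself.
            if member.lower().endswith(".gguf"):
                extracted.append(Path(archive.extract(member, dest)))
    return extracted
```

Call it with the path of the downloaded zip and the models folder your UI reads from, for example `extract_gguf("my-download.zip", "path/to/models")` (both names are hypothetical).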
Which model should I download?
[Current situation: Using the updated Forge UI and Comfy UI (GGUF node) I can run Q8_0 on my 11GB 1080ti.]
Download the one that fits in your VRAM. The additional inference cost is quite small if the model fits in the GPU. Size order is Q4_0 < Q4_1 < Q5_0 < Q5_1 < Q8_0.
Q4_0 and Q4_1 should fit in 8 GB VRAM
Q5_0 and Q5_1 should fit in 11 GB VRAM
Q8_0 if you have more!
Note: With CPU offloading, you will be able to run a model even if it doesn't fit in your VRAM.
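The rule of thumb above can be written down as a small Python helper. This is only a sketch of the guidance on this page: the thresholds are the 8 GB and 11 GB figures listed above, picking the larger quant of each pair is my own assumption, and the function name is made up for illustration.

```python
def pick_quant(vram_gb):
    """Map available VRAM (in GB) to a quant, following the list on this page."""
    if vram_gb > 11:
        return "Q8_0"
    if vram_gb >= 11:
        return "Q5_1"  # Q5_0 is the slightly smaller alternative
    if vram_gb >= 8:
        return "Q4_1"  # Q4_0 is the slightly smaller alternative
    # Below 8 GB this page only says CPU offloading can still make a model run,
    # so fall back to the smallest file.
    return "Q4_0"


for vram in (6, 8, 11, 24):
    print(vram, "GB ->", pick_quant(vram))
```

Treat the result as a starting point. As noted above, with the updated Forge UI and ComfyUI GGUF node, Q8_0 also ran on an 11 GB card.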
Updates
V2: I created the original (v1) from an fp8 checkpoint. Due to double quantization, it accumulated more errors. So I found that v1 couldn't produce sharp images. For v2 I manually merged the bf16 Dev checkpoint and then made the GGUF. This version can produce more details and much crisper results.
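To illustrate why quantizing an already quantized checkpoint loses more detail, here is a toy Python sketch. It is not the conversion script used for this model. `q8_0_roundtrip` mimics the Q8_0 idea (blocks of 32 weights, one scale per block, int8 values) but keeps the scale at full precision, `coarse_float_roundtrip` is a crude stand-in for an fp8 checkpoint that simply keeps 3 mantissa bits, and the weights are random numbers rather than real model weights.

```python
import math
import random

BLOCK = 32  # Q8_0 groups weights into blocks of 32 with one scale per block


def q8_0_roundtrip(values):
    """Quantize to int8 with a per-block scale, then dequantize (Q8_0-style)."""
    out = []
    for start in range(0, len(values), BLOCK):
        block = values[start:start + BLOCK]
        amax = max(abs(v) for v in block)
        scale = amax / 127 if amax else 1.0
        out.extend(round(v / scale) * scale for v in block)
    return out


def coarse_float_roundtrip(values, mantissa_bits=3):
    """Crude stand-in for an fp8 checkpoint: keep only a few mantissa bits."""
    steps = 2 ** (mantissa_bits + 1)
    out = []
    for v in values:
        mantissa, exponent = math.frexp(v)
        out.append(math.ldexp(round(mantissa * steps) / steps, exponent))
    return out


def rms_error(reference, approx):
    """Root-mean-square difference between two equally long lists."""
    total = sum((r - a) ** 2 for r, a in zip(reference, approx))
    return math.sqrt(total / len(reference))


rng = random.Random(0)
weights = [rng.uniform(-1.0, 1.0) for _ in range(1024)]

single = rms_error(weights, q8_0_roundtrip(weights))
double = rms_error(weights, q8_0_roundtrip(coarse_float_roundtrip(weights)))
print(f"direct: {single:.5f}  via coarse checkpoint: {double:.5f}")
```

In this toy setup the second path carries the rounding error of both steps, which is the effect described for v1.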
All the license terms associated with Flux.1 Dev apply.
Description
Bigger than Q5_0, should be better
Comments (95)
I have one question.. why is this model a zip file not a Safetensor file?
It's because it's a GGUF quant and Civitai doesn't support this format yet. So this is a workaround.
Just unzip it, you'd get the GGUF. Civit doesn't support this format, hence this workaround.
@nakif0968 (sigh) unzips....
Q8 has 4gb file inside. Looks like an error.
What's the size of the zip file? Did you download the correct version https://civitai.com/models/662112?modelVersionId=748232 ?
Zip file is 12gb, but after unzipping it's 4gb for some reason
Perfect🔥🔥🔥.
Now we only need to work on the Ace Hole not looking like an Hole-less meat whirlpool and the Vanjana not looking like a smashed hotdog.
Forge works perfectly, thanks to the author for the good model. Under dpm++2m, 8 steps gives very good results; 2070s 8g, 36 seconds
Hi, could you tell how you got it working? Where did you download vae, and which T5 model you downloaded? Could you give a link or name?
I can't see the gguf model in the Forge checkpoints list. Yes, the model is in webui\models\Stable-Diffusion
@leclettico912 Is your Forge up to date? https://github.com/lllyasviel/stable-diffusion-webui-forge
Running ComfyUI on Linux, I have run the pip upgrade and made sure ComfyUI is updated, but the GGUF nodes keep failing to import. Does anyone know of a fix?
Unable to run in Forge UI. It gives me the following error: "AssertionError: You do not have CLIP state dict!"
You need to download T5, CLIP-L and VAE separately and put in the corresponding folders. See here for more info https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/1050
Also, be sure to load them in the UI (along with setting the checkpoint). [How: See the pic Illyasviel shared in the above link.]
@nakif0968 Thanks a lot mate. This is really helpful.
@Light_Saber No problem.
@nakif0968 Can you provide more details? There is no mention of clip_l on this page at all.
@velanteg from that page: "Put clip-l and t5 in models\text_encoder"
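A quick way to check the folder layout discussed in this thread is a small Python sketch like the one below. The checkpoint and text encoder folders are spelled as in the comments here; the `models/VAE` folder, the file suffixes, and the function name are my assumptions, so verify them against the Forge discussion linked above.

```python
from pathlib import Path

# label -> (folder relative to the Forge webui directory, minimum number of files)
EXPECTED = {
    "checkpoint": ("models/Stable-Diffusion", 1),  # spelling as in the comment above
    "text encoders (clip-l and t5)": ("models/text_encoder", 2),
    "vae": ("models/VAE", 1),  # assumed folder name, not stated on this page
}
MODEL_SUFFIXES = (".gguf", ".safetensors")  # assumed set of model file types


def incomplete_folders(webui_root):
    """Return the labels whose folder holds fewer model files than expected."""
    root = Path(webui_root)
    problems = []
    for label, (relative, minimum) in EXPECTED.items():
        folder = root / relative
        found = 0
        if folder.is_dir():
            found = sum(
                1 for p in folder.iterdir() if p.suffix.lower() in MODEL_SUFFIXES
            )
        if found < minimum:
            problems.append(label)
    return problems
```

It only counts files, so it cannot tell whether the right models were downloaded, and you still have to select them in the UI along with the checkpoint.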
I've downloaded t5, clip and VAE, but after this I get an error: ValueError: Failed to recognize model type!
Any chance for a Q2 or Q3 version?
If the gguf-py library adds support for them I can give it a try. But for now, I don't know how to do that.
@nakif0968 Thanks for the consideration. I'm just getting into gguf as well. I'll take a look as well.
Unpacked into the models folder, but Forge does not see the gguf checkpoint.
can you turn on on site generation? i really wanna try this one!
Civit doesn't support any Flux base model yet. That being said, this one's meant to speed up local generations anyway.
@nakif0968 ah gotcha. that explains why i can only find 2 checkpoint flux models that support on site generation. thanks for the info 💪
Thanks for your work! Any recommendation of number of STEPS and sampler? I got incredible results in Comfyui with steps 8 and EULER/Normal, 768x1024, on my 6G VRAM card, using Q3. I can load Q4 (my NVIDIA driver supports offloading VRAM to RAM), but outputs are a bit pixelated. It only gets better with 15 or more steps, but not so sharp as Q3. Is there a way to use PASEER or those lightning loras? Thanks!
Euler + Simple schedule. I haven't tried anything else.
For some reason, your model is downloaded as an archive and, probably, that's why after it is unzipped, my hash does not match what is listed on the site. Because of this, the model is not displayed in the resources when uploading an image to CivitAI. Of course, I can manually insert the correct hash of the model into each png, but it's boring((
Maybe because it's a zip file. If Civit supported .gguf it wouldn't be a problem.
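To see the mismatch for yourself, you can hash both files. This is a minimal Python sketch using SHA-256; the function name is made up, and which hash variant the site displays for a given file is not something this page states.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to save memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Running `sha256_of` on the downloaded zip and on the extracted .gguf gives two different digests, because the zip is a different byte stream from the file packed inside it.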
One of the best. Clear and sharp, while some other Flux versions give me some noise. But I have a problem. Yesterday it worked seamlessly with a Flux lora. But today it's broken: the image is generated but without any lora effect. Can anyone explain this?
nice. used q8 v2 version. on rtx 3060 an 896x1152 image takes about 1:24 minutes. first model to give me tentacles wrapped around diver. thanks for release. see image below.
the girls in this model seem prettier than other flux models. they don't have the square dimpled 'male' look to them... good job.
really good following prompts... better than some of the other flux models i've used.
Can you please include training some Asian female figures and faces, I think it's already excellent for Europe and the US, but the support for Asian characters isn't good enough
Loads really fast.
Generation is slow. over 1min for 20step images.
Q4 generates high quality images and loads fast with 8GB vram.
Can do SFW and basic NSFW.
Nice model over all.
We need faster Flux models + Less v-ram hungry ones, otherwise it's not worth it.
silly question....
is there a prompt/key phrase to get smaller breasts and nipples? i've tried small, tiny, flat chested. seems to have no effect.
thanks in advance.
Has anyone tried using loras with this checkpoint? I'm trying to use AIENGI models with the same prompts and they just don't look right :(
https://civitai.com/user/AIENGI/models
Hi! Do you plan on creating GGUF model for the HyFU-8-Step-Hybrid version of Flux Unchained by SCG?
Please do if you can! Thank You.
May I know how long it takes to render an image with your 1080 11GB GPU?
~12s/step * 20 steps = 240s = 4min
@nakif0968 wow that's not too bad. any tips on optimising it further in Forge UI?
Can't think of any beyond what's already on the Forge UI Github
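The estimate in this thread is just seconds per step times step count. As a tiny Python sketch (the function name is made up, and model loading and VAE decoding time are ignored):

```python
def estimate_seconds(seconds_per_step, steps):
    """Rough sampling time: per-step time multiplied by the number of steps."""
    return seconds_per_step * steps


total = estimate_seconds(12, 20)
print(total, "s =", total / 60, "min")  # prints "240 s = 4.0 min"
```

The same arithmetic run backwards puts the 2070s report further up (36 seconds for 8 steps) at 4.5 s/step.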
I am not saying you are wrong with this comment, I just want to know what I am doing wrong. You say that gguf is faster than safetensor in the safetensor upload, but with my 3090 I am getting 35s/it instead of 1.3s/it that I get with the safetensor. I must be doing something wrong, but i have no idea what
GGUF is not faster than fp8 or fp16, it is slower. That's because it has a compressed data format. The gguf versions are just smaller so they fit in your vram. With a 3090 you should have no problem using fp16...
@Sendrael ahh, right. Does the fp16 have the vae, clip and text encoder all built in? I will try it out in any case. Thanks for letting me know, I appreciate it :)
Also, it seems that the gguf issue fixed itself, it
Edit: could you point me toward the model page with the fp16 to download? I'm having trouble finding it
Edit 2: Ok, found the dev. Now looking for schnell.
Edit 3: what should I put for the VRAM GPU weights slider? The same size as the model file, smaller or larger? and if one of the latter 2, by how much?
Edit 4: I am actually getting .2 s/it slower generation speed using fp16 over the ggufs.
awesome stuff keep up the great work! btw, how do you convert/change from a safetensors to a gguf format, is there any tools to do this you can recommend, thanks! :)
Are these all quants of Flux Unchained version 1.0? The "v2" in the titles is confusing.
V2: I created the original (v1) from an fp8 checkpoint. Due to double quantization, it accumulated more errors. So I found that v1 couldn't produce sharp images. For v2 I manually merged the bf16 Dev checkpoint and then made the GGUF. This version can produce more details and much crisper results.
I've also used the ComfyUI-GGUF tool to convert my own models, but Forge doesn't work properly. Could you please share how you did the gguf conversion?
Use this fork https://github.com/mhnakif/ComfyUI-GGUF
@nakif0968 Thank you!
Thank you for your work. Very appreciated. One additional request... Would you please make this available in the Q6K version? I've found that to be almost as good as Flux Dev, but saving a couple extra GB of VRAM that I need on my 12GB GPU.
1st THX 1st - thanks for sharing...
But I think decision to release it as a ZIP is very inconvenient. I am comparing different GGUF releases of FLUX and it is constantly A BIG PAIN in the letter "S" to check which images should be posted on which CivitAI page.
The name of ZIP archive differs from the name of GGUF inside. Why did you decide to compress the GGUF format in the first place? The compression profit is almost zero.
It's not for compression. Currently, CivitAI does not allow the upload of a .gguf file format, but it allows you to upload a .zip. Hence this workaround.
@nakif0968 I suspected that... sadness. So yet another workaround needed - rename file after extraction. But the hash of the GGUF won't match the hash of the zip on CivitAI.
OK. Sorry for blaming you...
Based on my tests, I got the best images at 768 resolution. Thanks, I really like this model!
Excellent models, thanks for your hard work and sharing
Very good but very horny model. I can't generate any women wearing bra, they're always topless.
I don't understand what "Q4_0 and Q4_1 should fit in 8 GB VRAM" means. What about the clip models that load into vram too? For me the speed is the same as with regular big models, because it can only be "loaded partially" (the clip model is loaded first and uses all the memory). Am I missing something?
Download the extra models extension and use the force set clip device to set the vae to cpu. This will save you the vram
@chrislgolden130 which extension exactly? Is this for ComfyUI? I only found nodes created by city96; they are for other models and don't support dual load. Also the vae is very small and loads after generation, so it doesn't need to be moved to cpu, because low_vram unloads models each time anyway
@waitran Sorry I meant clip not vae. its "comfyui extra models" in comfy manager.
To get Q4_1 v2 to work in SD Forge on my RTX 4060 w/ 8 Gig VRAM, I found that I needed to lower SD Forge's GPU Weight setting from the default value (7163?) to a slightly lower value or it would immediately run out of memory. After I got that figured out, it has worked very well. This checkpoint provides the capability for realistic nude figures straight out of the box with no other LoRAs needed. It also seems less finicky and much more responsive to various NSFW prompts than many other base/checkpoint flux models I've tried.
DUMB QUESTION:
sorry if this is dumb question. do i need different type of loras with gguf models? when including loras in prompt i get heavily pixeled/unusable images... or am i just lucky. :)
The same loras should work for GGUF. Tested in ComfyUI, not sure about Forge.
hmm... tried fluxunchained-dev-q8-0.gguf and get same behavior as with my model... i am using forgewebui.
@tedbiv Maybe a ForgeUI issue. You might wanna report it on their GitHub Repo
here's what i found -
GGUF Flux Models Require LoRA
Based on the provided search results, it appears that GGUF Flux models require different LoRAs (Low-Rank Adaptation adapters) compared to traditional Flux models. Here are some key findings:
Compatibility issues: Some users have reported compatibility issues when trying to use LoRAs designed for traditional Flux models with GGUF Flux models. For example, Issue #57 in the x-flux-comfyui repository mentions that a LoRA designed for Civitai works perfectly on an online service running the FP8 dev version of Flux, but not with the GGUF Flux Dev Q4_0 model.
XLabs LoRAs: The search results suggest that XLabs LoRAs are designed to work with GGUF Flux models, providing better performance and generation times. In contrast, Civitai LoRAs may not be optimized for GGUF models and may lead to slower generation times or compatibility issues.
Quantization: GGUF models are quantized, which means they have been optimized for reduced memory usage and faster inference. This requires different LoRA configurations compared to traditional Flux models, which are typically not quantized.
Node updates: Some users have reported issues with LoRAs not working with GGUF Flux models after updating nodes. For example, Issue #4674 in the ComfyUI repository mentions that updating the node to the latest version resolved issues with LoRAs not working with GGUF models.
I have the same problem: every time an image is generated, there is a lot of noise or mosaic, and I haven't even used a Lora yet. I tested Q8, Q5, and Q4, but none of them worked. However, other versions of the GGUF model work well. I suspect that there is a problem with VAE decoding, but isn't FLUX's VAE only available in the AE version?
Any update on this? I keep on getting consistent error message on ForgeUI no matter the combination I use for the other files.
anyone dial in a sampler and scheduler for best results or fast results?
I use Euler + Simple
Hi, any workflow for this? Have you tried TeaCache or WaveSpeed for Flux?
I have the same problem: every time an image is generated, there is a lot of noise or mosaic, and I haven't even used a Lora yet. I tested Q8, Q5, and Q4, but none of them worked. However, other versions of the GGUF model work well. I suspect that there is a problem with VAE decoding, but isn't FLUX's VAE only available in the AE version?
I’ve heard that some people have noise in their gens, but I have not been able to reproduce this issue. And there really shouldn't be any ambiguity regarding FLUX VAE, afaik, there's only one VAE for FLUX, the official Blackforest Lab one
set 'diffusion in low bits' to 'automatic (fp16 lora)'
@pretty_pixels Yes, it seems to be just a problem encountered by a small group of people. I have 3070T 8GB Vram, and I can actually run Flux D FP8, but GGUF is faster. Currently, only the "Flux Unchained" version has produced a lot of noise/damaged images for me. If it isn't the VAE, I really can't find the reason. Thank you again for your efforts, and I will continue to follow your work
@tedbiv
I will try,but How do we get this in comfyui, i have no idea if there is a node to select those?
oh, sorry i don't know. i use forge for images. i only use comfy for videos.
@tedbiv thx anyway :)
Why is everything I create in a comic style?
There is this new quantization technique called svdquant nunchaku, which is smaller in size and very fast, about 3x faster. Any plans to release your unchained model in this svdquant format?
So you create a model for people with low VRAM but it starts at 8GB, so if someone has 6GB we don't exist. Bruh, you're awesome could you buy me a video card? I'm poor like the human species in general.
I can't tell if you're trolling or not, but I'm gonna give you a straight answer. There is Z-image Turbo and Flux Klein 4B, which would run fine with 6gb cards. Those are newer, better models. This model is very old. Don't use it. There are better options. (And, BTW, I didn't "create" this model; that would be Black Forest Labs).
@pretty_pixels Thanks, that's very helpful. Yeah I'm a bit upset because this is like an elite community of gurus throwing an impossible amount of acronyms and technical concepts around and everything out there is outdated or pure disinfo. Not your fault of course, but I've never seen anything like this in my life. It's like you need a PhD to understand everything written on a single page anywhere. Search whatever, it's outdated or part disinfo, you have to be a genius.
@forum2233726 You can always ask an AI chatbot like Gemini, Grok, or ChatGPT to get up to speed. I still use them to learn about new models and tools.
@pretty_pixels Ok after 10 days of doing this every day I'm finally starting to get it. Yes the Q4 models do work on 6GB. Image generation can take 3-4 minutes or more on regular models and an older GPU, unless you use the Hyper 8 Lora, which is really good.
Btw, this model is still in the top 30 highest rated Fluxdev models on CivitAI. I heard Nunchaku is the latest thing. Looks like Flux 1 is starting to approach SDXL level of NSFW models but nowhere near there yet without extensive LORAs.
