This is a detection model trained on high heels; it helps improve their appearance.
This model is based on the YOLOv8m model and was trained on around 500 images. I'm interested to see how everyone uses it. In ADetailer you should play around with the denoising strength, as the default setting doesn't do much. Just unpack the zip, put the file into the models\adetailer folder, and restart your UI. It should then appear in the ADetailer dropdown menu.
The settings I use most of the time:
In Settings I set (Strict) SDXL Only, which means the detected box is zoomed to the canvas size set for generation (for example 1024x1024 or 1216x832). That gives far more detail during inpainting.
If I use a denoising strength above 0.45, I place one of my LoRAs in the positive prompt of ADetailer, together with a short description of what I want to see, without any quality tags.
Description
Handpicked Images from all around.
Resized or cropped to 640x640
Trained with Ultralytics on a 4090 with a batch size of 20 for 300 epochs on a YOLOv8m model
Comments (13)
Hello.
1) So this model will "detect" heels instead of hands or faces like other models, if I get this right?
2) How does someone "train" a YOLO model? I have already seen tutorials on how to train LoRAs, checkpoints, etc., but I had no idea we could train the ADetailer models. Could you tell me more, please?
Yes, you are right. Normally you use this model in conjunction with ADetailer!
It mainly detects the shoes, but also feet because of the open design of some high heels. The model only marks the area that will be zoomed for additional inpainting, so it doesn't really matter what is inside the marked frame; everything inside gets a second upscaling pass.
In short, it just marks (in this case) a shoe so ADetailer can crop and resize it; then Stable Diffusion does its magic on the bigger cropped image. After that, ADetailer blends the crop back into the image at the correct size.
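The crop, zoom, and paste-back steps can be illustrated with a little box arithmetic. This is only a rough sketch, not ADetailer's actual code: the function names, the 32-pixel padding, and the example box are assumptions made up for illustration.

```python
def padded_crop(box, image_size, padding=32):
    """Grow a detected box by `padding` pixels, clamped to the image."""
    x1, y1, x2, y2 = box
    width, height = image_size
    return (
        max(0, x1 - padding),
        max(0, y1 - padding),
        min(width, x2 + padding),
        min(height, y2 + padding),
    )


def zoom_factor(crop, canvas_size):
    """Scale that fits the crop inside the canvas while keeping its aspect ratio."""
    crop_w = crop[2] - crop[0]
    crop_h = crop[3] - crop[1]
    canvas_w, canvas_h = canvas_size
    return min(canvas_w / crop_w, canvas_h / crop_h)


# Hypothetical detection: a 160x128 px shoe box in a 1024x1024 image.
crop = padded_crop((700, 850, 860, 978), (1024, 1024))
scale = zoom_factor(crop, (1024, 1024))
print(crop, round(scale, 2))
```

The small region is inpainted at roughly `scale` times its original size and then shrunk back before blending, which is where the extra detail comes from.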
For the detection itself, you can train your own model, in this case based on an Ultralytics YOLOv8 m-sized detection model.
The training itself is not the problem. The tedious work is assembling, cropping, and preparing the dataset.
I cropped around 500 images by hand to 640x640 (I think), then placed bounding boxes in every image with the annotation tool. The training after that took around 45 minutes and stopped automatically.
https://docs.ultralytics.com/modes/train/#train-settings
If you're interested, you should check that page.
The command I used was relatively simple and landed spot on.
Not every pose works, but most of the time it's fine.
@nobbikr That's so cool. To be honest, I am used to following visual tutorials that teach you how to train (for example DreamBooth or FluxGym): I simply have to prepare the dataset, caption it, then press buttons (change learning rate or batch size and whatnot), and the rest is straightforward. When I opened your link, all I saw was a bunch of text and tables. Is there a "program" similar to DreamBooth and FluxGym, I mean a graphical interface? Or is it... wait, I went clicking on the main site and I see a "quick start" with some video. I see it might be CLI based (meaning you do it all with terminal commands)? I can do it as long as there is a video explaining everything. OK, I will do more research; maybe there are some videos out there. I want to ask you about other stuff, though. You said captioning and RESIZING was the most difficult; could you share the dataset, or at least a part of it (like 10 images)? I mean the technical stuff that people might find very useful and cannot find elsewhere. I am honestly shocked to learn about this; this is the first time I see this lol.
@SafetyAction The dataset consisted of many different images of people wearing different kinds of high heels. The best source for me was AliExpress :D
You need to install:
pip install torch torchvision torchaudio <-- you need to choose the correct version for your system
pip install ultralytics
The Base command was:
yolo train model=yolov8n.pt data=path/to/your/data.yaml epochs=100 imgsz=640
I used the basic configuration without changing a lot.
That's all.
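For anyone who prefers Python over the command line, roughly the same training run can be started through the Ultralytics Python API. This is a sketch, not the author's exact script: the values are taken from the model description (YOLOv8m, 300 epochs, batch of 20, 640x640 images), and the data.yaml path is the same placeholder as in the base command.

```python
# Settings taken from the model description; adjust them to your hardware.
TRAIN_SETTINGS = {
    "epochs": 300,  # upper limit; Ultralytics can stop earlier (see `patience`)
    "imgsz": 640,
    "batch": 20,
}


def train(data_yaml="path/to/your/data.yaml", weights="yolov8m.pt"):
    """Fine-tune a pretrained YOLOv8 detection model on a custom dataset."""
    # Imported here so the file can be loaded without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # downloads the pretrained weights if they are missing
    return model.train(data=data_yaml, **TRAIN_SETTINGS)
```

Calling `train()` starts the run. Early stopping is controlled by the `patience` train setting, which would explain a run that ends on its own before the last epoch.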
The tools you need
LabelImg: https://blog.roboflow.com/labelimg/
Images, ideally 640x640
Important:
YOLO needs a training set of images and a validation set.
https://learnopencv.com/train-yolov8-on-custom-dataset/
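One way to produce the training and validation sets is a small script that shuffles the annotated images and copies each image with its label file into the folder layout Ultralytics expects, then writes the data.yaml that the training command points to. This is a sketch under assumptions: the folder names, the 80/20 split, the `.jpg` extension, and the class name `high_heel` are illustrative, not the author's setup.

```python
import random
import shutil
from pathlib import Path


def split_dataset(source, target, val_fraction=0.2, seed=0):
    """Copy image/label pairs from `source` into train and val folders."""
    source, target = Path(source), Path(target)
    # Keep only images that have a YOLO label file next to them.
    images = sorted(p for p in source.glob("*.jpg") if p.with_suffix(".txt").exists())
    random.Random(seed).shuffle(images)
    val_count = int(len(images) * val_fraction)
    splits = {"val": images[:val_count], "train": images[val_count:]}
    for split, files in splits.items():
        image_dir = target / "images" / split
        label_dir = target / "labels" / split
        image_dir.mkdir(parents=True, exist_ok=True)
        label_dir.mkdir(parents=True, exist_ok=True)
        for image in files:
            shutil.copy(image, image_dir / image.name)
            shutil.copy(image.with_suffix(".txt"), label_dir / (image.stem + ".txt"))
    return {split: len(files) for split, files in splits.items()}


def write_data_yaml(target, class_names):
    """Write the dataset description that `data=` points to."""
    target = Path(target)
    target.mkdir(parents=True, exist_ok=True)
    lines = [
        f"path: {target.resolve().as_posix()}",
        "train: images/train",
        "val: images/val",
        "names:",
    ]
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    yaml_path = target / "data.yaml"
    yaml_path.write_text("\n".join(lines) + "\n")
    return yaml_path
```

A typical call would be `split_dataset("annotated", "dataset")` followed by `write_data_yaml("dataset", ["high_heel"])`, where both folder names are placeholders.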
Sadly I don't have the dataset anymore; I was a bit stupid and deleted the folder with the prepared script.
I didn't have the motivation to redownload everything 😅
And the dataset preparation is tedious; you need a lot of images for training.
@nobbikr It's so sad that you deleted them, because that would have been so informative (for others ^^). You could have made a video to preserve the info for your future self when you revisit it. While the memory is still fresh, what do you think of redoing, let's say, 2-3 images, that's it? Just to retrain your mind lol, and so I can see how you prepare them haha. Thanks a lot for the message with instructions, I am saving it now!
@SafetyAction I was using this tutorial: https://civitai.com/articles/4080
It should have everything you need to know, incl. some example images.
I was thinking about training a YOLO model on "special" clothing, but the dataset needed would be around 10,000 images, which have to be annotated manually. That would be months of tedious work.
@nobbikr Wow. Maybe then we should work as a team? I also have some ideas. If you could show me how you annotate/prepare one or two images, I can get going. I will check the link you provided now.
@SafetyAction I did nothing special; he describes everything, incl. the annotation.
It's simple. Open the image, put a frame around the object you want to detect, and save everything in the correct folders.
This is an example of what the annotation looks like.
With the annotator program, two files are created: the first is the image with this frame, the second is a text file with the coordinates.
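Assuming the annotation tool is set to save in YOLO format, each line of that text file describes one box as `class x_center y_center width height`, with all four coordinates divided by the image size so they fall between 0 and 1. The helper below is a hypothetical illustration of that conversion; the function name and the example numbers are not from the original workflow.

```python
def to_yolo_line(class_id, box, image_size):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A box from (160, 320) to (480, 640) in a 640x640 image, class 0.
print(to_yolo_line(0, (160, 320, 480, 640), (640, 640)))
# prints: 0 0.500000 0.750000 0.500000 0.500000
```

An image with several shoes simply gets one such line per box in the same text file.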
@nobbikr Just make sure not to delete your account; I will be back to ask questions when I get into training this lol. (I have other things to finish first.)
@SafetyAction Nah, that won't happen in the near future :D
There's a "Detection" model type now you could change this to, so people can find it easier. I'm commenting this on most Yolo models that are still stuck in "Other".
Thanks, then I will move it 👍
@nobbikr I'm just glad we finally got a section for these models; it's an important category. Thanks for your contribution!

