Full Custom YOLO Detection Model
Model Description
This model is a custom-trained YOLO object detection model for multi-class detection and segmentation tasks on a specialized dataset.
It is trained for fine-grained object detection using bounding box annotations across multiple classes.
NSFW
breast
vulva
butt
penis
anal
vaginal
blowjob
handjob
Custom
face
nipple
mouth
eyes
navel
anus
Intended Use
Object detection on NSFW datasets
Bounding box classification for custom classes
Limitations
Trained on a custom dataset in which all ten classes were annotated with bounding boxes by hand.
Performance may degrade on unseen domains or distributions; however, in testing on 10k images from outside the training database, the error rate was less than 4%.
Evaluation Results
Overall Metrics
Precision: 0.858
Recall: 0.808
mAP@50: 0.898
mAP@50-95: 0.6156
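The precision and recall above can be folded into a single F1 score, which is sometimes easier to compare across model versions. The following is a minimal sketch; the helper name `f1_score` is illustrative and not part of the model or any library.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall precision and recall as reported above.
overall_f1 = f1_score(0.858, 0.808)
print(f"Overall F1: {overall_f1:.3f}")
```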
Per-Class Evaluation Results
============================
Class | Images | Instances | Precision | Recall | mAP50 | mAP50-95
all | 613 | 1683 | 0.858 | 0.809 | 0.898 | 0.616
person | 233 | 256 | 0.829 | 0.902 | 0.928 | 0.762
breast | 286 | 298 | 0.910 | 0.884 | 0.964 | 0.642
vulva | 165 | 166 | 0.874 | 0.777 | 0.873 | 0.495
butt | 127 | 131 | 0.848 | 0.771 | 0.895 | 0.596
male | 102 | 108 | 0.865 | 0.474 | 0.718 | 0.553
penis | 237 | 269 | 0.824 | 0.855 | 0.903 | 0.577
anal | 145 | 147 | 0.936 | 0.905 | 0.957 | 0.704
vaginal | 182 | 184 | 0.888 | 0.903 | 0.952 | 0.619
blowjob | 36 | 36 | 0.779 | 0.778 | 0.869 | 0.603
handjob | 73 | 88 | 0.830 | 0.841 | 0.925 | 0.603
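As a sanity check on the table above, the sketch below parses the pipe-delimited rows, recomputes the unweighted per-class means, and compares them with the "all" row. It assumes the "all" row is a plain mean over the ten classes; the helper name `parse_table` is illustrative only.

```python
TABLE = """\
Class | Images | Instances | Precision | Recall | mAP50 | mAP50-95
all | 613 | 1683 | 0.858 | 0.809 | 0.898 | 0.616
person | 233 | 256 | 0.829 | 0.902 | 0.928 | 0.762
breast | 286 | 298 | 0.910 | 0.884 | 0.964 | 0.642
vulva | 165 | 166 | 0.874 | 0.777 | 0.873 | 0.495
butt | 127 | 131 | 0.848 | 0.771 | 0.895 | 0.596
male | 102 | 108 | 0.865 | 0.474 | 0.718 | 0.553
penis | 237 | 269 | 0.824 | 0.855 | 0.903 | 0.577
anal | 145 | 147 | 0.936 | 0.905 | 0.957 | 0.704
vaginal | 182 | 184 | 0.888 | 0.903 | 0.952 | 0.619
blowjob | 36 | 36 | 0.779 | 0.778 | 0.869 | 0.603
handjob | 73 | 88 | 0.830 | 0.841 | 0.925 | 0.603
"""


def parse_table(text):
    """Parse a pipe-delimited table into {class_name: {column: value}}."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = [h.strip() for h in lines[0].split("|")]
    rows = {}
    for ln in lines[1:]:
        cells = [c.strip() for c in ln.split("|")]
        rows[cells[0]] = {k: float(v) for k, v in zip(header[1:], cells[1:])}
    return rows


rows = parse_table(TABLE)
per_class = {name: r for name, r in rows.items() if name != "all"}

# Unweighted (macro) mean of each metric across the ten classes.
metrics = ("Precision", "Recall", "mAP50", "mAP50-95")
macro = {m: sum(r[m] for r in per_class.values()) / len(per_class) for m in metrics}

# Class with the lowest recall, i.e. the one most often missed.
weakest = min(per_class, key=lambda name: per_class[name]["Recall"])

for m in metrics:
    print(f"{m}: macro mean {macro[m]:.4f}, reported {rows['all'][m]:.3f}")
print("Lowest recall:", weakest, per_class[weakest]["Recall"])
```

The recomputed means agree with the "all" row to within rounding, and the lowest-recall class stands out as the one most likely to need more training examples.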
Comments (5)
Is there a tutorial on what to do with this?
The GUI works for auto-cropping during dataset prep, but most people would benefit more from using the model in ComfyUI or Forge. I have reached out to ADetailer; maybe they will incorporate it.
A really big work for sure. But what about teaching SAM3 to handle NSFW concepts? Do you think this is possible? Those YOLO rectangles aren’t much help - we're inpainting segments, not rectangles, after all.
ADetailer uses the YOLO boxes; it just applies dilation or erosion with a given kernel size, using PIL, to blend the boxes out. In any case, you have to respect the minimum render size that the model can generate: 32x32 or 64x64.
Some models like QWEN have their own internal smart object in-painting, where you can tell the LLM to move something left or right; this relies on internal rather than external object detection.
SAM3 is two orders of magnitude larger than YOLO-S (100x larger); a full finetune with a reasonable batch size requires 6 A100s.
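To illustrate the box dilation and minimum render size mentioned in the reply above, here is a minimal sketch that works on box coordinates only. It is not ADetailer's actual code (which, per the reply, operates on a mask with PIL); the function name `expand_box` and its parameters `dilate` and `min_side` are hypothetical, and integer pixel coordinates are assumed.

```python
def expand_box(box, image_size, dilate=4, min_side=64):
    """Grow (or shrink, if dilate < 0) an (x1, y1, x2, y2) box, then
    enforce a minimum side length and keep the box inside the image."""
    x1, y1, x2, y2 = box
    width, height = image_size

    # Dilation pushes every edge outward; a negative value erodes instead.
    x1, y1, x2, y2 = x1 - dilate, y1 - dilate, x2 + dilate, y2 + dilate

    def grow(lo, hi, limit):
        # Pad symmetrically up to the minimum side length.
        side = hi - lo
        if side < min_side:
            pad = min_side - side
            lo -= pad // 2
            hi += pad - pad // 2
        # Shift the interval back inside [0, limit] without shrinking it.
        if lo < 0:
            hi -= lo
            lo = 0
        if hi > limit:
            lo -= hi - limit
            hi = limit
        return max(lo, 0), hi

    x1, x2 = grow(x1, x2, width)
    y1, y2 = grow(y1, y2, height)
    return x1, y1, x2, y2


# A small detection in a 512x512 image is padded out to 64x64.
print(expand_box((100, 100, 120, 130), (512, 512)))
```

Shifting the box instead of clipping it keeps the region at the minimum size even when the detection sits at the image border.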










