I am conducting experiments using the YOLOv8m model for small-object detection on medical images. Despite adding a P2 detection head, the performance of one specific class remains low.
Interestingly, this class has a higher number of instances compared to other classes, yet its detection performance is still poor.
I would appreciate any suggestions on architectural modifications, training strategies, or hyperparameter adjustments that could help improve detection performance for this class.
The other aspect that will impact small-object detection is the quality of the ground-truth annotations. Looking at the example you posted, I would suspect that the ground-truth bounding boxes might be "loose," meaning they are not tight to the object boundary. I say this because I have done detection projects similar to yours and observed the same type of model performance on small objects. I recommend comparing the ground truth against the model predictions directly and verifying that the ground-truth boxes are tight against the object boundary (no background or other classes between the object border and the box edges). Additionally, you may consider using a segmentation model, as it lets you define the perimeter of each object more precisely.
You mean I should make a prediction and then compare it with the ground truth?
In addition to that, I want to clarify: what do you mean by loose boxes? These species are very tiny, so in theory making the bounding boxes a bit bigger might increase the chances of them being detected. In terms of loss, parameters, or metrics, are there any changes I can make that might improve my results?
Yes — doing a quick qualitative check by running predict()/val() and visually comparing the rendered predictions vs your ground-truth is one of the fastest ways to diagnose why a single class is underperforming (mislabels, class confusion, inconsistent box style, objects too tiny after resize, etc.). The val run also saves example batches with GT + preds side-by-side in runs/detect/val*, which is ideal for this.
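As a concrete sketch of that check (the weights and image paths below are placeholders; substitute your own):

```python
from ultralytics import YOLO

# Placeholder path -- point this at your own trained weights.
model = YOLO("runs/detect/train/weights/best.pt")

# val() computes per-class metrics and saves GT-vs-prediction batch
# composites under runs/detect/val*/ for visual inspection.
metrics = model.val(conf=0.10, plots=True)
print(metrics.box.maps)  # per-class mAP50-95 -- check your weak class here

# predict() with save=True writes rendered predictions you can compare
# side-by-side against your labeled images.
model.predict("datasets/parasites/images/val", conf=0.10, save=True)
```

Scanning the saved val composites for the underperforming class usually reveals whether the problem is missed detections, class confusion, or box-quality issues.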
By “loose boxes” we mean boxes that include a lot of background around the object (or are inconsistently sized/centered across images). For tiny targets this usually hurts training: the model is asked to regress a box whose pixels are mostly background, so the visual signal becomes weaker, and localization gets noisy. Making boxes bigger typically doesn’t increase detectability; it just teaches the model that “background belongs to the object,” and it can reduce IoU-based metrics even when the model finds the object.
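To make the IoU point concrete, here is a small self-contained sketch (the coordinates are made up for illustration): a near-perfect prediction on a 10Ă—10 px object scores well above the usual 0.5 matching threshold against a tight ground-truth box, but the same prediction against a box padded by 5 px on each side drops to 0.25 and would be counted as a miss.

```python
def iou(a, b):
    """Intersection-over-union for boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

tight_gt = (100, 100, 110, 110)  # 10x10 px parasite, tight box
pred     = (101, 101, 111, 111)  # near-perfect prediction, 1 px offset
loose_gt = (95, 95, 115, 115)    # same object, box padded by 5 px per side

print(round(iou(pred, tight_gt), 2))  # 0.68 -- counted as a correct detection
print(round(iou(pred, loose_gt), 2))  # 0.25 -- counted as a miss at IoU 0.5
```

So enlarging the boxes does not make the object easier to find; it just penalizes good predictions in the IoU-based metrics.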
If you want levers that actually move small-object performance, imgsz is the big one (as mentioned), and the next most effective is keeping more native detail by cropping/tiling images into patches so each object occupies more pixels. For reading per-class metrics and failure modes in a structured way, see the Ultralytics guide Insights on model evaluation and fine-tuning; for small-object-specific intuition, see Exploring small object detection with YOLO11.
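A minimal tiling sketch (pure Python; it assumes the image is at least as large as the tile size): it computes overlapping 640-px windows so that objects near tile borders appear whole in at least one tile. You would then crop the image and remap the label coordinates per window.

```python
def tile_windows(width, height, tile=640, overlap=0.2):
    """Return (x0, y0, x1, y1) crop windows covering the image with overlap,
    so that objects cut by one tile border appear whole in a neighbor tile."""
    stride = int(tile * (1 - overlap))
    # Regular grid of offsets, plus an extra window flush with the far edge
    # so the right/bottom strips are always covered.
    xs = sorted(set(list(range(0, max(width - tile, 1), stride)) + [max(width - tile, 0)]))
    ys = sorted(set(list(range(0, max(height - tile, 1), stride)) + [max(height - tile, 0)]))
    return [(x, y, x + tile, y + tile) for y in ys for x in xs]

# Example: a 2000x1500 source image tiled into 640-px patches
windows = tile_windows(2000, 1500)
print(len(windows))   # 12 overlapping tiles
print(windows[-1])    # (1360, 860, 2000, 1500) -- flush with the far corner
```

At inference time you either tile the same way and merge detections across windows (e.g. with NMS on the stitched coordinates), or use a slicing-inference library such as SAHI that handles the merge for you.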
If you can switch models, Ultralytics YOLO26 is the best default for this in 2026 and includes explicit small-target improvements (it’s a meaningful upgrade over older YOLO versions for tiny objects). A minimal starting point:
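A minimal starting point might look like the following (a sketch, not a definitive recipe: the checkpoint name, dataset YAML, and hyperparameters are placeholder assumptions; check the Ultralytics docs for the exact YOLO26 weight names):

```python
from ultralytics import YOLO

# "yolo26m.pt" is an assumed checkpoint name -- verify against the docs.
model = YOLO("yolo26m.pt")

model.train(
    data="parasites.yaml",  # your dataset config (placeholder name)
    imgsz=1280,             # larger input keeps more pixels per tiny object
    epochs=100,
    batch=8,                # reduce if 1280 px images don't fit in VRAM
)
```

The key choice here is the larger imgsz: at 640 px a parasite that is a handful of pixels wide may vanish entirely after downsampling, whereas 1280 px preserves roughly 4Ă— the pixel area per object.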
If you share one val batch image (the saved GT+pred composite) and your object sizes in pixels after resizing to your current imgsz, I can suggest whether tiling vs larger imgsz is the better next step for your specific case.
I performed a qualitative comparison using different confidence thresholds. In the first experiment (conf = 0.10), the predicted bounding boxes (blue) and ground-truth bounding boxes (green) are shown. A larger number of small parasites were successfully detected at lower confidence thresholds.
The model architecture was modified by incorporating a P2 detection head while removing the P5 head, and GhostConv and C3Ghost were employed to reduce computational complexity. All experiments were conducted with an input resolution of 640 × 640. If I increase the confidence threshold, small parasites are not detected at all.