After training prediction give double bounding box

Good morning,

I’m trying to train a YOLO11n model for defect detection in images coming from a manufacturing application. The problem has only one class to detect.

After training I see a strange behaviour. If I test the best produced model on the dataset, the model always returns an almost perfect detection (compared to ground truth) plus a second one that is always right next to the correct one. For example, the correct bounding box is [122, 246, 11, 10] and the fake one is [133, 246, 11, 10]. The strange thing is that the scores of the two output boxes are very similar, and the class is always the same (0 in my case, since I have only one class). The behaviour is the same on all images taken from the train, validation and test sets.

I checked all the label files and everything is correct. Any idea about the possible issue?

Regards

Please share your inference code that’s being used. Without the code it will be challenging to help diagnose the issue.

I use the code from the tutorial:

from ultralytics import YOLO

model = YOLO("file.pt")

results = model(["img.jpg"], conf=0.3)

for result in results:

    boxes = result.boxes
    print(boxes)

This is the typical output

0: 640x640 2 defects, 9.1ms
Speed: 1.0ms preprocess, 9.1ms inference, 1.2ms postprocess per image at shape (1, 3, 640, 640)
ultralytics.engine.results.Boxes object with attributes:

cls: tensor([0., 0.], device='cuda:0')
conf: tensor([0.4887, 0.4078], device='cuda:0')
data: tensor([[116.5595, 240.8561, 128.1036, 251.5495, 0.4887, 0.0000],
[128.0958, 240.5381, 139.5009, 251.5142, 0.4078, 0.0000]], device='cuda:0')
id: None
is_track: False
orig_shape: (640, 640)
shape: torch.Size([2, 6])
xywh: tensor([[122.3316, 246.2028, 11.5441, 10.6934],
[133.7984, 246.0262, 11.4050, 10.9761]], device='cuda:0')
xywhn: tensor([[0.1911, 0.3847, 0.0180, 0.0167],
[0.2091, 0.3844, 0.0178, 0.0172]], device='cuda:0')
xyxy: tensor([[116.5595, 240.8561, 128.1036, 251.5495],
[128.0958, 240.5381, 139.5009, 251.5142]], device='cuda:0')
xyxyn: tensor([[0.1821, 0.3763, 0.2002, 0.3930],
[0.2001, 0.3758, 0.2180, 0.3930]], device='cuda:0')

Thanks for the details and the printout. What you’re seeing is expected when two small, side-by-side boxes don’t overlap enough for NMS to suppress one of them. NMS filters by IoU, so if IoU≈0, both survive. Your two boxes are almost touching in x with tiny/no overlap, so standard NMS won’t remove the “twin” box. A quick primer is in our article on Non‑Maximum Suppression (NMS) explained.
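You can verify this yourself by computing the IoU of the two boxes from the `xyxy` values in your printout (a quick standalone check, not part of the ultralytics API):

```python
def iou(a, b):
    # a, b: [x1, y1, x2, y2]
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

box1 = [116.5595, 240.8561, 128.1036, 251.5495]
box2 = [128.0958, 240.5381, 139.5009, 251.5142]
print(iou(box1, box2))  # ~0.0003 — far below any NMS iou threshold
```

Since the overlap in x is only about 0.008 px, the IoU is effectively zero and NMS (which only compares boxes above its IoU threshold) has no reason to drop either one.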

Quick options:

  • If there is at most one defect per image, keep only one: results = model('img.jpg', conf=0.3, iou=0.7, max_det=1).
  • If you need multiple defects per image, post-process by merging near-duplicate boxes by center proximity (not IoU), keeping the highest‑conf in each cluster. You can replace boxes in-place with result.update(boxes=new_boxes) as shown in the Results API docs.
  • Improving localization often removes these splits: try a larger imgsz (e.g., imgsz=1024) or a slightly larger model size (YOLO11s) and verify on the latest ultralytics release.
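For the second option, here's a minimal sketch of a center-proximity merge. The 12 px threshold and the greedy clustering are assumptions to illustrate the idea, not tuned values — pick a threshold a bit larger than your typical defect size:

```python
import numpy as np

def merge_by_center(boxes_xyxy, scores, dist_thresh=12.0):
    """Greedily keep the highest-confidence box; drop any later box
    whose center lies within dist_thresh pixels of a kept box."""
    order = np.argsort(scores)[::-1]                      # highest confidence first
    centers = (boxes_xyxy[:, :2] + boxes_xyxy[:, 2:]) / 2  # box centers (x, y)
    keep = []
    for i in order:
        if all(np.linalg.norm(centers[i] - centers[j]) > dist_thresh for j in keep):
            keep.append(i)
    return [int(i) for i in keep]  # indices of surviving boxes

# Values from the printout above
boxes = np.array([[116.5595, 240.8561, 128.1036, 251.5495],
                  [128.0958, 240.5381, 139.5009, 251.5142]])
scores = np.array([0.4887, 0.4078])
print(merge_by_center(boxes, scores))  # [0] — only the 0.4887 box survives
```

You can then slice `result.boxes.data` with the returned indices and call `result.update(boxes=...)` to replace the detections in place.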

If you can share one example image + its label and your ultralytics.__version__, I can try to reproduce and suggest exact thresholds for the proximity merge.