Out of bounds coordinates

Hi all,

I’m having issues understanding how YOLO deals with coordinates and image sizes, and as a consequence I’m getting an out-of-bounds error. Below I describe the problem, what I’ve done to try to fix it, and my understanding of the relationship between image size and coordinates.

I am training an object detection model with 129 classes and close to 30,000 images of 1920x1200 pixels. Each image was labeled with x1, y1, x2, and y2 coordinates. After creating my yaml file and distributing the images/labels into their directories (train/validation/test), I started training and got the error quoted below. Notice that nearly all of my images are flagged as corrupt (I’m only showing a subset of the output here), which probably means they all hit the out-of-bounds issue.

After reading up on potential issues, I transformed all labels to the center-x, center-y, width, height format and got the same error message. Then I transformed the labels into the normalized center-x, center-y, width, height format. Same error. I then fed the model the original image size (imgsz=[1920, 1200]) and still got the same error. Below I provide the labels for one sample image in each of the formats above.

What I understand about image size and box coordinates is that the transformations depend on the image size provided. If no size is provided when transforming coordinates, the width is scaled to 640 and the height to whatever value preserves the aspect ratio. Also, when feeding an image to the model, it gets resized to the values you provide. So, if you format the coordinates based on one image size and feed the model images of some other size, the model will at best produce inaccurate results and at worst an out-of-bounds error. Is my understanding correct? I also believed that the YOLO label format was x1, y1, x2, y2, but I read today that the model requires the center-x, center-y, width, height format. Did I read this correctly?
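In case it helps, this is roughly how I produced the normalized labels I show further down (a simplified sketch, not my exact code):

# Simplified sketch of my conversion from x1, y1, x2, y2 to "normalized"
# center-x, center-y, width, height. I divided everything by 640 based on my
# understanding of the default size described above.
def to_norm_xywh(x1, y1, x2, y2, size=640):
    cx = (x1 + x2) / 2 / size
    cy = (y1 + y2) / 2 / size
    w = (x2 - x1) / size
    h = (y2 - y1) / size
    return cx, cy, w, h

print(to_norm_xywh(1414, 569, 1556, 682))  # (2.3203125, 0.97734375, 0.221875, 0.1765625)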

So, my request is for more information and, ideally, a workaround for my issue. If you can guide me further, or point me to a resource, link, or video that explains this in more detail, that would help a lot. Thank you indeed,

Ralf

Error message:

WARNING updating to ‘imgsz=640’. ‘train’ and ‘val’ imgsz must be an integer, while ‘predict’ and ‘export’ imgsz may be a [h, w] list or an integer, i.e. ‘yolo export imgsz=640,480’ or ‘yolo export imgsz=640’

train: Scanning D:\Dev\FishCam\data\train\labels… 1163 images, 0 backgrounds, 1067 corrupt: 100% | 1163/1163 [00:02<00:00, 406.60it/s]
train: WARNING D:\Dev\FishCam\data\train\images\R167_Dermatolepis_inermis0001.png: ignoring corrupt image/label: non-normalized or out of bounds coordinates [2.3203125 1.9875 1.1929687]

x1, y1, x2, and y2 coordinates for one sample image (first column is the class):
14 1414 569 1556 682
74 1087 455 1457 1072
113 234 473 709 647

center-x, center-y, width, height for the same sample image:
14 1485.0 625.5 142.0 113.0
74 1272.0 763.5 370.0 617.0
113 471.5 560.0 475.0 174.0

Normalized center-x, center-y, width, height for the same sample image:
14 2.32 0.98 0.22 0.17
74 1.99 1.19 0.57 0.96
113 0.74 0.87 0.74 0.27

Probably because you haven’t deleted the labels.cache file generated inside your dataset folder, so it’s still loading the old coordinates. You should delete it every time you update the labels.
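Something like this (just a sketch; adjust the path to your dataset root) would clear them before retraining:

from pathlib import Path

# Sketch: delete any cached label files so the next training run rescans the
# label .txt files. The dataset root below is taken from the paths in your log.
dataset_root = Path(r"D:\Dev\FishCam\data")
for cache_file in dataset_root.rglob("*.cache"):
    print(f"removing {cache_file}")
    cache_file.unlink()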

Let me do that. But, on a different note, what is YOLO’s coordinate format? Is it x1, y1, x2, y2; center-x, center-y, width, height; or normalized center-x, center-y, width, height? I thought it was the first. Thx,

Ralf

Nope. Not the labels.cache problem you mentioned. The issue still persists after I delete the file prior to training. Any other possibility? Thx so much,

Ralf

It’s normalized. The format is described in the docs.
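For example, assuming your images are 1920 wide by 1200 high, the first box of your sample image (14 1414 569 1556 682) would become the label line

14 0.773438 0.521250 0.073958 0.094167

i.e. the class id followed by center-x, center-y, width, height, with the x values divided by the image width and the y values by the image height.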

What’s the error now? Can you share the error?

You list your images as having dimensions (1920, 1200) and are using imgsz=[1920, 1200] during training. The x1, y1, x2, y2 example you give shows the largest x value to be 1556, which is greater than 1200 and could mean that you have your dimensions backwards: when specifying image size as a list, the first number is height and the second is width. That said, I don’t think the issue is with the training argument.

Looking at the converted and normalized coordinates you posted, the x-centers for the first and second annotations are > 1.0, which will throw an error and points to the axes possibly being backwards in the annotations. All normalized coordinates should be in the range [0.0, 1.0]. I don’t know how it happened, but something is not correct with the coordinates you shared, because 1485.0 / 2.32 ≈ 640 and the same holds for the other center values, which suggests the labels were normalized by 640 rather than by the actual image dimensions (a quick check of this is sketched after the table below). Assuming the dimensions of the images are 1920 x 1200 (w x h), the expected values for the first annotation would be:

x-center   y-center   box-width   box-height
1485.0     625.5      142.0       113.0
0.7734     0.5212     0.0739      0.0942
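A quick check of the 640 observation, dividing the pixel centers you posted by the corresponding normalized centers:

# Each ratio lands near 640, so the labels look like they were divided by 640.
for pixel_center, normalized_center in [(1485.0, 2.32), (1272.0, 1.99), (471.5, 0.74)]:
    print(pixel_center / normalized_center)  # ~640.1, ~639.2, ~637.2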

Double check your coordinates and then your conversion calculations, as it appears this could be the reason why you’re receiving an out of bounds error. You can also see this page for some of the Ultralytics built-in functions for box coordinate conversions; specifically, the xyxy2xywhn function would be of interest. This is how you could use it:

import numpy as np
from ultralytics.utils.ops import xyxy2xywhn

# First box of the sample image as x1, y1, x2, y2 (use a float array so the
# normalized results are not truncated to integers)
x = np.array([1414, 569, 1556, 682], dtype=np.float32)
normalized_bbox = xyxy2xywhn(x, w=1920, h=1200, clip=False, eps=0.0)
print(normalized_bbox)  # approx [0.7734 0.5213 0.0740 0.0942]
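You could also convert all three boxes from your sample image in one call by passing an (N, 4) array, roughly:

import numpy as np
from ultralytics.utils.ops import xyxy2xywhn

# All three x1, y1, x2, y2 boxes from the sample image, converted at once.
boxes = np.array(
    [[1414, 569, 1556, 682],
     [1087, 455, 1457, 1072],
     [234, 473, 709, 647]],
    dtype=np.float32,
)
print(xyxy2xywhn(boxes, w=1920, h=1200))
# roughly [[0.7734 0.5213 0.0740 0.0942]
#          [0.6625 0.6363 0.1927 0.5142]
#          [0.2456 0.4667 0.2474 0.1450]]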