The need for specific bbox formatting

rriedel · July 1, 2025, 6:20pm

I have a general question on object detection in computer vision. As you all know, detection requires images and associated text files for labels. If you train a model that you construct yourself from scratch (no transfer learning involved), does the format of the labels matter? In other words, does it matter whether you use x1/x2/y1/y2, x1/y1/x2/y2, center/width/height, or any other format? My understanding is that as long as the format is consistent and the coordinates reflect the object's location, the format used to label objects on an image will not matter. Please clarify. Thank you so much,

Ralf

BurhanQ · July 2, 2025, 10:59am

There is a specific YOLO format, which you can read more about in the docs.

The annotations in the text file must give each bounding box in coordinates normalized to the image dimensions, one object per line:

class x_center y_center width height
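As a rough illustration of the label line above, here is a minimal Python sketch that turns a pixel-space corner box into that layout. The function name `to_yolo_line`, the six-decimal precision, and the sample image size are assumptions made for this example, not part of any library.

```python
def to_yolo_line(class_id, x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel-space corner box (x1, y1, x2, y2) into a
    'class x_center y_center width height' line with values in [0, 1].

    Hypothetical helper for illustration only.
    """
    # Box center, divided by the image size to normalize.
    x_center = (x1 + x2) / 2 / img_w
    y_center = (y1 + y2) / 2 / img_h
    # Box extent, normalized the same way.
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Example: class 0, box from (100, 200) to (300, 400) in a 640x480 image.
line = to_yolo_line(0, 100, 200, 300, 400, img_w=640, img_h=480)
print(line)  # 0 0.312500 0.625000 0.312500 0.416667
```

Dividing x values by the image width and y values by the image height is what makes the numbers independent of image resolution.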

rriedel · July 2, 2025, 1:26pm

Right, for YOLO. What I meant was for any other model. It seems to me that if you create a model that accepts some consistent coordinate representation of an object’s location, the model will learn what the object is after proper training. Am I correct? Thx,

Ralf

BurhanQ · July 3, 2025, 1:02am

I understand. The bounding box format varies by model, and which one is needed or works best may depend on the structure of the model. In all likelihood, the coordinates will need to be normalized before going into the model for training.
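To make the point about formats concrete, here is a small Python sketch, under the assumption that boxes are plain 4-tuples, showing that corner and center formats carry the same information and can be converted back and forth, plus a simple normalization step. The function names are hypothetical and chosen only for this example.

```python
def xyxy_to_cxcywh(box):
    """(x1, y1, x2, y2) -> (center_x, center_y, width, height)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


def cxcywh_to_xyxy(box):
    """(center_x, center_y, width, height) -> (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def normalize(box, img_w, img_h):
    """Scale a box to [0, 1]. Works for either layout above, because
    positions 0 and 2 are x-like values and positions 1 and 3 are y-like."""
    a, b, c, d = box
    return (a / img_w, b / img_h, c / img_w, d / img_h)


corners = (100.0, 200.0, 300.0, 400.0)
centered = xyxy_to_cxcywh(corners)      # (200.0, 300.0, 200.0, 200.0)
restored = cxcywh_to_xyxy(centered)     # back to the original corners
print(centered, restored == corners)
print(normalize(corners, img_w=640, img_h=480))
```

Because the conversion is lossless, the choice of format is mostly about what a given model's output head and loss function expect, as long as the same format is used consistently for labels and predictions.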
