Why does `train()` only support square resolutions?

Why is training YOLO with non-square resolutions not possible, even though it is possible to run inference on non-square resolutions?

Is this an inherent limitation of the training algorithms? From what I can tell from reading the papers, provided the dimensions are compatible with the network’s stride (32 in all cases), it should be able to train.
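To make the stride condition concrete, here is a minimal sketch of the check in plain Python. The helper names `round_up_to_stride` and `stride_compatible` are hypothetical and not part of Ultralytics; the only assumption taken from the question is a stride of 32. Note that 540 is not itself a multiple of 32, so a size like (960, 540) would need to be padded up to (960, 544).

```python
import math

STRIDE = 32  # network stride as stated in the question (assumption)


def round_up_to_stride(size: int, stride: int = STRIDE) -> int:
    """Return the smallest multiple of `stride` that is >= `size`."""
    return math.ceil(size / stride) * stride


def stride_compatible(shape, stride: int = STRIDE):
    """Round every dimension of `shape` up to a multiple of `stride`."""
    return tuple(round_up_to_stride(d, stride) for d in shape)


print(stride_compatible((960, 540)))  # (960, 544)
```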

Minimal example

from ultralytics import YOLO

model = YOLO("yolo11n.pt")

results = model.train(
    data="coco8.yaml",
    epochs=100,
    imgsz=(960, 540),
)
...
WARNING ⚠️ updating to 'imgsz=960'. 'train' and 'val' imgsz must be an integer, while 'predict' and 'export' imgsz may be a [h, w] list or an integer, i.e. 'yolo export imgsz=640,480' or 'yolo export imgsz=640'
...

Thank you

I guess it’s just not a feature worth supporting. For inference, non-square input sizes can speed things up. During training, though, mosaic augmentation is used by default. It stitches multiple images together onto a square canvas regardless of the originals' aspect ratios. Training at a non-square size would weaken mosaic augmentation, and the smaller input would also cut down how much the model learns per step, since fewer pixels contribute to the gradient.
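As a rough illustration of why mosaic ends up square no matter what goes in, here is a simplified 2x2 layout sketch in plain Python. This is not Ultralytics' actual implementation: the function `mosaic_layout` is hypothetical, it only computes where each image would be pasted, and it pins the mosaic centre to the middle of the canvas, whereas real implementations typically randomise it and crop afterwards.

```python
def mosaic_layout(image_sizes, imgsz=640):
    """Place four (height, width) images on a 2x2 mosaic canvas.

    Returns the canvas size and one (x1, y1, x2, y2) paste box per image.
    """
    if len(image_sizes) != 4:
        raise ValueError("a 2x2 mosaic needs exactly four images")

    canvas = (2 * imgsz, 2 * imgsz)  # always square, independent of the inputs
    cx = cy = imgsz  # fixed centre for simplicity (assumption of this sketch)

    boxes = []
    for i, (h, w) in enumerate(image_sizes):
        scale = imgsz / max(h, w)  # resize so the longer side equals imgsz
        h, w = round(h * scale), round(w * scale)
        if i == 0:    # top-left: bottom-right corner touches the centre
            box = (cx - w, cy - h, cx, cy)
        elif i == 1:  # top-right: bottom-left corner touches the centre
            box = (cx, cy - h, cx + w, cy)
        elif i == 2:  # bottom-left: top-right corner touches the centre
            box = (cx - w, cy, cx, cy + h)
        else:         # bottom-right: top-left corner touches the centre
            box = (cx, cy, cx + w, cy + h)
        boxes.append(box)
    return canvas, boxes


# Landscape, portrait, and 4:3 inputs all land on the same square canvas.
canvas, boxes = mosaic_layout([(540, 960), (1080, 1920), (960, 540), (480, 640)])
print(canvas)  # (1280, 1280)
```

Whatever the aspect ratios of the four inputs, the canvas stays square, which is why a rectangular training size does not fit naturally with this augmentation.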
