Hello everyone,
I’m currently training a YOLOv8 model using the following setup:
from ultralytics import YOLO
DATASET_DIR_PATH = 'path/to/dataset/root'
DATA_YAML_PATH = f"{DATASET_DIR_PATH}/data.yaml"
# Load Model
model = YOLO('yolov8n.pt')
# Train the model
results = model.train(
    data=DATA_YAML_PATH,
    epochs=50,
)
And here’s my data.yaml configuration:
names:
- cls1
- cls2
nc: 2
train: /path/to/train/images
val: /path/to/valid/images
My images are in a 1920x1080 resolution, and I need to train the model on images that are resized to a 1:1 aspect ratio (stretched). I am aware that the v8_transforms function can achieve this with the stretch=True parameter, but I’ve been facing issues when setting up a custom dataset to accommodate this.
Is there a more straightforward way to apply this resizing directly during the training process without modifying the actual dataset? Any tips or alternative methods that could be used directly in the training script would be greatly appreciated!
Thanks in advance for your help!
Just to be clear, you’re looking to have your model trained using images that are 1920 x 1920, where the image is stretched instead of padded (to maintain aspect ratio)? There is no stretch argument available for the training method, but I think what you mean is that you’re manually doing something with the LetterBox class.
It’s going to be helpful to understand why; Advice on asking for Support was written to explain the reason for asking.
I’m not sure why you’d want to train on stretched images. If the images aren’t stretched like this in the final application, then you should not train using stretched images. If the images are stretched in your final application, then it raises additional questions:
- Why not correct the stretching?
- How are the images you’re using for training not stretched? If you have a custom application you need to train for, collecting images from where the model will be deployed is going to be the best way to train the model.
The training code will resize and pad your images as needed for training and inference. This is done automatically, so there’s nothing extra to do here for model compatibility’s sake. All of this comes back to knowing why you’re looking to do this specifically.
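For example (a quick illustration, assuming a hypothetical 1920x1080 image named frame.jpg), inference on a full-resolution image just works, because the letterboxing to imgsz happens internally:

from ultralytics import YOLO

model = YOLO('yolov8n.pt')
# The 1920x1080 frame is letterboxed (resized + padded) to 640x640
# internally before it reaches the network; no manual resizing needed.
results = model.predict('frame.jpg', imgsz=640)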
Hi Burhan,
Thanks for your quick response.
I am training a YOLOv8n object detection model to work with a DepthAI OAK-D Pro device. I want to run onboard inference and processing, but due to some hardware limitations of the onboard pipeline I am stuck with two options:
- Inference on cropped-in images (loss of field of view)
- Inference on stretched images (a trade-off in accuracy, but I get a larger field of view, which is important for my application)
[It is possible to run letterboxed inference onboard after an image manipulation, but that breaks part of the onboard functionality of my pipeline (depth estimation) and hurts the real-time performance, so I am effectively stuck with the two options above.]
This is why I am trying to train a model on stretched images, since that is also going to be my inference setting.
What I want to achieve is having the 1920x1080 images resized (stretched along the vertical) to 640x640 when serving images to the model. I was trying to understand whether there is a way to create a custom dataset with this as a preprocessing step, and if so, what the steps would be and which dataset modules I would need to use.
My other, crude solution would be to just resize my dataset images/labels manually offline.
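Something like the following (untested, and the paths are placeholders) is what I had in mind for the manual route. Since YOLO labels are normalized to the 0-1 range, the label .txt files should carry over unchanged when the images are stretched:

import cv2
from pathlib import Path

SRC = Path('path/to/dataset/train/images')            # placeholder paths
DST = Path('path/to/dataset_stretched/train/images')
DST.mkdir(parents=True, exist_ok=True)

for img_path in SRC.glob('*.jpg'):
    img = cv2.imread(str(img_path))
    # Stretch to a 640x640 square; normalized YOLO labels stay valid,
    # so the matching .txt files can simply be copied alongside.
    stretched = cv2.resize(img, (640, 640), interpolation=cv2.INTER_LINEAR)
    cv2.imwrite(str(DST / img_path.name), stretched)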
Looking forward to your insight.
Thanks,
Suhail
Hm, okay, I can see why you might need to do this. I think (haven’t tested) that you could try creating a LetterBox instance with the arguments (new_shape=(640, 640), auto=False, scaleFill=True), since:
- new_shape (Tuple[int, int]): Target size (height, width) for the resized image.
- auto (bool): If True, use minimum rectangle to resize. If False, use new_shape directly.
- scaleFill (bool): If True, stretch the image to new_shape without padding.
When auto=False, it will take the new_shape value without checking for minimum rectangle sizing. Additionally, when scaleFill=True, no padding is added to the image; it is resized/stretched instead.
I would recommend testing this on a few images and viewing the results first to make sure it’s making the changes you expect. Once confirmed, you can modify this line in the source code to use the arguments that work correctly for training (also, I now see where you found the stretch argument).
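For reference, a quick check could look something like this (untested sketch; the scaleFill argument name may differ across versions, and sample.jpg is a placeholder):

import cv2
from ultralytics.data.augment import LetterBox

# auto=False + scaleFill=True should stretch to 640x640 with no padding
letterbox = LetterBox(new_shape=(640, 640), auto=False, scaleFill=True)

img = cv2.imread('sample.jpg')      # e.g. a 1920x1080 frame
stretched = letterbox(image=img)    # called with image= only, returns the resized array
cv2.imwrite('stretched_preview.jpg', stretched)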
Hopefully that helps! Let us know how it goes.