Random Cropping as an augmentation

Hey guys,
How can I use random cropping as an augmentation?
Thanks

You can use mosaic which is enabled by default.

Thanks. Used it.
As far as I understand, it just resizes images into four tiles and combines them like a puzzle.
I want to randomly crop patches from the original image without resizing them.

It crops a portion of the image during the tiling process. You can also use scale and translate augmentations to achieve a combined random crop effect.
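
A minimal sketch of what that could look like with the standard `model.train()` overrides. The dataset file, checkpoint name, and the specific values are placeholders I picked for illustration (as far as I know the defaults are `mosaic=1.0`, `scale=0.5`, `translate=0.1`), so treat them as a starting point rather than a recommendation.

```python
# Augmentation overrides for model.train(). The values are illustrative
# assumptions, not tuned recommendations.
train_kwargs = {
    "data": "coco8.yaml",  # placeholder dataset config
    "imgsz": 640,
    "mosaic": 1.0,         # keep mosaic enabled
    "scale": 0.9,          # widen the random zoom range
    "translate": 0.3,      # allow larger random shifts (fraction of image size)
}


def run_training(kwargs):
    """Start training with the given overrides (needs `pip install ultralytics`)."""
    # Imported lazily so the dict above can be inspected without the package.
    from ultralytics import YOLO

    model = YOLO("yolo11n.pt")  # placeholder checkpoint
    return model.train(**kwargs)


# Uncomment to actually train:
# run_training(train_kwargs)
```

Larger `scale` and `translate` values push more of each image outside the tile, which is what produces the crop-like effect.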

Just an update: so far, looking at the mosaic batches, none of the tiles showed a random crop of the original image. All four of them were resized…

Btw, in my opinion the problem with mosaic is that bounding boxes might be cut off in the new image, resulting in false detections, since some objects need the context of the entire bounding box to be detected well.

By resizing, do you mean the original image is “squished” to fit the tile? Because Ultralytics never does that. It never squishes the image, not even for classification. When the image doesn’t fit the tile after resizing, a square portion is cropped out of it. So if the images are square to begin with, you might not see this effect. In that case, you should increase the scale and translate augmentations.

Mosaic, as it is currently implemented, was found empirically to perform better than other augmentation techniques, which is why it’s used.

You can take care of the bounding boxes while cropping, but carefully checking them for every image adds overhead: it’s almost like a “search” operation to find a suitable random crop during training, which will increase training time.
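
To make the “search” idea concrete, here is a rough sketch using rejection sampling: keep drawing random crop windows until one leaves every box either fully inside or fully outside. This is not how Ultralytics does it; the function names, the `(x1, y1, x2, y2)` pixel box format, and the `max_tries` cap are all assumptions for illustration.

```python
import random


def box_fully_inside(box, crop):
    x1, y1, x2, y2 = box
    cx1, cy1, cx2, cy2 = crop
    return cx1 <= x1 and cy1 <= y1 and x2 <= cx2 and y2 <= cy2


def box_fully_outside(box, crop):
    x1, y1, x2, y2 = box
    cx1, cy1, cx2, cy2 = crop
    return x2 <= cx1 or x1 >= cx2 or y2 <= cy1 or y1 >= cy2


def sample_safe_crop(img_w, img_h, crop_w, crop_h, boxes, max_tries=50, rng=random):
    """Return (crop, attempts); crop is None if no window was found in time."""
    for attempt in range(1, max_tries + 1):
        left = rng.randint(0, img_w - crop_w)
        top = rng.randint(0, img_h - crop_h)
        crop = (left, top, left + crop_w, top + crop_h)
        # Accept only if no box is cut by the crop border.
        if all(box_fully_inside(b, crop) or box_fully_outside(b, crop) for b in boxes):
            return crop, attempt
    return None, max_tries
```

The number of attempts is the extra per-image cost: it grows with crowded images and large boxes, and a box bigger than the crop window can never be kept whole.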

Good takes. I’ll try to clarify myself:

Just to establish common ground: let’s say we have a 4K image (3840 x 2160 pixels).
Then:

  • crop = selecting a smaller patch (say 640x640) out of the original 4K image, at its native resolution.
  • resize = downscaling from 3840 x 2160 to 1137x640 (preserving the aspect ratio), then cropping a 640x640 patch from the center, which gives a 640x640 patch without distortion.
  • squish = resizing from 3840 x 2160 directly to 640x640 (each dimension is scaled independently, ignoring the aspect ratio), leading to some smearing.
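
The three definitions above can be sketched as plain coordinate arithmetic. The helper names are made up for this example, and the integer truncation is my own choice so that the numbers match the 1137x640 figure above.

```python
import random


def resize_keep_aspect(w, h, short_side):
    """Scale so the shorter side equals short_side (integer math, truncating)."""
    m = min(w, h)
    return w * short_side // m, h * short_side // m


def center_crop_box(w, h, size):
    """Window (x1, y1, x2, y2) of a size x size patch taken from the center."""
    left = (w - size) // 2
    top = (h - size) // 2
    return left, top, left + size, top + size


def random_crop_box(w, h, size, rng=random):
    """Window of a size x size patch at a random position, native resolution."""
    left = rng.randint(0, w - size)
    top = rng.randint(0, h - size)
    return left, top, left + size, top + size


def squish_factors(w, h, size):
    """Per-axis scale factors when forcing the image to size x size."""
    return size / w, size / h


W, H = 3840, 2160
resized = resize_keep_aspect(W, H, 640)       # (1137, 640)
center = center_crop_box(*resized, 640)       # (248, 0, 888, 640)
sx, sy = squish_factors(W, H, 640)            # unequal factors -> distortion
downscale = H / 640                           # 3.375 source pixels per output pixel, per axis
```

With “resize”, every output pixel covers about 3.375 x 3.375 source pixels, which is where the small details go. With “crop”, the patch keeps the original pixel density.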

My problem with mosaic is that it does this:
Resize > Combine (into a mosaic).

This is a problem because the resizing might lose some critical super small details that could be seen with higher resolution.

I want to apply only a Random Crop augmentation on the original image.

I’d use an original image of 4K, and randomly crop big enough patches out of it (without downgrading the resolution).

It looks like this:
https://pytorch.org/vision/0.13/auto_examples/plot_transforms.html#randomcrop
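
For detection, the labels have to follow the crop as well. Below is a rough sketch of what I mean, working on coordinates only: boxes are shifted into the crop's frame, clipped to its border, and dropped when too little of them remains. The function name, the `(x1, y1, x2, y2)` pixel format, and the `min_visible` threshold are assumptions for illustration, not part of torchvision or Ultralytics.

```python
def remap_boxes(boxes, crop, min_visible=0.5):
    """Clip boxes to the crop window and shift them into crop coordinates.

    Boxes are (x1, y1, x2, y2) in pixels with x2 > x1 and y2 > y1.
    A box is kept only if at least min_visible of its area is inside the crop.
    """
    cx1, cy1, cx2, cy2 = crop
    kept = []
    for x1, y1, x2, y2 in boxes:
        # Intersection of the box with the crop window.
        ix1, iy1 = max(x1, cx1), max(y1, cy1)
        ix2, iy2 = min(x2, cx2), min(y2, cy2)
        if ix2 <= ix1 or iy2 <= iy1:
            continue  # no overlap with the crop
        visible = (ix2 - ix1) * (iy2 - iy1) / ((x2 - x1) * (y2 - y1))
        if visible >= min_visible:
            kept.append((ix1 - cx1, iy1 - cy1, ix2 - cx1, iy2 - cy1))
    return kept


# A 640x640 window somewhere inside a 3840x2160 image.
crop = (1000, 500, 1640, 1140)
boxes = [
    (1100, 600, 1200, 700),    # fully inside
    (900, 600, 1100, 700),     # half visible
    (0, 0, 50, 50),            # outside
    (1600, 1100, 1800, 1300),  # only a corner visible
]
patch_boxes = remap_boxes(boxes, crop)
```

Raising `min_visible` toward 1.0 keeps only boxes that are almost entirely inside the patch, which addresses the cut-off box concern at the cost of discarding more labels.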

You suggested Scale and Translate. I’ll try them now, but I’m not sure they do exactly what I need… I might find out that Scale is some sort of resizing?

Here’s the list of augmentations from the documentation:

Ah ok. Yes, it’s resized in the dataloader while loading the image. I guess it helps keep the memory requirement low.

I created a branch that uses full resolution during augmentations.

You can install it with:

git clone https://github.com/Y-T-G/ultralytics
cd ultralytics
git checkout hires_train
pip install .

Pass hi_res=True to model.train() when training.

From my brief testing, the results were worse. But you can try it, especially with scale and translate.
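
A short sketch of combining the flag with scale and translate. It assumes the `hires_train` branch is installed as described above; a stock Ultralytics install will most likely reject the unknown `hi_res` argument. The dataset, checkpoint, and augmentation values are placeholders.

```python
# Overrides for model.train() on the hires_train branch.
# Values are illustrative placeholders, not tested settings.
hi_res_kwargs = {
    "data": "coco8.yaml",  # placeholder dataset config
    "imgsz": 640,
    "hi_res": True,        # only understood by the hires_train branch
    "scale": 0.9,          # placeholder value
    "translate": 0.3,      # placeholder value
}


def run_hi_res_training(kwargs, weights="yolo11n.pt"):
    """Train with full-resolution augmentation (requires the branch above)."""
    from ultralytics import YOLO  # lazy import; needs the fork installed

    return YOLO(weights).train(**kwargs)


# Uncomment to actually train:
# run_hi_res_training(hi_res_kwargs)
```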

Wow man, impressive work! I’m deeply grateful for your support. I thought you’d point me to where I could modify things in the code, and I’d deep dive into it myself.

So once again, thank you! I appreciate it very much. I’ll try it out :slight_smile:
