When the prototype masks are added up, are they cropped by the corresponding bounding box before being applied to the input image? I know they are cropped during training, but the box I drew with cv2.rectangle was smaller than the masked area.
Hello Andrew_Qian,
That’s a great question. Yes, during the segmentation inference process, the generated masks are indeed cropped by their corresponding bounding boxes.
This is handled by the `process_mask` function, which takes the mask prototypes and bounding boxes as input. It then calls the `crop_mask` function to zero out the mask area that falls outside of the predicted bounding box. This step occurs before the final mask is upsampled to the original image size.
You can review the implementation of both `process_mask` and `crop_mask` in the `ops.py` utility reference.
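For intuition, the cropping boils down to a broadcasted inside-the-box test over the mask grid. Here's a minimal sketch of that idea (not the exact library code; it assumes masks of shape (n, h, w) and boxes in xyxy format at the same scale as the masks):

```python
import torch

def crop_mask_sketch(masks: torch.Tensor, boxes: torch.Tensor) -> torch.Tensor:
    """Zero out mask pixels that fall outside each instance's bounding box."""
    n, h, w = masks.shape
    # Split each box into its corners, shaped (n, 1, 1) for broadcasting.
    x1, y1, x2, y2 = torch.chunk(boxes[:, :, None], 4, dim=1)
    # Column and row coordinate grids.
    c = torch.arange(w, device=masks.device, dtype=x1.dtype)[None, None, :]  # (1, 1, w)
    r = torch.arange(h, device=masks.device, dtype=x1.dtype)[None, :, None]  # (1, h, 1)
    # The inside-the-box test broadcasts to (n, h, w); outside pixels multiply to zero.
    return masks * ((c >= x1) * (c < x2) * (r >= y1) * (r < y2))
```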
Hope this helps clarify the process!
Thanks for the response. My next question is: why doesn't the box (xyxy), when drawn on the image, match the mask?
I noticed that when I use the Python API, the output mask is relative to the padded version of the image I passed in, so I unpadded the mask. This gave me the correct mask for my image, albeit one that reaches slightly outside the object region. The box, however (whether padded or unpadded), tightly encapsulates the object just like a normal detection would, and is therefore smaller than the mask.
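Roughly what I did to unpad, in case it helps anyone else (a sketch; it assumes the standard symmetric letterbox, and `unpad_masks` is just my own helper name):

```python
import torch
import torch.nn.functional as F

def unpad_masks(masks: torch.Tensor, orig_shape, input_shape) -> torch.Tensor:
    """Strip letterbox padding from masks predicted at the padded input size.

    masks: (n, H, W) at the model's padded input resolution.
    orig_shape: (h0, w0) of the original image.
    input_shape: (H, W) the masks are relative to.
    """
    H, W = input_shape
    h0, w0 = orig_shape
    gain = min(H / h0, W / w0)                 # letterbox scale factor
    pad_h = (H - h0 * gain) / 2                # vertical padding per side
    pad_w = (W - w0 * gain) / 2                # horizontal padding per side
    top, left = int(round(pad_h)), int(round(pad_w))
    bottom, right = int(round(H - pad_h)), int(round(W - pad_w))
    masks = masks[:, top:bottom, left:right]   # drop the padded border
    # Resize back up to the original image size.
    return F.interpolate(masks[None].float(), size=(h0, w0),
                         mode="bilinear", align_corners=False)[0]
```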
How are you plotting the boxes/masks? If you are using the `Results` object method `result.plot()`, the boxes and masks should align properly. See the documentation on plotting results.
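For example, a minimal sketch (the checkpoint and image path are placeholders):

```python
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")      # any segmentation checkpoint
results = model("image.jpg")
annotated = results[0].plot()       # BGR numpy array with boxes and masks drawn
cv2.imwrite("annotated.jpg", annotated)
```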
Upon further investigation, this is what I discovered:
`results[0].boxes.xyxy` gives you the correct unpadded box coordinates, and `results[0].plot()` is always correct. But the mask output (`.masks.data`) is not unpadded, so it needs correction to restrict the mask to the unpadded region.
`results[0].boxes.xyxy`: unpadded and in the correct original coordinates
`results[0].masks.data`: not unpadded; relative to the padded input coordinates
`results[0].plot()`: plotting is correct
But the plot function is not very useful for custom multi-stage pipelines that use the output of one model as the input to another.
If you use `retina_masks=True` in `model.predict()`, `result.masks.data` should also be correctly resized to match the original image.