Hello!
I’ve been using YOLOv8 for a detection task. I want to process crops on the fly without saving crop images to disk, which means I can’t use the save_crop=True option.
However, the crops saved with save_crop=True differ significantly from the crops I get by cutting the original image with the bounding boxes in the result.boxes object.
You can see that in the attached images. On the left are the results with save_crop=True; on the right are the results of manually cropping the original image with the result.boxes coordinates.
This difference is crucial because the obtained crops from YOLO are then processed by another OCR model. The quality of the OCR is much better with the save_crop=True flag.
I’ve tried adding some pixels around the coordinates obtained from result.boxes, but the OCR quality is still low. I think the problem is somehow related to resizing, but I can’t figure out how. I don’t resize the original image before passing it to YOLO.
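One way to test the resizing suspicion is to check that the boxes are expressed in the coordinates of the image being cropped. The sketch below assumes the result object exposes orig_shape as (height, width), while a PIL image reports size as (width, height); the helper name shapes_match is hypothetical.

```python
def shapes_match(orig_shape, pil_size):
    """Return True if a (height, width) shape and a PIL (width, height) size agree.

    orig_shape: assumed to come from results[0].orig_shape, ordered (height, width).
    pil_size:   assumed to come from a PIL image's .size, ordered (width, height).
    """
    height, width = orig_shape[:2]
    pil_width, pil_height = pil_size
    return (height, width) == (pil_height, pil_width)


# Hypothetical usage with the objects from this post:
# shapes_match(results[0].orig_shape, page_name.size)
```

If this returns False, the boxes refer to a different image size than the one being cropped, and they would need rescaling first.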
This is how I get the save_crop images:
results = yolo_model(image_name, save_crop=True, exist_ok=True, verbose=True, project="/work/", name="yolo_res", classes=[5,6,12])
This is how I get the results.boxes images:
results = yolo_model(image_name, save_crop=False, save=False, verbose=True, classes=[5,6,12])
# page_name is the image object of the page being cropped
for bbox in results[0].boxes.xyxy.cpu().numpy():
    x_min, y_min, x_max, y_max = bbox
    cropped_image = page_name.crop((x_min, y_min, x_max, y_max))
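To make the padding attempt concrete, here is a minimal sketch of expanding a box before cropping. The helper name expand_box is hypothetical. The default gain=1.02 and pad=10 are an assumption based on what the save_one_box helper in ultralytics.utils.plotting appears to use for saved crops, and should be checked against the installed version.

```python
def expand_box(x_min, y_min, x_max, y_max, img_w, img_h, gain=1.02, pad=10):
    """Grow a box around its center and clip it to the image bounds.

    The width and height are scaled by `gain` and then enlarged by `pad`
    pixels in total (pad / 2 on each side). The defaults are assumptions,
    not confirmed values.
    """
    cx = (x_min + x_max) / 2
    cy = (y_min + y_max) / 2
    w = (x_max - x_min) * gain + pad
    h = (y_max - y_min) * gain + pad
    # Truncate to integer pixels and keep the box inside the image.
    left = max(0, int(cx - w / 2))
    top = max(0, int(cy - h / 2))
    right = min(img_w, int(cx + w / 2))
    bottom = min(img_h, int(cy + h / 2))
    return left, top, right, bottom


# Hypothetical usage with a PIL image, whose .size is (width, height):
# cropped_image = page_name.crop(expand_box(*bbox, *page_name.size))
```

If crops made this way match the save_crop output, the difference is padding rather than resizing.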
Please advise me on what could be the source of this problem and how I can fix it. TIA!