I need some help from the YOLO community. I have worked on a couple of YOLO projects previously that all worked quite well, but this project is showing quite poor results.
Our goal is to detect “Cocos” on Brazil nut trees (Castañas) from drone images. Cocos are large, coconut-like fruits on Brazil nut trees that hold all of the Brazil nuts. We hope to relate the number of detections to harvest data collected over the previous 7 years for individual trees.
The dataset is here.
When Roboflow trained the model, it gave me these results:
mAP: 7.1%, precision: 12.8%, recall: 1.8%
I have a lot of questions about the best approach for detecting these cocos. All of our images are taken from the same altitude, so all of the detections will be around the same size.
The cocos appear in the drone images in many different ways: fully visible, partially hidden by leaves, a little deeper in the tree canopy (where they show up a little darker but are still noticeable to the human eye), and some near similar-looking tree trunks and bark. Here are a few examples of annotations and an overall view of all of the cocos in one drone image:
We don’t necessarily need an exact count of all visible cocos, but we do need to be able to establish a relationship between YOLO coco detections and harvest volume per tree.
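For what it’s worth, the downstream analysis we have in mind is simple: correlate per-tree detection counts with recorded harvest volume. A minimal sketch, with placeholder numbers standing in for our real data:

```python
import numpy as np

# Placeholder data only: cocos detected per tree and the recorded
# harvest volume for those same trees.
detections = np.array([34, 12, 58, 7, 41])
harvest_kg = np.array([22.0, 9.5, 35.1, 4.2, 27.8])

# Pearson correlation between detection counts and harvest volume.
r = np.corrcoef(detections, harvest_kg)[0, 1]
print(f"Pearson r between detections and harvest: {r:.2f}")
```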
I have generated a lot of questions, so I hope others in the community will find this helpful as well.
What do you think is the best way to annotate the cocos to get a reliable result?
Should I include partially covered cocos in my annotations, or annotate only fully visible cocos? Would removing the partially covered ones confuse the model in some way?
Does the fact that each annotation covers very few pixels relative to the image size make detecting cocos in these images difficult or unlikely?
Should I leave some margin between the edge of each coco and its bounding box, or should the boxes fit tightly around the cocos?
What are the best practices for detecting such small objects in an image?
What is the best way to deal with small clusters of cocos?
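On the clusters, I am also wondering whether prediction-time settings matter. This is the kind of thing I mean, based on my reading of the Ultralytics predict() arguments (the paths below are placeholders):

```python
from ultralytics import YOLO

# Sketch only: prediction-time settings that could affect clusters of
# touching cocos, per my reading of the Ultralytics predict() args.
model = YOLO("runs/detect/train/weights/best.pt")  # placeholder path
results = model.predict(
    source="flight_042.jpg",  # placeholder image
    imgsz=1280,
    conf=0.2,      # lower confidence to keep the fainter canopy cocos
    iou=0.5,       # NMS IoU threshold; touching cocos may need tuning here
    max_det=1000,  # don't cap detections on dense trees
)
```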
Would using segmentation masks instead of bounding boxes improve detection for small objects like cocos?
Should I include a significant number of negative samples (images without cocos) in my dataset?
What augmentations are most effective for small-object detection of cocos? (I assume we should not use any color changes, for example.)
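For context, here is roughly how I was planning to turn off color jitter while keeping geometric augmentation, based on my reading of the Ultralytics train() settings (the hyperparameter values are my assumptions, and “cocos.yaml” is a placeholder for our dataset config):

```python
from ultralytics import YOLO

# Sketch only: disable color augmentation, keep geometric augmentation.
model = YOLO("yolo11s.pt")
model.train(
    data="cocos.yaml",  # placeholder dataset config
    imgsz=1280,         # larger input size to give small cocos more pixels
    hsv_h=0.0,          # no hue shift
    hsv_s=0.0,          # no saturation shift
    hsv_v=0.0,          # no brightness shift
    degrees=10.0,       # small rotations seem safe for nadir drone views
    scale=0.2,          # limited scale jitter, since our GSD is fixed
    fliplr=0.5,         # horizontal flips
    flipud=0.5,         # vertical flips also seem valid looking straight down
    mosaic=1.0,         # mosaic to keep small-object density up
)
```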
Does the dataset require a minimum number of examples per object class for effective detection? (Do we simply need more annotated examples?)
Is it better to downscale or crop images for training?
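To make the crop option concrete, here is a minimal tiling sketch I put together (plain Python with Pillow, nothing YOLO-specific): it cuts a large drone image into overlapping tiles so each coco keeps more of its pixels at the network’s input size. The tile size, overlap, and file names are my own guesses, and the annotation boxes would need the same offsets applied:

```python
import os
from PIL import Image

def tile_image(path, tile=1280, overlap=256):
    """Cut a large image into overlapping square tiles.

    The overlap keeps cocos near tile borders fully visible in at
    least one tile; labels need the same offsets applied.
    """
    img = Image.open(path)
    w, h = img.size
    step = tile - overlap
    tiles = []
    for top in range(0, max(h - overlap, 1), step):
        for left in range(0, max(w - overlap, 1), step):
            box = (left, top, min(left + tile, w), min(top + tile, h))
            tiles.append((box, img.crop(box)))
    return tiles

# Usage sketch; "flight_042.jpg" is a placeholder file name.
os.makedirs("tiles", exist_ok=True)
for (left, top, right, bottom), crop in tile_image("flight_042.jpg"):
    crop.save(f"tiles/flight_042_{left}_{top}.jpg")
```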
Are there specific hyperparameters (e.g., anchor box sizes) that should be tuned for small object detection?
How should I adjust the learning rate and batch size for a dataset with many small objects?
Would a larger model variant (e.g., YOLOv11x) perform better for detecting small objects compared to smaller ones (e.g., YOLOv11s)?
Are there specific YOLO configurations optimized for small object detection?
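On that last question, I came across the P2 model variants, which add a higher-resolution (stride-4) detection head that is supposed to help with very small objects. My understanding, which may be wrong, is that you can load one like this; “yolov8s-p2.yaml” is the config name as I found it in the Ultralytics repo, and I have not found a YOLO11 equivalent:

```python
from ultralytics import YOLO

# Sketch only: build a P2 variant (extra stride-4 detection head for
# small objects) and transfer weights from a pretrained model.
# "cocos.yaml" is a placeholder for our dataset config.
model = YOLO("yolov8s-p2.yaml").load("yolov8s.pt")
model.train(data="cocos.yaml", imgsz=1280)
```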
Is my dataset’s ground sampling distance of 2.1 cm (the real-world size of each pixel) appropriate for detecting small objects like cocos?
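For reference, my back-of-the-envelope math on that GSD, assuming a coco is roughly 10–15 cm across (my own estimate, not a measured value):

```python
# Rough pixel footprint of a coco at our GSD.
gsd_cm_per_px = 2.1
for diameter_cm in (10, 15):
    px = diameter_cm / gsd_cm_per_px
    print(f"A {diameter_cm} cm coco spans about {px:.1f} px")
# Prints ~4.8 px for 10 cm and ~7.1 px for 15 cm: very small targets.
```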
Any help and guidance you can provide would be greatly appreciated!