Discussion on precision and recall values of YOLOv8

I would like to ask whether YOLOv8 computes precision and recall using the following formulas.
These are the general precision and recall formulas:
Precision: TP / (TP + FP)
Recall: TP / (TP + FN)
My case is oil spill object segmentation from SAR images, and evaluating the model produced this confusion matrix:
(confusion matrix image)
Based on the confusion matrix from evaluating the YOLOv8 model on the test dataset, the precision and recall computed with the formulas above are:
Precision: TP / (TP + FP) = 149 / (149 + 107) = 149 / 256 = 0.5820
Recall: TP / (TP + FN) = 149 / (149 + 0) = 149 / 149 = 1
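
In Python, my calculation looks like this:

# TP/FP/FN counts read off my confusion matrix for the oil spill class
tp, fp, fn = 149, 107, 0

precision = tp / (tp + fp)   # 149 / 256
recall = tp / (tp + fn)      # 149 / 149

print(f"Precision: {precision:.4f}")  # 0.5820
print(f"Recall: {recall:.4f}")        # 1.0000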

However, the precision and recall reported by YOLOv8-x differ from the values I compute with these formulas:
Precision: 0.564
Recall: 0.51
Why is this happening? Thank you for your attention.


Hi there!

Thank you for your detailed question and for sharing your confusion matrix results. It’s great to see your interest in understanding the precision and recall metrics for your oil spill object segmentation task using YOLOv8.

You are correct in your general formulas for precision and recall:

  • Precision = TP / (TP + FP)
  • Recall = TP / (TP + FN)

However, the discrepancy you’re observing between your calculations and the results provided by YOLOv8 could be due to several factors:

  1. IoU Thresholds: YOLOv8 uses Intersection over Union (IoU) thresholds to determine true positives. The default matching threshold is 0.5, meaning a predicted box is counted as a true positive only if its IoU with a ground-truth box is at least 0.5. If your manual evaluation uses different matching criteria, that can explain part of the variation in precision and recall (see the sketch after this list for a toy illustration).

  2. Confidence Thresholds: YOLOv8 also applies a confidence threshold to filter out low-confidence predictions. Predictions below this threshold are not considered, which can affect the counts of TP, FP, and FN.

  3. Class-Specific Metrics: If your dataset contains multiple classes, the precision and recall values might be averaged across classes, which could lead to differences from class-specific calculations.
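
To make points 1 and 2 concrete, here is a minimal, self-contained sketch. It uses a simplified greedy matching rather than YOLOv8's internal implementation, and the function names (box_iou, count_tp_fp_fn) and toy boxes are purely illustrative:

def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def count_tp_fp_fn(preds, gts, conf_thres=0.25, iou_thres=0.5):
    """Greedy one-to-one matching of predictions to ground-truth boxes."""
    kept = [(box, conf) for box, conf in preds if conf >= conf_thres]
    matched, tp = set(), 0
    for box, _ in sorted(kept, key=lambda x: -x[1]):  # highest confidence first
        best_iou, best_gt = 0.0, None
        for i, gt in enumerate(gts):
            if i not in matched and box_iou(box, gt) > best_iou:
                best_iou, best_gt = box_iou(box, gt), i
        if best_iou >= iou_thres:
            tp += 1
            matched.add(best_gt)
    return tp, len(kept) - tp, len(gts) - tp  # TP, FP, FN

# Toy example: two ground-truth boxes, three predictions with confidence scores
gts = [(10, 10, 50, 50), (60, 60, 100, 100)]
preds = [((12, 12, 48, 48), 0.90),        # good overlap, high confidence
         ((55, 55, 80, 80), 0.40),        # partial overlap, medium confidence
         ((200, 200, 220, 220), 0.20)]    # no overlap, low confidence

print(count_tp_fp_fn(preds, gts, conf_thres=0.25, iou_thres=0.5))  # (1, 1, 1)
print(count_tp_fp_fn(preds, gts, conf_thres=0.50, iou_thres=0.5))  # (1, 0, 1)

With the same three toy predictions, raising the confidence threshold from 0.25 to 0.50 drops the partial-overlap prediction: one false positive disappears, precision rises from 0.5 to 1.0, and recall stays at 0.5, even though the model's outputs did not change.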

To better understand and align your evaluation with YOLOv8’s metrics, you can refer to the YOLOv8 Validation Documentation for detailed information on how these metrics are computed.

Additionally, you can use the following code snippet to validate your model and inspect the detailed metrics:

from ultralytics import YOLO

# Load your trained model
model = YOLO('path/to/your/model.pt')

# Validate the model on your test dataset
metrics = model.val(data='path/to/your/test_dataset.yaml')

# Access precision and recall
precision = metrics.box.mp  # mean precision over classes (boxes matched at IoU = 0.5)
recall = metrics.box.mr  # mean recall over classes (boxes matched at IoU = 0.5)
print(f'Precision: {precision}')
print(f'Recall: {recall}')

# Note: map50 and map are mAP values, not precision or recall
print(f'mAP@0.5: {metrics.box.map50}')
print(f'mAP@0.5:0.95: {metrics.box.map}')
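
If your dataset has more than one class, you can also look at the per-class values instead of the averaged ones. The attribute names below (metrics.box.p, metrics.box.r, metrics.box.ap_class_index, metrics.names) are those exposed by recent ultralytics releases, so please verify them against your installed version:

# Continuing from the snippet above: per-class precision and recall
for i, class_idx in enumerate(metrics.box.ap_class_index):
    print(f"{metrics.names[class_idx]}: "
          f"precision={metrics.box.p[i]:.3f}, recall={metrics.box.r[i]:.3f}")

Since your task is segmentation, note that a YOLOv8 segmentation model reports mask metrics separately; in recent releases they are available under metrics.seg (for example metrics.seg.mp and metrics.seg.mr), with the same caveat about version differences.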

For a deeper dive into performance metrics and their interpretation, you might find this Performance Metrics Guide helpful.

I hope this clarifies the differences you’re seeing. If you have any further questions or need more assistance, feel free to ask! :blush:
