Generating text-based confusion matrix?

I’m training a model with a large number of classes (100+). When it was 10 or 15, the confusion matrix was a very helpful tool for evaluating the model’s performance across different classes.

Now that I have over 100 classes, the confusion matrix image generated by YOLO is scaled down so much that I can’t read any of the data.

Is there a way to generate a text-based confusion matrix? In CSV format maybe?

Hi there! Great question, and I completely understand the challenge of visualizing a large confusion matrix with 100+ classes.

Yes, you can generate a text-based confusion matrix in YOLO and save it in a format like CSV for easier analysis. The ConfusionMatrix class in ultralytics.utils.metrics keeps the raw counts in its matrix attribute (a NumPy array), so you can write them out as text or CSV. Here’s how you can extract and save the matrix:

Code Example:

from ultralytics.utils.metrics import ConfusionMatrix

# Initialize the confusion matrix
nc = 100  # Number of classes
conf_matrix = ConfusionMatrix(nc=nc)

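# Assuming you already have detections and ground truth data,
# update the confusion matrix with each batch.
# Note: the constructor and process_batch arguments may differ between
# Ultralytics releases, so check the signatures in your installed version.
# conf_matrix.process_batch(detections, ground_truth_boxes, ground_truth_classes)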

# Save the confusion matrix to a CSV file
import pandas as pd

# Convert the matrix to a DataFrame for better readability.
# For detection, the matrix has nc + 1 rows and columns; the last one is "background".
labels = [f'Class_{i}' for i in range(nc)] + ['background']
df = pd.DataFrame(conf_matrix.matrix, columns=labels, index=labels)

# Save to CSV
df.to_csv('confusion_matrix.csv', index=True)
print("Confusion matrix saved as 'confusion_matrix.csv'")

Explanation:

  • ConfusionMatrix handles the creation and updating of the matrix.
  • The matrix can be converted into a pandas DataFrame for structured representation, where rows and columns represent predicted and actual classes, respectively.
  • You can save this DataFrame as a CSV file for easier inspection and analysis.
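With 100+ classes even the CSV is a large grid, so it can help to list only the biggest off-diagonal cells. The following is a minimal sketch in plain Python, not part of the Ultralytics API: top_confusions is a hypothetical helper, the labels and counts are made up, and it assumes rows are predicted classes and columns are actual classes, as described above. It should also accept a NumPy array such as conf_matrix.matrix, since it only iterates over rows and cells.

```python
def top_confusions(matrix, labels, k=10):
    """Return the k largest off-diagonal cells as (predicted, actual, count)."""
    pairs = []
    for i, row in enumerate(matrix):        # i indexes the predicted class (rows)
        for j, count in enumerate(row):     # j indexes the actual class (columns)
            if i != j and count > 0:        # skip the diagonal (correct predictions)
                pairs.append((labels[i], labels[j], int(count)))
    # Largest counts first
    pairs.sort(key=lambda p: p[2], reverse=True)
    return pairs[:k]


# Made-up 3-class example with the extra "background" row and column
labels = ["cat", "dog", "fox", "background"]
matrix = [
    [50, 4, 0, 2],
    [3, 40, 7, 1],
    [0, 9, 30, 0],
    [5, 2, 1, 0],
]

for predicted, actual, count in top_confusions(matrix, labels, k=3):
    print(f"predicted {predicted} when actual was {actual}: {count}")
```

Sorting the pairs this way gives a short text report you can read in the terminal without opening the full matrix.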

For more details on the ConfusionMatrix class, see the ultralytics.utils.metrics reference in the Ultralytics documentation.

This approach should make it much easier to analyze the performance of your model with a large number of classes. Let me know if you have further questions! :blush:

The object returned by validation holds a ConfusionMatrix instance, so you can access the confusion matrix results directly once validation is complete.

from ultralytics import YOLO

model = YOLO("best.pt")  # assuming custom detection model

results = model.val(data="data.yaml")  # use your custom dataset YAML

results.confusion_matrix  # use to access methods and data from ConfusionMatrix class
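
To tie this back to the original question, here is a rough sketch of writing that matrix to CSV with real class names as headers. write_confusion_csv is a hypothetical helper (not an Ultralytics function) that uses only the standard csv module. It assumes the counts are in results.confusion_matrix.matrix, that model.names maps class index to name, and that a detection matrix has one extra row and column for background.

```python
import csv


def write_confusion_csv(matrix, names, path):
    """Write a square count matrix to CSV with class names as row/column headers."""
    labels = [names[i] for i in sorted(names)]
    rows = [list(r) for r in matrix]
    # Detection matrices carry one extra row/column for "background"
    if len(rows) == len(labels) + 1:
        labels.append("background")
    if len(rows) != len(labels):
        raise ValueError("matrix size does not match the number of class names")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        # Header row: the first cell names the axes, then one column per actual class
        writer.writerow(["predicted/actual"] + labels)
        for label, row in zip(labels, rows):
            writer.writerow([label] + [int(v) for v in row])
    return labels


# Assumed usage after validation (requires ultralytics and a trained model):
# results = model.val(data="data.yaml")
# write_confusion_csv(results.confusion_matrix.matrix, model.names, "confusion_matrix.csv")
```

The resulting file opens in any spreadsheet tool, where you can freeze the header row and column and sort or filter by class.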