When I run model.val() before training the model on my dataset, the confusion matrix contains several classes that are not specified in the dataset’s YAML file. Issue #16695 explains why the background class is present, but why are all these other classes (person, bicycle, car, etc.) present?
Here’s my code (which I’m running in Google Colab):
!pip install ultralytics
!pip install roboflow
from roboflow import Roboflow
rf = Roboflow(api_key="api_key_goes_here")  # placeholder, not a real key
project = rf.workspace("conveyor-550m0").project("conveyor-hhrzw")
version = project.version(3)
dataset = version.download("yolov11")
from ultralytics import YOLO
model = YOLO("yolo11n.pt")
results = model.val(data="/content/conveyor-3/data.yaml")
print(results.confusion_matrix.to_df()) # First confusion matrix
train_results = model.train(data="/content/conveyor-3/data.yaml", epochs=1)
results = model.val()
print(results.confusion_matrix.to_df()) # Second confusion matrix
The YAML file (/content/conveyor-3/data.yaml) contains three classes:
nc: 3
names: ['cardboard box', 'conveyor', 'kartonbox']
The first confusion matrix (before training) looks like this:
┌────────────┬────────┬─────────┬─────┬───┬────────────┬────────────┬────────────┬────────────┐
│ Predicted ┆ person ┆ bicycle ┆ car ┆ … ┆ teddy_bear ┆ hair_drier ┆ toothbrush ┆ background │
│ --- ┆ --- ┆ --- ┆ --- ┆ ┆ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ f64 ┆ f64 ┆ f64 ┆ ┆ f64 ┆ f64 ┆ f64 ┆ f64 │
╞════════════╪════════╪═════════╪═════╪═══╪════════════╪════════════╪════════════╪════════════╡
│ person ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ … ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ 5.0 │
│ bicycle ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ … ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ 0.0 │
│ car ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ … ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ 0.0 │
│ motorcycle ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ … ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ 0.0 │
│ airplane ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ … ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ 0.0 │
│ … ┆ … ┆ … ┆ … ┆ … ┆ … ┆ … ┆ … ┆ … │
│ scissors ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ … ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ 0.0 │
│ teddy_bear ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ … ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ 0.0 │
│ hair_drier ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ … ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ 0.0 │
│ toothbrush ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ … ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ 0.0 │
│ background ┆ 152.0 ┆ 34.0 ┆ 5.0 ┆ … ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ 0.0 │
└────────────┴────────┴─────────┴─────┴───┴────────────┴────────────┴────────────┴────────────┘
The second confusion matrix (after training) looks like this:
┌───────────────┬───────────────┬──────────┬───────────┬────────────┐
│ Predicted ┆ cardboard_box ┆ conveyor ┆ kartonbox ┆ background │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ f64 ┆ f64 ┆ f64 ┆ f64 │
╞═══════════════╪═══════════════╪══════════╪═══════════╪════════════╡
│ cardboard_box ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ 0.0 │
│ conveyor ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ 0.0 │
│ kartonbox ┆ 0.0 ┆ 0.0 ┆ 0.0 ┆ 0.0 │
│ background ┆ 167.0 ┆ 46.0 ┆ 5.0 ┆ 0.0 │
└───────────────┴───────────────┴──────────┴───────────┴────────────┘
As expected, the second confusion matrix contains the three classes from the dataset’s YAML file plus the background class. I expected the first confusion matrix to contain only those four classes as well: even before training, I assumed model validation would take its class list from the dataset’s YAML file.
Where does model.val() get all those other classes (person, bicycle, car, etc.) from?
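To narrow this down, it might help to compare the class names the loaded checkpoint carries with the names in data.yaml. The snippet below is only a minimal sketch: compare_class_names is a hypothetical helper, and pretrained_names is a hand-written stand-in (truncated to three entries) for what I assume model.names returns, namely a dict mapping class index to class name.

```python
def compare_class_names(model_names, dataset_names):
    """Return (only_in_model, only_in_dataset) as sorted lists of class names."""
    # model_names: dict {index: name}, the shape assumed for model.names
    # dataset_names: list of names, as in the `names:` entry of data.yaml
    model_set = set(model_names.values())
    dataset_set = set(dataset_names)
    return sorted(model_set - dataset_set), sorted(dataset_set - model_set)


# Hypothetical stand-in for the pretrained checkpoint's names (truncated)
pretrained_names = {0: "person", 1: "bicycle", 2: "car"}
# Class names copied from /content/conveyor-3/data.yaml
yaml_names = ["cardboard box", "conveyor", "kartonbox"]

only_in_model, only_in_dataset = compare_class_names(pretrained_names, yaml_names)
print("Only in model:", only_in_model)
print("Only in dataset:", only_in_dataset)
```

In Colab, pretrained_names could be replaced with YOLO("yolo11n.pt").names, read once before and once after model.train(), to see whether the class list attached to the model changes between the two model.val() calls.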