Why does my model classify all pictures into one category?

Hi all,

I’m training a YOLO classifier on a challenging dataset of grayscale images split into two classes: target (~7000 images) and non-target (~3000 images). I tried different lr0 values (0.01, 0.001, 0.0001, 0.00001, 0.000001), but every model classified all pictures into the target category.

I’ve checked several posts, such as What's the diferrence between lr/pg0, lr/pg1 & lr/pg2? · Issue #7424 · ultralytics/ultralytics · GitHub, and I think the issue might be that lr/pg0 is too large at the start of training. Does anyone know how to make this value exactly match the lr0 I set? Any other suggestions? I appreciate your help. Please see the screenshot of the learning rate and accuracy, plus the model and training configs, below.

model0.yaml

# Simple classification model
nc: 2
backbone:
  - [-1, 1, Conv, [64, 7, 2, 3]]
  - [-1, 1, nn.MaxPool2d, [3, 2, 1]]
  - [-1, 4, C2f, [64, True]]
  - [-1, 1, Conv, [128, 3, 2]]
  - [-1, 8, C2f, [128, True]]
  - [-1, 1, nn.AdaptiveAvgPool2d, [1]]

head:
  - [-1, 1, Classify, [nc]]

config.yaml

project: DMV_Classification_YOLO
name: classify_model0_train_param4
epochs: 1024
imgsz: 1024
batch: 320
device: [0, 1, 2, 3]
cos_lr: True
optimizer: SGD
lr0: 0.00001
momentum: 0.9
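
For reference, this is roughly how I launch training with the two files above (simplified; the dataset path here is a placeholder):

from ultralytics import YOLO

# Build the classifier from the custom YAML and train with the config overrides.
model = YOLO("model0.yaml", task="classify")
model.train(
    cfg="config.yaml",            # training settings shown above
    data="path/to/your_dataset",  # placeholder: folder with train/val class subfolders
)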

I thought that by setting warmup_epochs to 0, I could keep lr/pg0 the same as the other two. However, the model still learned nothing. Does anyone know why? My inputs are 4096×4096 grayscale images with pixel values in 0-255.

Setting warmup_epochs=0 is the right way to remove warmup, so if lr/pg0 still doesn’t “match” what you expect, it’s usually not the root cause. In your config the bigger issue is that the model can get “stuck” predicting the majority class (your target class is ~70% of the data), and with batch=320 plus lr0=1e-5 the updates are so small that it may effectively learn nothing.
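
If you want to double-check the imbalance and the folder layout quickly, a simple count like this is enough (the path below is a placeholder; Ultralytics classification datasets expect train/ and val/ folders with one subfolder per class):

from pathlib import Path

# Count images per class in an image-folder classification dataset.
root = Path("path/to/your_dataset/train")  # placeholder path
for cls_dir in sorted(d for d in root.iterdir() if d.is_dir()):
    n_images = sum(1 for f in cls_dir.rglob("*") if f.is_file())
    print(f"{cls_dir.name}: {n_images} images")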

Also, training a tiny custom classifier from scratch on subtle grayscale differences is much harder than fine-tuning a pretrained classifier.

I’d try this baseline first (and only then iterate), using a pretrained Ultralytics YOLO classifier and more conventional optimization settings based on the standard train arguments:

from ultralytics import YOLO

model = YOLO("yolo11n-cls.pt")  # pretrained classifier
model.train(
    data="path/to/your_dataset",  # folder with train/val and class subfolders
    imgsz=512,                    # start smaller; increase later if needed
    batch=-1,                     # auto batch
    optimizer="AdamW",
    lr0=1e-3,
    epochs=100,
    cos_lr=True,
)

If that stops predicting only target, then your earlier runs were mostly optimization + imbalance effects. If it still predicts only one class, then it’s almost always a dataset issue (labels/folders swapped, non-stratified split, duplicates, or leakage). In that case, can you share your dataset folder structure (just the tree) and a screenshot of the val confusion matrix/results? That will usually pinpoint it quickly.
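
And if it helps, here’s a minimal sketch for pulling up the validation metrics and confusion matrix after a run (the weights and dataset paths are placeholders; by default the best weights land under runs/classify/<run_name>/weights/):

from ultralytics import YOLO

# Validate the trained classifier; top-1/top-5 accuracy is returned and a
# confusion matrix plot is saved in the validation run directory.
model = YOLO("runs/classify/train/weights/best.pt")  # placeholder weights path
metrics = model.val(data="path/to/your_dataset")
print(metrics.top1, metrics.top5)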

Hi Paula,

Thanks for your guidance. Fine-tuning a pretrained classifier achieved 0.92 precision. :grinning_face:

During this process, I noticed another subtle issue when using different configurations:

| pretrained model | batch size | GPU info | training time |
|---|---|---|---|
| 11n-cls | -1 | 1 GPU; memory: 23/45G | 128 epochs took 20 hours |
| 11n-cls | 640 | 4 GPUs; memory: 35/45G | 80 epochs took 24 hours |
| 11s-cls | 360 | 4 GPUs; memory: 40/45G | 128 epochs took 20 hours |
| 11m-cls | 160 | 4 GPUs; memory: 35/45G | 128 epochs took 12 hours |

Using more GPUs and a bigger batch size did not speed up training, which is counter-intuitive to me. I will look into this further later. Thanks for your help.

Best wishes