Modifying the YOLOv8 architecture for multi-label classification

Hey Ultralytics community!

I’m working on a multi-label image classification project using the YOLOv8 classification model. To the best of my knowledge, the built-in YOLOv8 classifier is designed for single-label tasks, so its architecture needs some modifications. The key modifications are:

  1. Changing the final activation function in the head from softmax to sigmoid.
  2. Changing the loss function from Cross-Entropy to Binary Cross-Entropy (BCE).
  3. Setting the number of output units in the final layer to match my total number of classes.

My main questions are:

  • What is the best way to implement these architectural changes? Should I create a custom .yaml file for the model, and if so, which parameters should I modify?

  • Do I need to override the default train function and implement a custom training loop to use BCE loss? If yes, could you point me to an example or outline the steps?

  • Are there any existing examples or best practices within the community for adapting YOLOv8 for multi-label tasks?

I am open to any suggestions and support. Thanks in advance for your help!

You’re thinking about this in exactly the right way: for multi‑label classification you want independent per-class probabilities (Sigmoid) and a BCE-style loss rather than a Softmax + CrossEntropy setup, as described in the Ultralytics glossary entries on Sigmoid and Softmax.
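To make the difference concrete, here is a minimal pure-Python sketch comparing the two activations on the same logits. The logit values are made up for illustration and do not come from any real model.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(logits):
    # Each logit is squashed independently of the others.
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

# Hypothetical logits for 4 classes; the classes at index 1 and 2 are both strongly present.
logits = [-2.0, 3.0, 3.0, -1.0]

softmax_probs = softmax(logits)
sigmoid_probs = sigmoid(logits)

# Softmax probabilities compete: they sum to 1, so two present classes split the mass.
print([round(p, 3) for p in softmax_probs])
# Sigmoid probabilities are independent: both present classes can be close to 1.
print([round(p, 3) for p in sigmoid_probs])

# Multi-label decision: threshold every class separately (0.5 is an arbitrary choice).
predicted = [i for i, p in enumerate(sigmoid_probs) if p >= 0.5]
print(predicted)
```

With Softmax, neither of the two present classes can exceed 0.5 here, which is why a per-class threshold only makes sense on Sigmoid outputs.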

A few key points specific to YOLOv8 classification:

  1. Final activation / architecture
    In the Ultralytics classifiers, the last layer is a linear layer that outputs logits; there is no separate Softmax module in the model definition. In training mode the head returns raw logits and the CrossEntropy loss handles the Softmax internally, while at inference the head applies Softmax to those logits inside its forward pass. For multi‑label you do not want to add an explicit Sigmoid layer either; you normally keep raw logits and use BCEWithLogitsLoss, which internally combines Sigmoid + BCE. So architecturally you only need to ensure the last linear layer outputs nc units. The simplest way is to copy the yolov8*-cls.yaml, set nc to your number of labels, and train from that.

  2. Loss: CrossEntropy → BCE
    Out of the box, the YOLOv8 classification trainer assumes single‑label targets and uses a CrossEntropy loss. There is currently no YAML flag that switches only the loss to BCE for multi‑label, so you have two realistic options:

  • Fork the repo and modify the classification trainer (the file that defines ClassificationTrainer) to:
    • replace CrossEntropy with torch.nn.BCEWithLogitsLoss
    • change the dataloader/labels to provide multi‑hot targets of shape [B, C] with 0/1 per class
  • Or, treat the YOLOv8 classifier as a regular PyTorch backbone and write a small custom training loop that:
    • loads the model (from ultralytics import YOLO; model = YOLO('yolov8n-cls.yaml').model)
    • ensures the final linear layer has out_features = num_labels
    • uses BCEWithLogitsLoss and a custom dataset that returns multi‑label targets
  3. Examples / best practices
    There isn’t an official Ultralytics example for multi‑label classification with YOLOv8 today; people usually either:
  • fork and tweak the classification trainer as above, or
  • use YOLO as a feature extractor in a plain PyTorch multi‑label pipeline.
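As a rough illustration of the second option, the sketch below runs a plain PyTorch loop with BCEWithLogitsLoss on multi‑hot targets. It is not Ultralytics code: the small `nn.Sequential` network is a stand-in for the classifier, and the data is random placeholder tensors. If you swap in the real YOLO model, check what its forward pass returns in eval mode before applying Sigmoid, since the classification head may already apply Softmax there.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

num_labels = 4  # assumed number of classes

# Stand-in for the classifier (assumption). With Ultralytics installed this could
# instead be the torch module behind YOLO('yolov8n-cls.yaml'), provided its final
# linear layer outputs num_labels logits.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, num_labels),  # raw logits, no Sigmoid layer
)

# Random placeholder data: 16 RGB images (32x32) with multi-hot targets of shape [B, C].
images = torch.rand(16, 3, 32, 32)
targets = torch.randint(0, 2, (16, num_labels)).float()
loader = DataLoader(TensorDataset(images, targets), batch_size=8, shuffle=True)

criterion = nn.BCEWithLogitsLoss()  # applies Sigmoid internally
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

model.train()
for epoch in range(2):
    for batch_images, batch_targets in loader:
        optimizer.zero_grad()
        logits = model(batch_images)  # shape [B, num_labels]
        loss = criterion(logits, batch_targets)
        loss.backward()
        optimizer.step()

# Inference: apply Sigmoid explicitly and threshold each class independently.
model.eval()
with torch.no_grad():
    probs = torch.sigmoid(model(images))
    preds = (probs >= 0.5).int()
```

The 0.5 threshold is only a starting point; per-class thresholds tuned on a validation set are a common refinement.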

If you’re starting fresh, the same strategy applies to the newer Ultralytics YOLO11 classification models as well; they share the same principles around logits, Sigmoid/BCE, and multi‑label heads.

If you share how you’re currently preparing labels (single index vs multi‑hot), I can outline a minimal training loop tailored to that format.

Dear @pderrenger

Thank you so much for your incredibly detailed and helpful response. To answer your question, here are some details about my data format and labels:

I have moved away from the directory-based structure. I am now using a CSV file where each row contains an image_path and a list of multi-hot encoded labels. For example, for a dataset with 4 classes, a row looks like: ["path/to/img.jpg", [0, 1, 1, 0]], indicating that classes 2 and 3 are present in the image.
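For reference, a CSV like this could be parsed roughly as follows. This is only a sketch: the column names `image_path` and `labels`, the quoting of the label list, and the helper `load_samples` are assumptions for illustration, not the actual file layout.

```python
import csv
import io
import json

# Hypothetical CSV content; in practice this would be read from a file on disk.
csv_text = '''image_path,labels
path/to/img.jpg,"[0, 1, 1, 0]"
path/to/other.jpg,"[1, 0, 0, 0]"
'''

def load_samples(handle, num_classes):
    """Return a list of (image_path, multi_hot_list) pairs, validating each row."""
    samples = []
    for row in csv.DictReader(handle):
        labels = json.loads(row["labels"])  # "[0, 1, 1, 0]" -> [0, 1, 1, 0]
        if len(labels) != num_classes:
            raise ValueError(
                f"{row['image_path']}: expected {num_classes} labels, got {len(labels)}"
            )
        if any(v not in (0, 1) for v in labels):
            raise ValueError(f"{row['image_path']}: labels must be 0 or 1")
        # Floats, because BCE-style losses expect floating-point targets.
        samples.append((row["image_path"], [float(v) for v in labels]))
    return samples

samples = load_samples(io.StringIO(csv_text), num_classes=4)
```

Each pair can then back the `__getitem__` of a custom dataset that loads the image and returns the label list as a float tensor.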

I would be immensely grateful if you could provide a sketch of the custom training loop tailored to this data format.

Thank you again for your invaluable time and expertise.

FWIW, you don’t technically need to modify anything with the model if you’re using yolo11-cls because in the results object, you can access top5 and top5conf for the probability of the top-5 classification results. See the docs here:
