Extending YOLO26 for a custom multi-task architecture

Hi,

I would like to confirm my understanding of YOLO26’s architectural flexibility.

My goal is to build a single model that:

  • Detects one class (“car”) using bounding boxes

  • Performs segmentation on different classes (“door”, “window”, “damage”)

From my understanding, the standard YOLO26 segmentation configuration ties segmentation classes to detection classes, which would not support this setup directly.

Therefore, I believe the appropriate approach would be to:

  • Use the YOLO26 backbone + neck as a shared encoder

  • Attach independent task heads:

    • Detection head (car)

    • Segmentation head (parts)

    • Segmentation head (damage)

This would result in a true multi-task architecture with separate label spaces per head.
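To make "separate label spaces per head" concrete, here is a minimal plain-Python sketch of how labels might be routed to the three heads. The names `HeadSpec`, `HEADS`, `route_label`, and `split_annotations` are hypothetical and not part of Ultralytics; the sketch only covers the label bookkeeping, not the heads or losses themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeadSpec:
    """One task head with its own, independent label space."""
    name: str
    task: str                 # "detect" or "segment"
    classes: tuple[str, ...]  # class names local to this head


# Hypothetical layout matching the three heads listed above.
HEADS = (
    HeadSpec("car", "detect", ("car",)),
    HeadSpec("parts", "segment", ("door", "window")),
    HeadSpec("damage", "segment", ("damage",)),
)


def route_label(class_name: str) -> tuple[str, int]:
    """Return (head name, class index local to that head) for a label."""
    for head in HEADS:
        if class_name in head.classes:
            return head.name, head.classes.index(class_name)
    raise KeyError(f"no head handles class {class_name!r}")


def split_annotations(annotations: list[dict]) -> dict[str, list[dict]]:
    """Group one image's annotations into per-head training targets."""
    targets = {head.name: [] for head in HEADS}
    for ann in annotations:
        head_name, local_id = route_label(ann["class"])
        targets[head_name].append({**ann, "class_id": local_id})
    return targets
```

Each head only ever sees class indices from its own list, so "window" is index 1 for the parts head while "damage" is index 0 for the damage head.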

Could you please confirm whether this interpretation is correct, or if there is a simpler way to achieve this using YOLO26 without modifying the architecture?

Thank you so much

You could, but only with significant code changes, since it would break a lot of Ultralytics assumptions.

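An easier option is to label a mask for the whole car as its own class, alongside the door, window and damage masks, and train a standard segmentation model with overlap_mask=False. That keeps overlapping instance masks separate instead of merging them, so you get the masks for the other classes too.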
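A rough sketch of that setup, under some assumptions: the Ultralytics training argument is spelled `overlap_mask`, while the dataset file `car_parts.yaml`, the weights name `yolo26n-seg.pt`, and the epoch count are placeholders you would swap for your own.

```python
# Single label space: the whole-car mask plus the part and damage masks.
CLASS_NAMES = {0: "car", 1: "door", 2: "window", 3: "damage"}

# Arguments for a standard Ultralytics segmentation training run.
TRAIN_ARGS = {
    "data": "car_parts.yaml",  # hypothetical dataset YAML listing CLASS_NAMES
    "overlap_mask": False,     # keep per-instance masks separate during training
    "epochs": 100,             # placeholder value
    "imgsz": 640,
}


def train(weights: str = "yolo26n-seg.pt"):
    """Launch training; requires the `ultralytics` package to be installed.

    The default weights name is an assumption, adjust it to your checkpoint.
    """
    from ultralytics import YOLO  # lazy import so the config loads without it

    model = YOLO(weights)
    return model.train(**TRAIN_ARGS)
```

Calling `train()` starts the run. The box for the car then comes from the "car" instance, and the door, window and damage masks come back as separate instances in the same prediction.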
