Hi,
I would like to confirm my understanding of YOLO26’s architectural flexibility.
My goal is to build a single model that:
- Detects one class (“car”) using bounding boxes
- Performs segmentation on different classes (“door”, “window”, “damage”)
From my understanding, the standard YOLO26 segmentation configuration ties segmentation classes to detection classes, which would not support this setup directly.
Therefore, I believe the appropriate approach would be to:
- Use the YOLO26 backbone + neck as a shared encoder
- Attach independent task heads:
  - Detection head (car)
  - Segmentation head (parts)
  - Segmentation head (damage)
- This would result in a true multi-task architecture with separate label spaces per head.
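
To make "separate label spaces per head" concrete, here is a minimal sketch in plain Python. It is not YOLO26 or Ultralytics code: the `HeadSpec` class, the head names, the `route_annotations` helper, and the annotation dict layout are all hypothetical and only illustrate the idea. Each head owns its own class list, and each annotation is routed to the head that owns its label, with a class index local to that head.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeadSpec:
    """Hypothetical description of one task head and its own label space."""
    name: str
    task: str        # "detect" or "segment"
    classes: tuple   # class names owned by this head only


# Assumed layout matching the three heads listed above.
HEADS = (
    HeadSpec("car_det", "detect", ("car",)),
    HeadSpec("parts_seg", "segment", ("door", "window")),
    HeadSpec("damage_seg", "segment", ("damage",)),
)


def route_annotations(annotations, heads=HEADS):
    """Split mixed annotations into per-head targets with head-local class ids."""
    owner = {}
    for head in heads:
        for idx, cls in enumerate(head.classes):
            if cls in owner:
                raise ValueError(f"class {cls!r} is claimed by more than one head")
            owner[cls] = (head, idx)

    targets = {head.name: [] for head in heads}
    for ann in annotations:
        head, idx = owner[ann["label"]]
        # Detection heads consume boxes, segmentation heads consume polygons.
        key = "box" if head.task == "detect" else "polygon"
        targets[head.name].append({"cls": idx, key: ann[key]})
    return targets


# Made-up annotations for a single image (normalized coordinates).
sample = [
    {"label": "car", "box": (0.5, 0.5, 0.8, 0.6)},
    {"label": "window", "polygon": [(0.4, 0.3), (0.6, 0.3), (0.6, 0.4)]},
    {"label": "damage", "polygon": [(0.2, 0.6), (0.3, 0.6), (0.3, 0.7)]},
]
targets = route_annotations(sample)
```

With this layout, "window" is class 1 inside the parts head and "damage" is class 0 inside the damage head, so the class indices of one head never collide with those of another. Each head could then compute its own loss on its own targets.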
Could you please confirm whether this interpretation is correct, or whether there is a simpler way to achieve this with YOLO26 without modifying the architecture?
Thank you so much