Hi Everyone,
I’m trying to add a fourth detection head to the YOLO11n model so that it processes a high-resolution feature map (P2) and can detect very small objects. To do this, I added an extended feature map to the neck and a new head to process it. I tried this two ways: implementing the changes directly in Python, and adding them to the yolo11.yaml file.
Please find the implementation steps below.
- Extended the neck by adding an extra upsample module.
- Added a new head module to process the P2 feature map; it consists of Conv layers, a C3K module, and a Detect module for predictions.
- Modified the forward pass to include the new head.
- Initialized the custom model and loaded it with pretrained weights for training.
Finally, when I try to load the model, I get the following error, even though extra_upsample is defined in the code:
AttributeError: 'CustomYOLO11n' object has no attribute 'extra_upsample'
I tried everything I could think of, but no luck. It seems that the DetectionModel class in YOLO11 builds the model dynamically from the YAML configuration,
and I don’t understand how to register the extra_upsample and p2_head modules in the model architecture.
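One possible cause of this kind of AttributeError, offered as an assumption because the subclass code is not shown: if the base class runs a probe forward pass inside its own constructor (DetectionModel appears to do this when it computes strides), an overridden forward method is called before the subclass has assigned its extra layers. The sketch below reproduces that ordering problem in plain Python. `BaseModel`, `EagerCustom`, `GuardedCustom`, and `try_build` are hypothetical stand-ins, not Ultralytics classes.

```python
class BaseModel:
    """Hypothetical stand-in for a base class whose constructor calls forward()."""

    def __init__(self):
        # Probe pass during construction, e.g. to measure output sizes.
        self.probe = self.forward(8)

    def forward(self, x):
        return x


class EagerCustom(BaseModel):
    """Fails: forward() needs extra_upsample before __init__ has assigned it."""

    def __init__(self):
        super().__init__()                     # the probe pass runs in here
        self.extra_upsample = lambda x: x * 2  # assigned too late for the probe

    def forward(self, x):
        return self.extra_upsample(x)


class GuardedCustom(EagerCustom):
    """Works: forward() uses the base path until the extra layer exists."""

    def forward(self, x):
        if not hasattr(self, "extra_upsample"):
            return BaseModel.forward(self, x)
        return self.extra_upsample(x)


def try_build(cls):
    """Return the constructed object, or the AttributeError message on failure."""
    try:
        return cls()
    except AttributeError as err:
        return str(err)


eager_result = try_build(EagerCustom)
guarded = try_build(GuardedCustom)
print(eager_result)  # 'EagerCustom' object has no attribute 'extra_upsample'
print(guarded.probe, guarded.forward(8))  # 8 16
```

Note that the guard only avoids the crash: the probe pass then skips the new layer, so anything the base class derives from that pass would not reflect the extra head. Declaring the extra layers in the YAML, so that the model builder creates them itself, avoids the ordering issue altogether.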
I then took a different approach: instead of subclassing DetectionModel, I modified the YAML file to add a fourth head and loaded the model from the modified YAML, but still no luck. Now I get “RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 16 but got size 64 for tensor number 1 in the list.” The YAML is included below.
I’m doing this experiment for my project, an Advanced Driver Monitoring System, where I need to add 4 new heads and modify the neck for multitask learning.
# Ultralytics YOLO11 object detection model with P3/8 - P5/32

# Parameters
nc: 80 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolo11n.yaml' will call yolo11.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.50, 0.25, 1024] # summary: 181 layers, 2624080 parameters, 2624064 gradients, 6.6 GFLOPs

# YOLO11n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 2, C3k2, [256, False, 0.25]]
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 2, C3k2, [512, False, 0.25]]
  - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
  - [-1, 2, C3k2, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
  - [-1, 2, C3k2, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]] # 9
  - [-1, 2, C2PSA, [1024]] # 10

# YOLO11n head
head:
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 6], 1, Concat, [1]] # cat backbone P4
  - [-1, 2, C3k2, [512, False]] # 13

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 4], 1, Concat, [1]] # cat backbone P3
  - [-1, 2, C3k2, [256, False]] # 16 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 13], 1, Concat, [1]] # cat head P4
  - [-1, 2, C3k2, [512, False]] # 19 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 10], 1, Concat, [1]] # cat head P5
  - [-1, 2, C3k2, [1024, True]] # 22 (P5/32-large)

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]] # 23 added
  - [[-1, 2], 1, Concat, [1]] # cat backbone P2
  - [-1, 3, C3k2, [128, False]] # 25 (P3/8-very small)
  - [-1, 1, Conv, [128, 3, 2]] # New Conv for P2

  - [[16, 19, 22], 1, Detect, [nc]] # Detect(P3, P4, P5)
  - [[23, 26, 27], 1, Detect, [nc]] # Detect(P2, P3, P4, P5)
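The size mismatch in the RuntimeError can be reproduced by hand by tracing the spatial size of each layer in this YAML. The sketch below does that in plain Python, without Ultralytics or PyTorch. It assumes a 256×256 probe input, which is consistent with the numbers in the error (256 / 32 × 2 = 16 and 256 / 4 = 64). The `LAYERS` table and the `trace` helper are illustrative names, and the `variant` at the end is only a hypothetical rewiring to show the arithmetic, not a verified fix.

```python
PROBE = 256  # assumed probe size; 256 / 32 * 2 = 16 and 256 / 4 = 64 match the error

# (from, module) for every layer of the posted YAML, in order (indices 0..28).
LAYERS = [
    (-1, "Conv"), (-1, "Conv"), (-1, "C3k2"), (-1, "Conv"), (-1, "C3k2"),
    (-1, "Conv"), (-1, "C3k2"), (-1, "Conv"), (-1, "C3k2"), (-1, "SPPF"), (-1, "C2PSA"),
    (-1, "Upsample"), ([-1, 6], "Concat"), (-1, "C3k2"),
    (-1, "Upsample"), ([-1, 4], "Concat"), (-1, "C3k2"),
    (-1, "Conv"), ([-1, 13], "Concat"), (-1, "C3k2"),
    (-1, "Conv"), ([-1, 10], "Concat"), (-1, "C3k2"),
    (-1, "Upsample"), ([-1, 2], "Concat"), (-1, "C3k2"), (-1, "Conv"),
    ([16, 19, 22], "Detect"), ([23, 26, 27], "Detect"),
]


def trace(layers, size=PROBE):
    """Return (sizes, mismatch); mismatch is (layer index, input sizes) or None."""
    sizes = []
    for i, (src, module) in enumerate(layers):
        srcs = [src] if isinstance(src, int) else src
        # -1 means "previous layer" (the input image for layer 0).
        ins = [(size if i == 0 else sizes[i - 1]) if s == -1 else sizes[s] for s in srcs]
        if module == "Concat" and len(set(ins)) > 1:
            return sizes, (i, ins)  # stop at the first failing Concat
        if module == "Conv":        # every Conv in this YAML has stride 2
            sizes.append(ins[0] // 2)
        elif module == "Upsample":  # scale factor 2
            sizes.append(ins[0] * 2)
        elif module == "Detect":    # multi-scale output, no single size
            sizes.append(None)
        else:                       # C3k2, SPPF, C2PSA, Concat keep the size
            sizes.append(ins[0])
    return sizes, None


sizes, mismatch = trace(LAYERS)
print(mismatch)  # (24, [16, 64])

detect_layers = [i for i, (_, m) in enumerate(LAYERS) if m == "Detect"]
print(detect_layers)  # [27, 28]

# Hypothetical variant: feed the added Upsample from layer 16 (P3/8), not layer 22.
variant = list(LAYERS)
variant[23] = (16, "Upsample")
variant_sizes, variant_mismatch = trace(variant[:26])
print(variant_sizes[23:26], variant_mismatch)  # [64, 64, 64] None
```

Under these assumptions, the failing layer is the Concat at index 24: layer 23 upsamples the P5/32 map from 8 to 16, while backbone layer 2 (P2/4) is 64 wide. The trace also shows that the two Detect entries land at indices 27 and 28, so the input list [23, 26, 27] of the second one points at the first Detect layer rather than at a feature map. The Detect inputs would likely need revisiting as well.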