Modifying YOLOv11 architecture to add custom modules

I’m interested in modifying the YOLOv11 architecture to experiment with new modules in the backbone and neck while keeping the rest of the framework (training, evaluation, export) intact.

Any tips or examples for safely extending the YOLOv11 structure would be greatly appreciated.

Thanks!
Vaibhav Panchal

Great question, Vaibhav! You can extend YOLO11 cleanly via a custom model YAML and (optionally) a small module registration, keeping train/val/export unchanged.

Quick path

  • Define your blocks, register them once, then reference by name in YAML. The end-to-end flow is unchanged.

Minimal example

  1. Add a custom block (dev install recommended):
# ultralytics/nn/modules/block.py
import torch.nn as nn
from ultralytics.nn.modules.conv import Conv

class MyBlock(nn.Module):
    def __init__(self, c1, c2):
        super().__init__()
        self.m = nn.Sequential(Conv(c1, c2, 3, 1), Conv(c2, c2, 3, 1))
    def forward(self, x):
        return self.m(x)

Expose it:

# ultralytics/nn/modules/__init__.py
from .block import MyBlock

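(MyBlock takes c1 and c2, but the YAML below supplies only [128], so parse_model() in ultralytics/nn/tasks.py has to prepend c1 from ch[f]. It does that only for modules it already knows about, so import MyBlock in tasks.py and add it to the set of modules handled that way; otherwise the block is built as MyBlock(128) and fails on the missing c2.)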

  2. Use it in YAML:
# custom11.yaml
nc: 80
backbone:
  - [-1, 1, Conv, [64, 3, 2]]
  - [-1, 1, MyBlock, [128]]
  - [-1, 3, C2f, [128, True]]
head:
  - [[2], 1, Detect, [nc]]

  3. Train/validate/export as usual:
from ultralytics import YOLO
m = YOLO("custom11.yaml")
m.info()  # check non-zero FLOPs
m.train(data="coco8.yaml", epochs=50)
m.export(format="onnx")
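
To make the c1 injection from step 1 concrete, here is a rough pure-Python sketch of the argument bookkeeping. It is a simplified stand-in, not the actual parse_model() code: the names INJECT_C1 and resolve_args are hypothetical, and the real function handles more cases (nc, repeats, heads).

```python
import math

# Hypothetical stand-in for the modules that get (c1, c2, ...) built for them;
# the real list lives in ultralytics/nn/tasks.py.
INJECT_C1 = {"Conv", "C2f", "MyBlock"}


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def resolve_args(module, args, c_in, width=1.0, max_channels=float("inf")):
    """Return (constructor_args, c_out) for one YAML row (simplified)."""
    if module in INJECT_C1:
        # The first YAML arg is the requested output width, scaled by width_multiple.
        c2 = make_divisible(min(args[0], max_channels) * width)
        return [c_in, c2, *args[1:]], c2
    # Unregistered modules receive their YAML args untouched; channels pass through.
    return list(args), c_in


print(resolve_args("MyBlock", [128], c_in=64))             # ([64, 128], 128)
print(resolve_args("MyBlock", [128], c_in=32, width=0.5))  # ([32, 64], 64)
print(resolve_args("Other", [128], c_in=64))               # ([128], 64)
```

The last call shows why an unregistered block written as MyBlock(c1, c2) breaks: it would receive a single argument.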

Tips

  • Start simple and add blocks incrementally; check channels at each concat/skip.
  • You can drop in TorchVision backbones via the TorchVision and Index modules without writing code; the YAML patterns in the docs show how to select multi-scale features.
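
As a way to act on the first tip before building anything, a small tracer like the one below can flag Concat inputs whose resolutions disagree. This is only a sketch under simplifying assumptions: the row format (from, kind, c_out, scale) and the trace function are made up for illustration and are not the Ultralytics YAML schema.

```python
def trace(layers, c_in=3):
    """Track (channels, stride) per layer for a tiny, simplified layer spec.

    Each row is (from, kind, c_out, scale), where scale is the stride factor:
    2 for a stride-2 conv, 0.5 for a 2x upsample, 1 otherwise.
    """
    out = []  # (channels, stride) for each layer index
    for i, (f, kind, c_out, scale) in enumerate(layers):
        srcs = f if isinstance(f, list) else [f]
        feats = []
        for s in srcs:
            # Negative indices are relative to the current layer.
            idx = i + s if s < 0 else s
            # An index before layer 0 means the input image (stride 1).
            feats.append(out[idx] if idx >= 0 else (c_in, 1))
        if kind == "Concat":
            strides = {s for _, s in feats}
            if len(strides) != 1:
                raise ValueError(
                    f"layer {i}: Concat inputs have strides {sorted(strides)}"
                )
            out.append((sum(c for c, _ in feats), feats[0][1]))
        else:
            out.append((c_out, feats[0][1] * scale))
    return out


layers = [
    (-1, "Conv", 32, 2),           # 0: stride 2
    (-1, "Conv", 64, 2),           # 1: stride 4
    (-1, "Upsample", 64, 0.5),     # 2: back to stride 2
    ([-1, 0], "Concat", None, 1),  # 3: 64 + 32 channels at stride 2
]
print(trace(layers)[-1])
```

Pointing the Concat at layer 1 instead of layer 0 raises a ValueError here, because an upsampled stride-2 map cannot be concatenated with a stride-4 map.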

Helpful references

  • The step-by-step process (YAML structure, module resolution, custom module hooks) is summarized in the Model YAML Configuration guide: see the sections on layer format, module resolution, and Custom Module Integration.
  • If you need deeper control over the training loop, see the Advanced customization guide, which shows how to override trainers while keeping the YOLO API.

If you share a small YAML + block snippet, I’m happy to sanity-check channels and the parse_model() args.

Thank you so much for your support and guidance!
I really appreciate the help and valuable feedback you’ve given me throughout the YOLO development process.
Your time and expertise made a huge difference — truly grateful for it!

Small test YAML for YOLO block sanity check

# Parameters
nc: 15
depth_multiple: 0.33
width_multiple: 0.50

# Backbone
backbone:
  - [-1, 1, Conv, [64, 3, 2]] # 0
  - [-1, 1, Conv, [128, 3, 2]] # 1
  - [-1, 3, C3k2_RE, [128, True]] # 2 custom RC block

# Head
head:
  - [-1, 1, CARAFE, [128, 3, 5, 2]] # 3 upsample block
  - [[-1, 1], 1, Concat, [1]] # 4 concat example
  - [-1, 1, Conv, [128, 3, 1]] # 5 final conv

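The block code:

import torch.nn as nn
from ultralytics.nn.modules.conv import Conv

class C3k2_RE(nn.Module):
    def __init__(self, c1, c2, shortcut=True):
        super().__init__()
        self.cv1 = Conv(c1, c2, 1, 1)  # 1x1 conv: c1 -> c2
        self.cv2 = Conv(c2, c2, 3, 1)  # 3x3 conv: c2 -> c2
        # The residual add is only valid when input and output channels match
        self.add = shortcut and c1 == c2

    def forward(self, x):
        y = self.cv2(self.cv1(x))
        return x + y if self.add else y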

I would appreciate it if you could sanity-check the channel flow and how it is parsed by parse_model().

Thanks a lot for your help!