If I want to add custom attention blocks to a YOLO model for object detection, which model is better to experiment with: YOLOv11 or YOLOv12?
YOLO12 already uses attention
Thanks @Toxite. I'll try the YOLOv11 model then.
Sounds good — I’d also start from Ultralytics YOLO11 as the stable baseline and then add/ablate your custom attention blocks on top. YOLO12 is already attention-centric, but it’s a community model and can be less predictable to iterate on (training stability/memory/CPU throughput) as noted in the YOLO12 docs.
If you want to modify the architecture, the usual workflow is to copy a YOLO11 model YAML from ultralytics/cfg/models/11/, add your custom module, and train from that YAML:
```python
from ultralytics import YOLO

# Build the model from your edited architecture YAML and train it
model = YOLO("path/to/your_yolo11_custom.yaml")
model.train(data="coco8.yaml", imgsz=640, epochs=100)
```
If you share what attention block you’re adding (CBAM/SE/ECA/Transformer-style, etc.) and where you want to insert it (backbone vs neck), I can suggest the cleanest place in the YOLO11 graph to start.
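To make the idea concrete, here is a minimal squeeze-and-excitation (SE) channel-attention block in plain PyTorch, the simplest of the families mentioned above. This is a generic sketch, not an Ultralytics API: `SEBlock` is a hypothetical name, and to use it in a model YAML you would still need to register the class with Ultralytics' module parser.

```python
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Squeeze-and-excitation channel attention (hypothetical custom module)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: global spatial average
        self.fc = nn.Sequential(              # excitation: per-channel gates
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # reweight each channel; shape is unchanged


x = torch.randn(2, 64, 32, 32)
print(SEBlock(64)(x).shape)  # same shape in, same shape out
```

Because the block preserves tensor shape, it can be dropped between existing backbone stages without touching the downstream channel counts in the YAML.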
Thanks @pderrenger. I would like to add neighborhood attention and deformable attention in the backbone for enhanced performance on UAV images.
@pderrenger, can you suggest where I can insert these attention blocks in YOLO11?
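As a rough starting point while waiting for a reply: true neighborhood attention (as implemented in the NATTEN library) attends over a sliding window centered on each pixel, which needs specialized kernels. A much-simplified stand-in that is easy to prototype with is non-overlapping window self-attention. The sketch below is a hypothetical `WindowSelfAttention` module in plain PyTorch, not an Ultralytics or NATTEN API; it assumes feature-map height and width are divisible by the window size, as is typical at backbone stride levels.

```python
import torch
import torch.nn as nn


class WindowSelfAttention(nn.Module):
    """Self-attention within non-overlapping windows (hypothetical sketch).

    A cheap approximation of neighborhood attention: each pixel attends only
    to the other pixels in its local window instead of the whole feature map.
    """

    def __init__(self, dim: int, window: int = 8, heads: int = 4):
        super().__init__()
        self.window = window
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) with H and W divisible by the window size
        b, c, h, w = x.shape
        ws = self.window
        # Partition the map into (B * num_windows, ws*ws, C) token sequences
        t = x.view(b, c, h // ws, ws, w // ws, ws)
        t = t.permute(0, 2, 4, 3, 5, 1).reshape(-1, ws * ws, c)
        out, _ = self.attn(t, t, t)
        # Reassemble windows back into the (B, C, H, W) layout
        out = out.reshape(b, h // ws, w // ws, ws, ws, c)
        out = out.permute(0, 5, 1, 3, 2, 4).reshape(b, c, h, w)
        return x + out  # residual, so the block is shape-preserving


x = torch.randn(1, 64, 32, 32)
print(WindowSelfAttention(64)(x).shape)  # unchanged: (1, 64, 32, 32)
```

Since it preserves shape, this kind of block could be trialed after the deeper backbone stages (where maps are small and attention is cheap) before investing in the full NATTEN or deformable-attention implementations.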