YOLO26-seg with P2: available? Possible?

Hello!

I’ve run into a challenge: I’d like to segment some objects in a small image, around 200×200 px, and use this as a higher-accuracy segmentation pass. That is: I segment a large image with default settings, and if the confidence score on a segmented area is low, I make a cutout of that segment and run a “specialist” routine to confirm whether it is the object of interest or not.

I read up on modifying the YAML file to accomplish this, but everything I try ends in a ”cannot multiply mat1 and mat2” error with sizes like 5x1554 and 6124x192 listed (numbers from memory, but you get the idea). I’ve also tried fumbling through it with some coding agents to see whether I’m missing something obvious, without any success other than a change in those mat1/mat2 dimensions.

I want to remove P5 and add P2 to the detection head so that I’m using only P2, P3, and P4, or if necessary P2, P3, P4, and P5 (as the detection model already has a YAML option for).

I’m beginning to think this isn’t possible without modifying Segment26 itself - and I don’t want to go down that path. Any suggestions here? Fingers crossed I’m missing something obvious.

P2 adds a stride that makes the masks twice as large as the hardcoded assumption in Ultralytics, which is why it doesn’t work. You need to modify the Ultralytics code to change that hardcoded value.

Like in this PR:

The PR still doesn’t have a fix because there’s no easy way around it. The default hardcoded downsample value is 4, but it needs to be 2 for a P2 model. If you want a quick way to make P2 work, you can change the lines in the diff to 2, but of course that would break non-P2 models.
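To make the mismatch concrete, here is a toy calculation (my own illustration; the function name and the matmul framing are simplifications of the real mask-assembly code, which multiplies per-instance coefficients against the flattened prototype map):

```python
# Toy arithmetic (illustrative only, not actual Ultralytics internals):
# YOLO-seg assembles instance masks roughly as coeff @ proto.view(C, H*W),
# where the proto map is assumed to sit at a hardcoded downsample of 4
# relative to the input image.

def proto_flat_size(imgsz: int, downsample: int) -> int:
    """Flattened spatial size (H*W) of the mask prototype map."""
    side = imgsz // downsample
    return side * side

imgsz = 640

expected = proto_flat_size(imgsz, 4)   # what the mask pipeline assumes: 160*160
p2_actual = proto_flat_size(imgsz, 2)  # what a P2-first head produces: 320*320

print(expected, p2_actual)  # 25600 vs 102400: the inner matmul dims disagree,
                            # hence "cannot multiply mat1 and mat2"
```

That factor-of-4 difference in the flattened spatial size is exactly why the reported mat1/mat2 shapes look unrelated at first glance.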

What you describe sounds like it could be done with a simple conditional check during inference. If the first pass detects objects but one or more are below a specified threshold, or if no objects are detected at all, it would trigger the conditional path. On that path you’d call a function with the image and the center points of any low-confidence detections (an empty list/array otherwise) to do a second inference pass. The function could then either reuse the same model or load a new one, slice the image into the necessary tiles, and perform inference on each tile.

This wouldn’t require any modification to the model and should be relatively straightforward to implement. It gives you a “fast” path when all detections are above a given threshold (likely the common route) and a “slow” path that runs a second inference on the tiles. The slow path could even use SAHI for tiled inference with YOLO, which would add latency but increase the overall accuracy of the second pass.
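A minimal sketch of that fast/slow flow, with stubbed callables standing in for the actual Ultralytics predict calls (the names, the dict-based detection format, and the threshold are all hypothetical placeholders):

```python
# Sketch of the two-pass flow. `run_model` and `run_specialist` are
# hypothetical callables wrapping your actual inference; detections are
# plain dicts here for illustration.

CONF_OK = 0.6  # hypothetical threshold below which a detection is rechecked

def tile_around(img_w, img_h, center, size=256):
    """Clamp a size x size crop window around a detection center
    (assumes the image is at least `size` px on each side)."""
    cx, cy = center
    x0 = max(0, min(img_w - size, cx - size // 2))
    y0 = max(0, min(img_h - size, cy - size // 2))
    return (x0, y0, x0 + size, y0 + size)

def two_pass(image, img_w, img_h, run_model, run_specialist):
    dets = run_model(image)                           # first full-image pass
    low = [d for d in dets if d["conf"] < CONF_OK]
    if dets and not low:
        return dets                                   # fast path: all confident
    kept = [d for d in dets if d["conf"] >= CONF_OK]
    for d in low:                                     # slow path: per-tile recheck
        box = tile_around(img_w, img_h, d["center"])
        kept.extend(run_specialist(image, box))
    return kept
```

The slow branch is also where a SAHI-based tiled pass could slot in instead of the per-tile loop.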

Indeed, I happened to hear of SAHI just yesterday, and I’ll see how well it works.
The issue I’m having is that I haven’t been able to train a “specialized” model that’s better than the model running on the full image. For example, I’ve tried training a separate model using cutouts of the objects as training data, with various parameters, without any luck. But with small objects (~220 px), P4 and P5 are kind of irrelevant because the image is divided up too much. That’s why I was hoping to get a P2 head in there.

Would you have suggestions or ideas on how to pursue this “specialized” model (other than SAHI, obviously)?

It looks like the PR is addressing auto-detecting the stride length rather than replacing it with 2. Am I missing something? Wouldn’t this change work?

It’s incomplete.

However, I found a different workaround. You can just swap the layers:

Is this transferable to YOLO26? I have to be honest, I don’t follow. Could I ask you to elaborate on the issue and how it can be resolved?

Oh… looks like this is functioning:

That’s kind of neat :slight_smile:

# Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com

# Ultralytics YOLO26-seg instance segmentation model with P2/4 - P5/32 outputs
# Model docs: https://docs.ultralytics.com
# Task docs: https://docs.ultralytics.com

# Parameters
nc: 80 # number of classes
end2end: True # whether to use end-to-end mode
reg_max: 1 # DFL bins
scales: 
  # [depth, width, max_channels]
  n: [0.50, 0.25, 1024] 
  s: [0.50, 0.50, 1024] 
  m: [0.50, 1.00, 512] 
  l: [1.00, 1.00, 512] 
  x: [1.00, 1.50, 512] 

# YOLO26n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 2, C3k2, [256, False, 0.25]] # 2
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 2, C3k2, [512, False, 0.25]] # 4
  - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
  - [-1, 2, C3k2, [512, True]] # 6
  - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
  - [-1, 2, C3k2, [1024, True]] # 8
  - [-1, 1, SPPF, [1024, 5, 3, True]] # 9
  - [-1, 2, C2PSA, [1024]] # 10

# YOLO26n head
head:
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]] # 11
  - [[-1, 6], 1, Concat, [1]] # 12 cat backbone P4
  - [-1, 2, C3k2, [512, True]] # 13

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]] # 14
  - [[-1, 4], 1, Concat, [1]] # 15 cat backbone P3
  - [-1, 2, C3k2, [256, True]] # 16 (P3/8-small)

  # P2 addition
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]] # 17
  - [[-1, 2], 1, Concat, [1]] # 18 cat backbone P2
  - [-1, 2, C3k2, [128, True]] # 19 (P2/4-xsmall)

  - [-1, 1, Conv, [128, 3, 2]] # 20
  - [[-1, 16], 1, Concat, [1]] # 21 cat head P3
  - [-1, 2, C3k2, [256, True]] # 22 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]] # 23
  - [[-1, 13], 1, Concat, [1]] # 24 cat head P4
  - [-1, 2, C3k2, [512, True]] # 25 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]] # 26
  - [[-1, 10], 1, Concat, [1]] # 27 cat head P5
  - [-1, 1, C3k2, [1024, True, 0.5, True]] # 28 (P5/32-large)

  # Segment26 layer with required index ordering: P3, P2, P4, P5
  - [[22, 19, 25, 28], 1, Segment26, [nc, 32, 256]]


Yep, that works, and the key detail is exactly what you noted: the input order to Segment26 matters.

Seg models assume the mask prototypes come from a feature map at a specific downsample ratio (historically the “P3-first” layout), and parts of the mask pipeline and validation implicitly expect that ratio. If you pass P2 (stride 4) as the first input, the proto/mask scaling changes and you hit the classic “mat1 and mat2 shapes cannot be multiplied” error during mask IoU. Keeping P3 as the first input (as in your [[22, 19, 25, 28], 1, Segment26, ...], i.e. P3, P2, P4, P5) preserves the expected proto scaling while still letting you benefit from P2 features. The Segment26 head wiring is in the head module reference if you want to trace where the proto is produced.
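As a rough rule of thumb (my simplification; trace the head module source for the exact wiring), the proto branch consumes the first head input and upsamples it once, so the mask stride ends up at roughly half the first input’s stride:

```python
# Simplified model of proto scaling (illustrative only): the proto branch
# takes the FIRST Segment input and upsamples it 2x before producing masks.

def mask_stride(first_input_stride: int, proto_upsample: int = 2) -> int:
    return first_input_stride // proto_upsample

print(mask_stride(8))  # P3 first -> masks at stride 4 (the assumed default)
print(mask_stride(4))  # P2 first -> masks at stride 2, i.e. 2x larger masks
```

Which is why simply reordering the head inputs so P3 comes first sidesteps the hardcoded assumption without touching the library code.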

For your ~200×200 images, you’ll generally get more stable results if you train with an imgsz that’s a clean multiple of the stride (e.g. imgsz=256) rather than 200. If you share your train command and whether you’re fine-tuning from yolo26n-seg.pt, I can suggest the cleanest way to transfer weights into this modified P2 head.
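If it helps as a starting point, here is a hedged command fragment using the yolo CLI from the ultralytics package (yolo26-seg-p2.yaml and your-data.yaml are placeholder filenames for the modified config and your dataset config; adjust to your setup):

```shell
# placeholders: yolo26-seg-p2.yaml = the modified P2 config,
# your-data.yaml = your dataset config
yolo segment train model=yolo26-seg-p2.yaml pretrained=yolo26n-seg.pt \
     data=your-data.yaml imgsz=256 epochs=100
```

Passing `pretrained=` a checkpoint path transfers the weights of layers whose shapes still match, which covers most of the backbone here since only the head wiring changed.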