YOLO architecture

hallian179 · February 10, 2025, 3:25pm

Hi, I want to do object detection using YOLO. I am confused about when to use the YAML file and when to use the .pt file. Could you please answer the following questions?

1. I have changed the architecture of YOLO11 and want to compare it with the unmodified version. I know I have to build the modified model from its YAML file, but for the comparison should I also build the unmodified model from its YAML file, or use the .pt file instead?
2. If I use the .pt file and compare it with the unmodified YAML file, the results are drastically different. Which is the correct way to gauge the effect of the architecture change?

BurhanQ · February 10, 2025, 3:28pm

1. You can use the YAML file for the comparison when you’re training both models on the same data from scratch.
2. This is really the same question as the one above.

You will likely need to train your custom model from scratch since you’re using a custom YAML file. This means you can just load the default YOLO11 model also using the YAML config.

from ultralytics import YOLO

custom = YOLO("custom_model.yaml", task="detect")
standard = YOLO("yolo11.yaml")

custom_result = custom.train(data="custom_dataset.yaml")
standard_result = standard.train(data="custom_dataset.yaml")

hallian179 · February 10, 2025, 4:14pm

So for the comparison I should also build the model from the YAML file.
But theoretically, on a custom dataset, shouldn’t building the model from the YAML file and using the .pt file give the same results if the architecture is unmodified? If that’s true, why am I getting drastically different results?

BurhanQ · February 10, 2025, 4:24pm

A custom model would be the same after training, whether it is loaded from the .pt file or built from the YAML file. When you build your custom model from the YAML, the weights are randomly initialized and the model is untrained; when you load it from the .pt file (assuming it was trained), it has weights that were updated on the training data. The same goes for the standard YOLO11 models.

hallian179 · February 12, 2025, 7:33am

Thanks a lot. But here is what I am facing. This is the code snippet based on the YAML file.

As per your statement above that a custom model would be the same after training for either the .pt file or the YAML file, both runs should have the same results. But here is what I am getting. These are validation results.

The run with the YAML file has the following results:
YOLO11 summary (fused): 238 layers, 2,582,737 parameters, 0 gradients, 6.3 GFLOPs
Class   Images  Instances  Box(P      R  mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00, 1.50it/s]
all         20         65  0.801  0.415  0.457     0.257
head         3         18  0.732  0.611  0.687      0.41
helmet      17         45  0.671  0.635  0.649     0.345
person       1          2      1      0  0.037    0.0148

Also the auto-optimizer settings are as follows.
yaml run:
optimizer: AdamW(lr=0.001429, momentum=0.9) with parameter groups 81 weight(decay=0.0), 88 weight(decay=0.0005), 87 bias(decay=0.0)

hallian179 · February 12, 2025, 7:36am

Below is the code snippet for the run with the .pt file.

The validation results are:

YOLO11m summary (fused): 303 layers, 20,032,345 parameters, 0 gradients, 67.7 GFLOPs
Class   Images  Instances  Box(P      R  mAP50  mAP50-95): 100%|██████████| 1/1 [00:09<00:00, 9.38s/it]
all         20         65  0.898   0.51  0.805     0.476
head         3         18  0.761  0.709  0.788     0.445
helmet      17         45  0.933  0.822  0.879     0.533
person       1          2      1      0   0.75      0.45

And the auto-optimizer settings are as follows
optimizer: AdamW(lr=0.001429, momentum=0.9) with parameter groups 106 weight(decay=0.0), 113 weight(decay=0.0005), 112 bias(decay=0.0)

Can you please explain why there is this difference?
Also, which is the best way to run?
Which method should I use to evaluate the architecture modification?

Toxite · February 12, 2025, 10:04am

Posting screenshots of code instead of pasting the code makes everything hard to read.

Just post the code directly with proper formatting.

And the reason for the performance discrepancy is that you’re not specifying the scale when loading the YAML. It should be yolo11m.yaml. There’s also a warning telling you that the scale wasn’t passed and that it will default to n.

Also you don’t need to pass the full path to the yolo11 yaml. Just pass yolo11m.yaml. Ultralytics will automatically use the stock yaml.
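
The default-scale behaviour can be illustrated with a small stand-alone sketch. `model_scale` is a hypothetical helper written for illustration, not the Ultralytics implementation; it only assumes that the scale is the trailing letter (n, s, m, l or x) of a stock file name and that a missing letter falls back to n, as the warning describes.

```python
import re
from pathlib import Path


def model_scale(model_yaml: str, default: str = "n") -> str:
    """Return the scale letter encoded in a YAML file name, or `default` if none is present."""
    stem = Path(model_yaml).stem  # "yolo11m.yaml" -> "yolo11m"
    match = re.fullmatch(r"yolov?\d+([nslmx])?", stem)
    if match is None or match.group(1) is None:
        return default  # no scale in the name, so the smallest model is assumed
    return match.group(1)


print(model_scale("yolo11.yaml"))   # no scale letter in the name
print(model_scale("yolo11m.yaml"))  # scale letter "m" in the name
```

Under these assumptions, `yolo11.yaml` resolves to the n scale while `yolo11m.pt` is the m scale, which is consistent with the very different parameter counts in the two validation summaries above.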

hallian179 · February 12, 2025, 11:41am

Noted with thanks. Here is the code.
The first script uses the YAML file.

import os

if __name__ == "__main__":
    # 1. Optional memory config for CUDA
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

    import torch
    from ultralytics import YOLO
    from ultralytics.nn.tasks import yaml_model_load

    # ----------------------------------------------------------------------
    # USER-CONFIGURABLE PATHS
    # ----------------------------------------------------------------------
    # Path to your custom dataset YAML
    data_yaml_path = r"D:/ultralytics/codes/Hard Hat/data.yaml"

    # Path to your YOLOv11 YAML
    model_yaml_path = r"D:/Python New/Python312/Lib/site-packages/ultralytics/cfg/models/11/yolo11.yaml"

    # Optional .pt weights
    pretrained_weights = r"D:/Kai work/data_scene_flow/python files/yolo11m.pt"

    # ----------------------------------------------------------------------
    # CHECK GPU / CPU
    # ----------------------------------------------------------------------
    device = "cuda" if torch.cuda.is_available() else "cpu"
    print(f"Using device: {device}\n")

    # ----------------------------------------------------------------------
    # DEBUG: Print out the custom yolo11.yaml
    # ----------------------------------------------------------------------
    print("Loading the model YAML for debugging...")
    model_yaml = yaml_model_load(model_yaml_path)
    print(f"Loaded Model YAML:\n{model_yaml}")

    # ----------------------------------------------------------------------
    # BUILD MODEL FROM YAML + OPTIONALLY LOAD PRETRAINED WEIGHTS
    # ----------------------------------------------------------------------
    print("\nInitializing YOLO model from YAML...")
    model = YOLO(model_yaml_path)  # Build from your updated yolo11.yaml

    # Optionally load the pretrained weights into the architecture
    if pretrained_weights:
        print(f"Loading pretrained weights from {pretrained_weights}...")
        model.load(pretrained_weights)

    # ----------------------------------------------------------------------
    # CLEAR UNUSED VRAM
    # ----------------------------------------------------------------------
    torch.cuda.empty_cache()

    # ----------------------------------------------------------------------
    # TRAIN
    # ----------------------------------------------------------------------
    print("\nStarting training...")
    model.train(
        data=data_yaml_path,  # dataset config
        epochs=700,  # or however many you like
        batch=16,  # adjust to fit your GPU memory
        imgsz=640,  # image size
        device=device,
        workers=0,  # can increase for multi-CPU data loading
        cache=False,  # no caching for quick tests
        amp=True,  # automatic mixed precision
        name="custom_yolov11_trial_model"
    )
    print("Training complete.")

And below is the code that uses the .pt file.

import os
import torch
from ultralytics import YOLO
from ultralytics.nn.tasks import yaml_model_load

if __name__ == "__main__":
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

    # Paths
    data_yaml_path = r"D:/ultralytics/codes/Hard Hat/data.yaml"
    # For baseline, use the standard model configuration or pretrained checkpoint
    baseline_model_checkpoint = r"D:/Kai work/data_scene_flow/python files/yolo11m.pt"  # standard pretrained checkpoint

    device = "cuda" if torch.cuda.is_available() else "cpu"
    print(f"Using device: {device}")

    # Initialize baseline model directly from the checkpoint
    print("\nInitializing baseline YOLO model...")
    model = YOLO(baseline_model_checkpoint)

    torch.cuda.empty_cache()

    # Train the baseline model
    print("\nStarting training for baseline model...")
    model.train(
        data=data_yaml_path,
        epochs=700,  # same as custom experiment
        batch=16,
        imgsz=640,
        device=device,
        workers=0,
        cache=False,
        amp=True,
        name="baseline_yolov11_model"
    )
    print("Training complete for baseline model.")

hallian179 · February 12, 2025, 11:42am

Kindly go through this code and please help me sort this issue out.

BurhanQ · February 12, 2025, 2:30pm

As Toxite mentioned, without a scale parameter, the models are not comparable. This is apparent when looking at the difference in the parameter counts.

That said, I meant that a model built from the custom model YAML and your custom modified model weights are supposed to be the same. In general, one should not expect a custom model structure to match the performance of the Ultralytics pretrained models.
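
As a quick sanity check before comparing mAP, the summary lines printed at validation time can be compared directly. The following is a minimal sketch with hypothetical helper names (`parse_summary`, `same_size`); it only assumes the summary line format shown in the logs earlier in this thread.

```python
import re

# Matches lines such as:
# "YOLO11m summary (fused): 303 layers, 20,032,345 parameters, 0 gradients, 67.7 GFLOPs"
SUMMARY = re.compile(
    r"(?P<name>\S+) summary.*?(?P<layers>[\d,]+) layers, (?P<params>[\d,]+) parameters"
)


def parse_summary(line: str) -> tuple[str, int, int]:
    """Extract (model name, layer count, parameter count) from a summary line."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError(f"Not a model summary line: {line!r}")
    layers = int(match["layers"].replace(",", ""))
    params = int(match["params"].replace(",", ""))
    return match["name"], layers, params


def same_size(line_a: str, line_b: str) -> bool:
    """True when two summaries report identical layer and parameter counts."""
    return parse_summary(line_a)[1:] == parse_summary(line_b)[1:]


yaml_run = "YOLO11 summary (fused): 238 layers, 2,582,737 parameters, 0 gradients, 6.3 GFLOPs"
pt_run = "YOLO11m summary (fused): 303 layers, 20,032,345 parameters, 0 gradients, 67.7 GFLOPs"
print(same_size(yaml_run, pt_run))  # False: the two runs used different model scales
```

Two unmodified baselines of the same scale should report identical counts; a modified architecture will differ, but a gap as large as the one above points to a scale mismatch rather than to the modification.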

hallian179 · February 13, 2025, 6:13am

Thanks a lot Toxite and BurhanQ…I got the point.
I corrected it and got the same results now.
If I face any further issues I will ask you guys again.
