Does YOLOE leverage prior knowledge when fine-tuning?

Kallinteris-Andreas · August 31, 2025, 7:33pm

When fine-tuning YOLOE (either full fine-tuning or linear probing), does it initialize from its existing embedding-to-detection knowledge of the classes?
For example, if I am fine-tuning a model to detect “person” and “flame”, does it start with its existing knowledge of those classes?

In essence, I’m trying to understand if the model starts with a “head start” on the classes it already knows.

Simple code example:

from ultralytics import YOLOE
from ultralytics.models.yolo.yoloe import YOLOEPESegTrainer

model = YOLOE("yoloe-11s-seg.pt")

results = model.train(
    data="flame_and_person.yaml", epochs=100, trainer=YOLOEPESegTrainer,
)

If fine-tuning YOLOE does leverage prior knowledge, is there any downside to fine-tuning YOLOE11 instead of YOLO11, considering that the runtime performance of the exported model at inference should be identical?

Note: I would assume “Catastrophic Forgetting” for all other classes, as with a closed-set object detector such as YOLO11.

Thanks!

Toxite · August 31, 2025, 11:54pm

Yes, the embeddings are pre-calculated for the classes in your dataset and used as initialization to get a head start.
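
As a rough picture of what such a head start means, here is a toy sketch in plain Python. It is not Ultralytics code: the 4-dimensional vectors, the `cosine_scores` helper, and the feature values are all made up for illustration, and the real model's embeddings and head are far larger. It only assumes that class scores come from comparing a region feature with per-class weights.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_scores(feature, class_weights):
    """Cosine similarity between one region feature and each class weight vector."""
    f = normalize(feature)
    return {
        name: sum(a * b for a, b in zip(f, normalize(w)))
        for name, w in class_weights.items()
    }


# Hypothetical "text embeddings" for the two dataset classes (made-up numbers).
text_embeddings = {
    "person": [0.9, 0.1, 0.0, 0.1],
    "flame": [0.0, 0.2, 0.9, 0.1],
}

# Hypothetical feature of an image region that contains a flame.
flame_feature = [0.1, 0.1, 0.8, 0.2]

# Warm start: class weights are copied from the pre-computed embeddings.
warm_start = {name: list(vec) for name, vec in text_embeddings.items()}

# Cold start: class weights are random, as in a freshly initialized head.
rng = random.Random(0)
cold_start = {name: [rng.uniform(-1, 1) for _ in range(4)] for name in text_embeddings}

warm_scores = cosine_scores(flame_feature, warm_start)
cold_scores = cosine_scores(flame_feature, cold_start)

print("warm start:", warm_scores)
print("cold start:", cold_scores)
```

With the warm start, the flame region already scores much higher for “flame” than for “person” before any training step, while the cold-start scores depend only on the random draw.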

There are no downsides other than slower training compared to regular YOLO. YOLOE probably works better for fine-tuning because it has undergone pretraining on a much larger dataset.

Kallinteris-Andreas · September 2, 2025, 6:30am

To quantify the “slower training”: on my GTX 1650 Ti Mobile (4 GB), I measured no meaningful training overhead.

| Model | Time | VRAM used |
| --- | --- | --- |
| `YOLO11m` | 102 minutes and 1.8 seconds | 2.46G |
| `YOLOE11m` | 102 minutes and 26.2 seconds | 2.45G |

Note: time is the total script execution time for 5 epochs, including initialization, evaluations, etc.; VRAM usage is as reported by Ultralytics.
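
For reference, the relative difference between the two timings can be worked out with a few lines of Python. The helper name `to_seconds` is just for this sketch; the numbers are the ones from the measurements above.

```python
def to_seconds(minutes, seconds):
    """Convert a 'M minutes and S seconds' reading to seconds."""
    return minutes * 60 + seconds


yolo11m_s = to_seconds(102, 1.8)    # YOLO11m total script time
yoloe11m_s = to_seconds(102, 26.2)  # YOLOE11m total script time

extra_s = yoloe11m_s - yolo11m_s
overhead_pct = 100 * extra_s / yolo11m_s

print(f"extra time: {extra_s:.1f} s ({overhead_pct:.2f}% of the YOLO11m run)")
```

That is a difference of about 24 seconds, roughly 0.4% of the total, from a single run of each script, so it may well be within run-to-run noise.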

Based on these measurements, I would conclude that YOLOE fine-tuning has no downsides, other than the lack of “nano” and “extra-large” models.

Scripts:

# Train Yolo
from ultralytics import YOLO


model = YOLO("yolo11m.pt")

results = model.train(
    project="fire-test-runs",
    name="yolo11m-od-flame-tuning-test",
    data="./fire-flame-1_u_seg/data.yaml",
    epochs=5,
    batch=2,
    device=0,
    close_mosaic=0,
    plots=True,
    save_period=10,
    resume=False,
    exist_ok=False,
    multi_scale=False,
)

# Train YOLOE Object Detector
from ultralytics import YOLOE
from ultralytics.models.yolo.yoloe import YOLOEPETrainer as Trainer

model = YOLOE("yoloe-11m.yaml").load("yoloe-11m-seg.pt")

results = model.train(
    project="fire-test-runs",
    name="yoloe11m-od-flame-full-tuning-test",
    data="./fire-flame-1_u_seg/data.yaml",
    epochs=5,
    batch=2,
    device=0,
    trainer=Trainer,
    close_mosaic=0,
    plots=True,
    save_period=10,
    resume=False,
    exist_ok=False,
    multi_scale=False,
)

BurhanQ · September 2, 2025, 12:39pm

Thanks for sharing your findings!
