Hi everyone,
I am trying to reproduce the 78.4% mAP50 reported for YOLO11n-OBB on DOTA-v1.0 (as shown in the official documentation: https://docs.ultralytics.com/tasks/obb/#visual-samples).
Below is a detailed description of my setup and process:
Environment:
- Python: 3.10
- Ultralytics: 8.3.53
- PyTorch version: 2.9.0+cu128
- CUDA version: 12.8
- OS: Ubuntu 22.04.5
- GPU: RTX 4090
Dataset Preparation
- Dataset: DOTA-v1.0 (downloaded from the official DOTA dataset website)
- Test set cropped using DOTA_devkit / ImgSplit_multi_process
  - gap = 200
  - subsize = 800
  - num_process = 8
- After cropping, each test image is 800×800 in size.
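To make the split settings concrete, here is a minimal sketch of how I understand the sliding window to work along one image axis with subsize = 800 and gap = 200. The function name `window_starts` is hypothetical, and the clamping of the last window to the image border is my assumption about what ImgSplit_multi_process does, not code copied from DOTA_devkit.

```python
def window_starts(length: int, subsize: int = 800, gap: int = 200) -> list[int]:
    """Return the crop origins along one axis of an image.

    Assumption: consecutive windows advance by (subsize - gap) pixels and
    the last window is pulled back so that it ends at the image border.
    """
    stride = subsize - gap  # 600 px with the settings above
    starts = []
    pos = 0
    while True:
        if pos + subsize >= length:
            # Last window: clamp to the border (or to 0 for small images).
            pos = max(length - subsize, 0)
            starts.append(pos)
            break
        starts.append(pos)
        pos += stride
    return starts


if __name__ == "__main__":
    print(window_starts(2000))
    print(window_starts(2100))
```

For a side of 2000 pixels this gives crop origins 0, 600 and 1200, so neighbouring patches overlap by 200 pixels. For 2100 pixels the last origin is pulled back to 1300, so the final overlap is larger.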
Inference Setup
- imgsz = 1024
- Model: yolo11n-obb.pt (downloaded from the Ultralytics official release page)
- Task: obb
- Dataset config: data=dota.yaml (standard DOTA-v1.0 format)
- Inference performed with my script (attached below).
# -*- coding: utf-8 -*-
from pathlib import Path

from ultralytics import YOLO


def main():
    # Load the model weights
    model = YOLO("sj_best_0611.pt")

    # Run inference on the cropped test images.
    # conf and iou are not set, so predict() uses its defaults.
    results = model.predict(
        source=r"divided_test_img",  # directory containing test images
        imgsz=1024,
        task="obb",
        device="0",
        stream=True,
    )

    out_dir = Path(r"my_output_dir")
    out_dir.mkdir(exist_ok=True)

    # DOTA v1 class names, in the same order as the model's class indices
    DOTA_CLASSES = [
        'plane', 'ship', 'storage-tank', 'baseball-diamond', 'tennis-court',
        'basketball-court', 'ground-track-field', 'harbor', 'bridge',
        'large-vehicle', 'small-vehicle', 'helicopter', 'roundabout',
        'soccer-ball-field', 'swimming-pool'
    ]

    # Initialize one empty Task1 result file per class
    files = {c: open(out_dir / f"Task1_{c}.txt", "w") for c in DOTA_CLASSES}

    for res in results:
        imgname = Path(res.path).stem  # patch name without extension
        for box in res.obb:
            cls = int(box.cls)
            conf = float(box.conf)
            # Four corner points flattened to x1 y1 x2 y2 x3 y3 x4 y4
            poly = box.xyxyxyxy.reshape(-1).tolist()
            line = f"{imgname} {conf:.6f} " + " ".join(f"{p:.2f}" for p in poly) + "\n"
            files[DOTA_CLASSES[cls]].write(line)

    for f in files.values():
        f.close()


if __name__ == "__main__":
    main()
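As a sanity check on my own conversion, the lines written above are meant to follow the DOTA Task1 layout: image name, confidence, then eight polygon coordinates. Below is a minimal sketch of a check I can run on the generated files before merging. The helper name `parse_task1_line` is hypothetical and not part of Ultralytics or DOTA_devkit.

```python
def parse_task1_line(line: str) -> tuple[str, float, list[float]]:
    """Parse one 'name conf x1 y1 x2 y2 x3 y3 x4 y4' line and validate it."""
    parts = line.split()
    if len(parts) != 10:
        raise ValueError(f"expected 10 fields, got {len(parts)}: {line!r}")
    name = parts[0]
    conf = float(parts[1])
    if not 0.0 <= conf <= 1.0:
        raise ValueError(f"confidence out of range: {conf}")
    coords = [float(p) for p in parts[2:]]
    return name, conf, coords


if __name__ == "__main__":
    sample = "P0006__1__0___0 0.912345 10.00 20.00 110.00 20.00 110.00 70.00 10.00 70.00\n"
    print(parse_task1_line(sample))
```

The sample patch name in this sketch is only an illustration of the naming I expect from the cropping step, not a file from my actual run.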
Result Merging
- Used ResultMerge_multi_process from DOTA_devkit
  - nms_thresh = 0.1
- Generated final merged result files and zipped them for submission.
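For context, this is a minimal sketch of the coordinate mapping I expect the merge step to perform before NMS: each patch-local polygon is shifted by the patch offset encoded in the patch file name. It assumes the cropped files are named `<image>__<rate>__<left>___<top>`, which is my understanding of the DOTA_devkit naming. The function `to_full_image` is hypothetical and is not the devkit's own code.

```python
import re

# Assumed patch naming: <image>__<rate>__<left>___<top>
PATCH_RE = re.compile(r"^(?P<base>.+)__(?P<rate>[\d.]+)__(?P<left>\d+)___(?P<top>\d+)$")


def to_full_image(patch_name: str, poly: list[float]) -> tuple[str, list[float]]:
    """Map a patch-local polygon (x1 y1 ... x4 y4) to full-image coordinates."""
    m = PATCH_RE.match(patch_name)
    if m is None:
        raise ValueError(f"unexpected patch name: {patch_name!r}")
    rate = float(m["rate"])
    left, top = int(m["left"]), int(m["top"])
    full = []
    for i in range(0, len(poly), 2):
        # Shift by the crop origin, then undo the resize rate used when cropping.
        full.append((poly[i] + left) / rate)
        full.append((poly[i + 1] + top) / rate)
    return m["base"], full


if __name__ == "__main__":
    print(to_full_image("P0006__1__600___1200", [10, 20, 110, 20, 110, 70, 10, 70]))
```

After this shift, overlapping detections from neighbouring patches land on the same full-image coordinates, which is where nms_thresh = 0.1 comes into play.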
Evaluation
I submitted the merged results to the official DOTA evaluation server, but my test result is only 0.731 mAP50 (73.1%), which is 5.3 points lower than the 78.4% reported in the official Ultralytics documentation.
Question
Could you please help me identify what might cause this performance gap?
- Are there specific image split parameters, test-time augmentations, or NMS settings used in the official benchmark?
- Should I modify my imgsz, gap, or merge thresholds to better match the official setup?
- Also, should I be using an official tool to convert YOLO predictions into the DOTA server’s accepted format, instead of writing my own conversion script? If such a tool exists, could you please let me know where to find it?
Thank you very much for your time and for maintaining this amazing project!