Ultralytics v8.4.13 is here: smarter training resilience + more reliable exports
Ultralytics v8.4.13 makes training more resilient: it automatically recovers from CUDA out-of-memory (OOM) errors during the first epoch by retrying with a smaller batch size.
New Features (Highlights)
Auto-retry on CUDA OOM during training (major change)
If a CUDA OOM happens in the first epoch on single-GPU, Ultralytics will now:
- retry up to 3 times
- halve `batch` each retry (down to `1`)
- rebuild the training pipeline (dataloaders + optimizer + scheduler) for a clean continuation

This reduces the classic loop of "OOM → lower batch → restart training" and helps especially with first-epoch memory spikes.
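To make the retry behavior concrete, here is a minimal, framework-free sketch of the halve-and-rebuild loop described above. It is not the Ultralytics implementation: `SimulatedOOM`, `build_pipeline`, `run_first_epoch`, and `train_with_oom_retry` are hypothetical names, and the out-of-memory condition is simulated with a simple threshold.

```python
class SimulatedOOM(RuntimeError):
    """Stand-in for a CUDA out-of-memory error (hypothetical, for illustration)."""


def build_pipeline(batch):
    # Hypothetical stand-in for rebuilding dataloaders, optimizer, and scheduler.
    return {"batch": batch}


def run_first_epoch(pipeline, max_fit_batch):
    # Pretend the GPU can only fit `max_fit_batch` samples per step.
    if pipeline["batch"] > max_fit_batch:
        raise SimulatedOOM(f"batch={pipeline['batch']} does not fit")
    return pipeline["batch"]


def train_with_oom_retry(batch, max_fit_batch, max_retries=3):
    """Retry the first epoch, halving the batch size after each OOM."""
    pipeline = build_pipeline(batch)
    for attempt in range(max_retries + 1):
        try:
            return run_first_epoch(pipeline, max_fit_batch)
        except SimulatedOOM:
            if attempt == max_retries or batch == 1:
                raise  # out of retries, or the batch cannot shrink further
            batch = max(batch // 2, 1)
            pipeline = build_pipeline(batch)  # clean rebuild at the new size


# Starting at 64 on a device that fits 16: 64 fails, 32 fails, 16 succeeds.
final_batch = train_with_oom_retry(batch=64, max_fit_batch=16)
```

In real PyTorch code the caught exception would typically be `torch.cuda.OutOfMemoryError` rather than a custom class; the sketch only illustrates the control flow of bounded retries with a rebuilt pipeline.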
Export Improvements
More reliable ONNX export for OBB + NMS
When exporting OBB (oriented bounding boxes) to ONNX with NMS enabled, simplify=True is now forced to avoid a known runtime issue (TopK-related errors in some ONNX Runtime versions). Fewer surprises at deployment time.
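The rule above amounts to a small argument override. The sketch below shows that logic in isolation; `resolve_export_args` is a hypothetical helper written for illustration and is not part of the Ultralytics exporter.

```python
def resolve_export_args(task, fmt, nms=False, simplify=False):
    """Return export arguments, forcing simplify=True for ONNX + OBB + NMS."""
    args = {"format": fmt, "nms": nms, "simplify": simplify}
    if fmt == "onnx" and task == "obb" and nms:
        # Mirrors the release-note rule: this combination always gets simplify=True.
        args["simplify"] = True
    return args


# An OBB model exported to ONNX with NMS gets simplification turned on.
obb_args = resolve_export_args(task="obb", fmt="onnx", nms=True, simplify=False)
# Other combinations keep whatever the caller asked for.
detect_args = resolve_export_args(task="detect", fmt="onnx", nms=True, simplify=False)
```

With the package installed, the corresponding user-facing call would likely be `YOLO("yolo26n-obb.pt").export(format="onnx", nms=True)`; the weights filename here is an assumption, while `format`, `nms`, and `simplify` are existing export arguments.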
System & Tooling Reliability
DGX system detection + TensorRT handling
Adds DGX detection and uses it (along with Jetson JetPack 7) to trigger a TensorRT version check/reinstall path for improved export reliability on those systems.
Packaging stability fix: pin setuptools
Build requirements are pinned to setuptools<=81.0.0 to avoid breakages introduced by newer setuptools versions (notably affecting tensorflow.js export tooling).
Docs & Examples Refresh
Clearer guidance aligned with Ultralytics YOLO + YOLO26
Docs and examples continue steering toward YOLO26 as the recommended model for new projects (smaller, faster, more accurate than YOLO11, and natively end-to-end). If you're new or upgrading pipelines, start with yolo26n.pt.
Tracking content update
Tracking docs now embed a newer multi-object tracking video featuring YOLO26 + BoT-SORT/ByteTrack.
Example dependency update
The RT-DETR ONNX Runtime Python example updates protobuf.
Internal Changes (for contributors)
- Adds `_build_train_pipeline()` to rebuild loaders/optimizer/scheduler when the batch size changes (used by the new OOM recovery flow).
Quick start
Update to the latest release:
pip install -U ultralytics
Train with Ultralytics YOLO (example):
yolo detect train model=yolo26n.pt data=coco128.yaml imgsz=640
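For readers who prefer the Python API over the CLI, the same run can be sketched as follows. The values mirror the command above; `TRAIN_ARGS` and `train_quickstart` are illustrative names, and the import is kept inside the function so the snippet loads even where the package is not installed.

```python
# Same settings as the CLI example: dataset config and image size.
TRAIN_ARGS = {"data": "coco128.yaml", "imgsz": 640}


def train_quickstart(weights="yolo26n.pt"):
    """Load pretrained weights and start a detection training run."""
    from ultralytics import YOLO  # lazy import: requires `pip install ultralytics`

    model = YOLO(weights)
    return model.train(**TRAIN_ARGS)
```

Calling `train_quickstart()` starts training and downloads the weights and dataset on first use, so it is best run on a machine with a GPU and network access.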
Want the easiest end-to-end workflow for data, training, and deployment? Try Ultralytics Platform as part of your pipeline.
What's Changed (PRs)
- DGX device variants check by @onuralpszr in PR #23573
- Add tracking video to docs by @RizwanMunawar in PR #23582
- Bump `protobuf` in RT-DETR ONNX Runtime example by @dependabot[bot] in PR #23572
- Exporter docs updates (new formats + examples) by @onuralpszr in PR #23585
- Force `simplify=True` for OBB export with NMS by @Y-T-G in PR #23580
- Pin `setuptools` version by @Burhan-Q in PR #23589
- Retry smaller batch on training CUDA OOM by @glenn-jocher in PR #23590
Read the full details in the v8.4.13 GitHub Release and browse the full changelog comparison.
Feedback welcome
Please try v8.4.13 and let us know:
- Did the first-epoch OOM auto-retry save you a restart?
- Any edge cases where you'd like different retry behavior?
- ONNX OBB + NMS exporting smoothly in your runtime?
Your reports help improve YOLO for everyone. Thanks to the whole Ultralytics team and the broader YOLO community for pushing this forward.