Ultralytics v8.3.199 — faster imports, smoother exports, modern GPU Docker 

Ultralytics v8.3.199 delivers quicker startup with lazy model loading, more stable export/runtime behavior, and refreshed GPU Docker docs. Expect faster imports, safer defaults, and clearer tooling across the board.
- Version: v8.3.199
- TL;DR: Faster import ultralytics, standardized export outputs, smarter TensorRT installs, safer torch.compile, better tuning plots, updated GPU docs, and stronger CI coverage.
- Note: YOLO11 remains our latest stable and recommended model for all use cases.
Highlights
- Faster imports via lazy model loading while preserving the public API
- Consistent export outputs for quantized NMS, simplifying downstream integration
- Automatic, CUDA-matching TensorRT installs on Linux
- Safer torch.compile defaults to avoid CUDA Graphs pitfalls
- Cleaner Tuner plots by excluding zero-fitness points by default
- Modernized GPU Docker guidance using NVIDIA Container Toolkit
- Re-enabled GPU export tests and benchmarks where GPUs are available
New Features
- Lazy model loading for faster imports (about 3% speedup) using __getattr__, keeping the same API surface. See 3% Faster Imports with Lazy Loading (PR #21985) by RizwanMunawar.
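As a rough illustration of the general technique (not the code from the PR), a module-level __getattr__ hook, as defined by PEP 562, can defer an import until an attribute is first requested. The helper name make_lazy_module and the stdlib classes standing in for model classes are assumptions made for this sketch.

```python
import importlib
import types


def make_lazy_module(name, lazy_attrs):
    """Build a module whose listed attributes are imported on first access.

    `lazy_attrs` maps an attribute name to the module that defines it.
    This helper is hypothetical and only demonstrates the pattern.
    """
    module = types.ModuleType(name)

    def __getattr__(attr):
        # Python calls this only when normal lookup on the module fails (PEP 562).
        if attr in lazy_attrs:
            value = getattr(importlib.import_module(lazy_attrs[attr]), attr)
            setattr(module, attr, value)  # cache so later lookups skip this hook
            return value
        raise AttributeError(f"module {name!r} has no attribute {attr!r}")

    module.__getattr__ = __getattr__
    return module


# Stand-ins: stdlib classes play the role of heavier model classes.
demo = make_lazy_module("demo", {"Fraction": "fractions", "Decimal": "decimal"})
loaded_before = "Fraction" in vars(demo)  # nothing imported yet
half = demo.Fraction(1, 2)                # first access triggers the import
loaded_after = "Fraction" in vars(demo)   # now cached on the module
```

In a real package the same hook would be written as a plain function named __getattr__ in the package's __init__.py, so that from package import Name keeps working unchanged while the import cost is paid only on first use.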
Improvements
- More consistent export outputs: quantized export NMS now returns unpackable tensors (boxes, scores, labels, n_valid) for non-keypoint tasks. Details in Fix imx object detection export outputs (PR #22045) by Laughing-q.
- Smarter TensorRT installation on Linux: auto-selects CUDA-matching wheels (e.g., tensorrt-cu12) and avoids known-bad versions. See Specify CUDA version during TensorRT installation (PR #22060) by Y-T-G.
- Safer torch.compile defaults: attempt_compile() now warns on mode="max-autotune" and switches to max-autotune-no-cudagraphs. Learn more in Add warning and default to no-cudagraphs (PR #22040) by Y-T-G.
- Clearer hyperparameter tuning plots: plot_tune_results(..., exclude_zero_fitness_points=True) filters zero-fitness points by default. Implemented in Exclude zero-fitness points in Tuner plots (PR #22047) by glenn-jocher.
- GPU test coverage re-enabled: ONNX export with NMS for OBB, CUDA export tests, and GPU benchmarks now run when GPUs are available. See Re-enable TensorRT export in GPU tests (PR #22062) by Laughing-q.
- Docker docs modernized for NVIDIA Container Toolkit with distro-specific steps and standardized --runtime=nvidia. Updated in Docker Quickstart update (PR #21994) and Standardize GPU Docker commands (PR #22052) by onuralpszr.
- CI reliability and maintenance: updated GPU runner label, targeted Slack alerts, and parameterized runner image versions. Details in Update GPU runner label (PR #22051) and Parametrize runner version in Dockerfile (PR #22049) by glenn-jocher, and Refine Slack notifications (PR #22012) by lakshanthad.
- New reference docs for lazy imports: a dedicated reference page now clarifies how lazy imports are implemented in ultralytics/__init__.py.
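The torch.compile change listed above can be pictured with a small standalone sketch. It only mirrors the described behavior (warn on "max-autotune", then fall back to "max-autotune-no-cudagraphs"); the function name resolve_compile_mode is hypothetical and this is not the body of attempt_compile().

```python
import warnings

# "max-autotune-no-cudagraphs" is a torch.compile mode that keeps autotuning
# but does not use CUDA Graphs.
SAFE_FALLBACK = "max-autotune-no-cudagraphs"


def resolve_compile_mode(mode="default"):
    """Return the mode to hand to torch.compile, steering away from CUDA Graphs.

    Hypothetical helper for illustration; the library's own logic may differ.
    """
    if mode == "max-autotune":
        warnings.warn(
            f"mode='max-autotune' uses CUDA Graphs; switching to '{SAFE_FALLBACK}'.",
            stacklevel=2,
        )
        return SAFE_FALLBACK
    return mode  # every other mode is passed through unchanged
```

With PyTorch installed, the result would be passed along as torch.compile(model, mode=resolve_compile_mode("max-autotune")), so callers who ask for the aggressive mode still get autotuning without CUDA Graphs.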
Bug Fixes
- Robustness fix in custom model parsing: prevents undefined scale errors in parse_model() when scales isn't provided. Addressed in Fix undefined variable in parse_model (PR #22054) by Y-T-G.
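The class of bug described here, a variable assigned only inside a conditional branch and then read afterwards, can be sketched as follows. The function resolve_scale, the config keys, and the default multipliers are assumptions for illustration and are not taken from parse_model() itself.

```python
def resolve_scale(cfg, default=(1.0, 1.0, float("inf"))):
    """Return (scale, depth, width, max_channels) for a model config dict.

    Hypothetical helper: `scale` is bound before any branching, so it is
    always defined even when the config has no "scales" table.
    """
    scale = cfg.get("scale")    # bound up front; None when absent
    scales = cfg.get("scales")  # optional table of per-scale multipliers
    depth, width, max_channels = default
    if scales:
        if not scale:
            scale = next(iter(scales))  # fall back to the first listed scale
        depth, width, max_channels = scales[scale]
    return scale, depth, width, max_channels


# A custom config without "scales" no longer leaves `scale` undefined.
no_scales = resolve_scale({"nc": 80})
with_scales = resolve_scale({"scales": {"n": [0.5, 0.25, 1024], "s": [0.5, 0.5, 1024]}})
```

Code that later checks the scale then has to handle None explicitly instead of failing with an undefined-variable error.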
Quick start and helpful snippets
- Update to the latest version:
pip install -U ultralytics
- Import remains the same, with YOLO11 as the recommended default:
from ultralytics import YOLO
model = YOLO("yolo11n.pt")
- Show zero-fitness points in Tuner plots (previous behavior):
from ultralytics.utils.plotting import plot_tune_results
plot_tune_results("tune_results.csv", exclude_zero_fitness_points=False)
- Run the GPU Docker image using NVIDIA Container Toolkit:
sudo docker run -it --ipc=host --runtime=nvidia --gpus all ultralytics/ultralytics:latest
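For readers curious what the exclude_zero_fitness_points switch in the Tuner snippet above amounts to, here is a minimal standalone sketch of the filtering idea. It assumes a CSV with a column named fitness and made-up sample values; the function load_tune_rows is hypothetical and does not reproduce plot_tune_results.

```python
import csv
import io


def load_tune_rows(csv_text, exclude_zero_fitness_points=True):
    """Parse tuning results and optionally drop rows whose fitness is zero.

    Hypothetical helper; assumes a numeric column named "fitness".
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [{key.strip(): float(value) for key, value in row.items()} for row in reader]
    if exclude_zero_fitness_points:
        rows = [row for row in rows if row["fitness"] != 0.0]
    return rows


# Made-up sample: the second trial scored zero fitness.
SAMPLE = "fitness,lr0,momentum\n0.41,0.010,0.93\n0.00,0.200,0.60\n0.47,0.008,0.95\n"

filtered = load_tune_rows(SAMPLE)                                     # zero row dropped
unfiltered = load_tune_rows(SAMPLE, exclude_zero_fitness_points=False)  # all rows kept
```

Dropping the zero-fitness rows keeps failed or degenerate trials from compressing the useful range of a scatter plot, which is presumably why filtering is now the default.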
What’s changed (PR roll-up)
- Docs refresh for GPU containers in Docker Quickstart update (PR #21994) by onuralpszr
- Export output consistency in Fix imx object detection export outputs (PR #22045) by Laughing-q
- CI notifications refined in Fix Slack notifications on scheduled CI failure (PR #22012) by lakshanthad
- Cleaner tuning plots in Exclude zero-fitness points in Tuner plots (PR #22047) by glenn-jocher
- Safer compile defaults in Add warning when using mode='max-autotune' (PR #22040) by Y-T-G
- Parameterized runner images in Dockerfile-runner update (PR #22049) by glenn-jocher
- Updated GPU runners in A100 GPU DDP runners (PR #22051) by glenn-jocher
- GPU export tests re-enabled in TensorRT export in GPU tests (PR #22062) by Laughing-q
- Parse fix in Fix undefined variable in parse_model() (PR #22054) by Y-T-G
- Standardized GPU Docker commands in Use NVIDIA runtime for GPU support (PR #22052) by onuralpszr
- Smarter TensorRT installs in Specify CUDA version during TensorRT installation (PR #22060) by Y-T-G
- Faster imports in 3% Faster Ultralytics Imports with Lazy Model Loading (PR #21985) by RizwanMunawar
You can explore the detailed release page in the announcement for Ultralytics v8.3.199 and review every commit in the full changelog from v8.3.198 to v8.3.199.
Try it and share feedback
Upgrade, run your workloads, and let us know how it feels—especially import times, export stability, and GPU Docker usability. Your feedback helps the community and the Ultralytics team keep improving.