New Release: Ultralytics v8.3.213

Ultralytics v8.3.213 - Stability first: automatic NaN recovery, faster Objects365, unified dataset URLs :ambulance::repeat_button::high_voltage:

A stability-focused release that automatically recovers from NaN/Inf training issues, cleans up resume logic, speeds up Objects365 setup, and unifies dataset download URLs. Upgrade with confidence and keep training moving.

  • Version: v8.3.213
  • Focus: Training robustness, dataset reliability, and setup speed
  • Release: See the full notes on the v8.3.213 release page

TL;DR

  • Auto-recovery from NaN/Inf losses and metric collapse during training (DDP-aware, capped retries) :shield:
  • Safer resumes with centralized checkpoint loading and scheduler reset
  • Objects365 preparation runs significantly faster with multithreading :rocket:
  • Unified ASSETS_URL across dataset YAMLs for more reliable downloads :link:

Highlights

Training robustness and resume improvements

  • Automatically detects NaN/Inf losses and fitness collapse, then safely restores from the latest checkpoint with capped retries (up to 3).
  • DDP-aware broadcasting keeps distributed training in sync during recovery.
  • Validates checkpoint weights to avoid reloading corrupted EMA states.
  • Centralizes checkpoint loading via _load_checkpoint_state() and uses it in resume_training() for consistency and less drift.
  • Resets the scheduler state after recovery to maintain the intended LR schedule.
  • Adds a new test, test_nan_recovery, that injects a NaN to verify the recovery path. :white_check_mark:
    Details are in the NaN epoch recovery PR (#22352) by Glenn Jocher.
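
To make the recovery flow above concrete, here is a minimal, framework-free sketch of the idea: detect a non-finite loss, roll back to the last good snapshot, and give up after a capped number of retries. It is not the Ultralytics implementation. The names train, run_epoch, and make_flaky_epoch are hypothetical, the state is a plain dict standing in for model weights, and DDP broadcasting and scheduler handling are left out.

```python
import copy
import math

MAX_RECOVERIES = 3  # retry cap quoted in the release notes


def train(run_epoch, epochs, max_recoveries=MAX_RECOVERIES):
    """Toy loop: on a non-finite loss, restore the last good snapshot and retry."""
    state = {"epoch": 0, "weights": [0.0]}  # hypothetical stand-in for model state
    checkpoint = copy.deepcopy(state)  # last known-good snapshot
    recoveries = 0
    epoch = 0
    while epoch < epochs:
        loss = run_epoch(state, epoch)
        if not math.isfinite(loss):  # catches NaN, +Inf and -Inf
            if recoveries >= max_recoveries:
                raise RuntimeError(f"loss still non-finite after {max_recoveries} recoveries")
            recoveries += 1
            state = copy.deepcopy(checkpoint)  # discard the corrupted state
            continue  # retry the same epoch
        epoch += 1
        state["epoch"] = epoch
        checkpoint = copy.deepcopy(state)  # snapshot only after a healthy epoch
    return state, recoveries


def make_flaky_epoch(bad_epoch):
    """Build a fake epoch function that produces NaN once, at `bad_epoch`."""
    failed = set()

    def run_epoch(state, epoch):
        if epoch == bad_epoch and epoch not in failed:
            failed.add(epoch)
            state["weights"][0] = float("nan")  # corrupt the live weights
            return float("nan")
        state["weights"][0] += 1.0  # pretend to learn something
        return 1.0 / (epoch + 1)

    return run_epoch


final_state, used = train(make_flaky_epoch(2), epochs=5)
print(used, final_state["weights"])  # 1 [5.0]
```

Snapshotting only after a healthy epoch is what keeps the rollback target clean: the corrupted weights from the failed epoch never reach the saved copy.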

Dataset YAMLs: unified asset URLs

Objects365 setup speedups

CI and tests

  • Temporarily disables the Jetson JetPack 5 Docker build while the new recovery path stabilizes.
  • Skips training tests on Jetson/Raspberry Pi to reflect that edge devices are not training targets.

Improvements

  • More resilient training at scale with automatic recovery from NaN/Inf losses or sudden metric collapse.
  • Safer, more consistent resume logic for optimizer, scaler, EMA, and best-fitness states.
  • Faster dataset preparation, especially for Objects365, thanks to multithreading.
  • More reliable dataset downloads via centralized hosting and unified URLs.
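
As an illustration of why multithreading helps dataset preparation, the sketch below fans I/O-bound work out over a thread pool. It is an assumption-laden toy, not the Objects365 setup code: prepare_archive, prepare_all, and the archive names are hypothetical, and time.sleep stands in for downloading and unpacking.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def prepare_archive(index: int) -> str:
    """Hypothetical stand-in for downloading and unpacking one dataset archive."""
    time.sleep(0.05)  # simulate I/O-bound work (network or disk)
    return f"archive_{index}"


def prepare_all(indices, threads: int = 8) -> list:
    """Process archives concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(prepare_archive, indices))


start = time.perf_counter()
done = prepare_all(range(8))
elapsed = time.perf_counter() - start
print(f"prepared {len(done)} archives in {elapsed:.2f}s")
```

Threads suit this kind of work because the time is spent waiting on I/O rather than computing, so the waits overlap instead of adding up.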

Bug Fixes

  • Prevents reloading corrupted EMA states by validating checkpoint weights during recovery.
  • Avoids scheduler desynchronization by resetting the LR scheduler after recovery.
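
The weight-validation fix can be pictured as a finiteness check that runs before a checkpoint is trusted. The sketch below is a hedged, pure-Python illustration: weights_are_finite and checkpoint_is_restorable are hypothetical names, and the checkpoint is a plain dict of float lists rather than real tensors. In PyTorch, the equivalent per-tensor check would be torch.isfinite(t).all().

```python
import math


def weights_are_finite(state_dict: dict) -> bool:
    """Return True only if every value in every parameter list is finite."""
    return all(math.isfinite(v) for values in state_dict.values() for v in values)


def checkpoint_is_restorable(checkpoint: dict) -> bool:
    """Reject a checkpoint whose EMA weights are missing or contain NaN/Inf."""
    ema = checkpoint.get("ema")  # "ema" key is an assumption for this sketch
    return ema is not None and weights_are_finite(ema)


good = {"ema": {"conv.weight": [0.1, -0.2], "conv.bias": [0.0]}}
bad = {"ema": {"conv.weight": [0.1, float("nan")], "conv.bias": [0.0]}}

print(checkpoint_is_restorable(good), checkpoint_is_restorable(bad))  # True False
```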

Quick start

  • Upgrade:
    pip install -U ultralytics
    
  • Train (recovery is automatic; no extra flags needed):
    yolo detect train model=yolo11n.pt data=coco128.yaml epochs=100 imgsz=640
    
  • Resume an interrupted run (add model=path/to/last.pt to target a specific checkpoint):
    yolo detect train resume=True
    

Learn more about training options in the Train mode documentation.
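
For those who prefer the Python API, the train and resume commands above can be written roughly as follows. The checkpoint path is a placeholder to replace with your own run's last.pt, and the import sits inside each function only so the snippet can be loaded without the package installed.

```python
def train_from_pretrained():
    """Python counterpart of the `yolo detect train ...` command."""
    from ultralytics import YOLO  # lazy import: requires `pip install ultralytics`

    model = YOLO("yolo11n.pt")
    return model.train(data="coco128.yaml", epochs=100, imgsz=640)


def resume_training(checkpoint: str = "path/to/last.pt"):
    """Resume an interrupted run from its last checkpoint (placeholder path)."""
    from ultralytics import YOLO

    model = YOLO(checkpoint)
    return model.train(resume=True)
```

Call train_from_pretrained() to start a run. If it is interrupted, pass the run's last.pt to resume_training().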


Model guidance

  • YOLO11 is our latest stable and recommended default for all tasks; explore the YOLO11 model docs.
  • Community models YOLO12 and YOLO13 are not recommended; see notes in the YOLO12 model docs.
  • Ultralytics R&D for YOLO26 is underway; follow the YOLO26 R&D preview.

What’s changed (PRs and authors)

Review the complete changelog for v8.3.212 → v8.3.213.


Try it and share feedback

Please upgrade, put the new recovery path through its paces, and let us know how it performs in your workflows. You can start a conversation in Ultralytics GitHub Discussions or open issues with reproducible examples. Your feedback helps the YOLO community and Ultralytics team keep improving.