Multi-GPU train - NVIDIA 5090

sneaky_advisor · October 8, 2025, 6:44am

Windows multi-GPU / YOLO11 from the Python API won’t start multi-GPU training; it always runs on cuda:0

GPUs: 4× NVIDIA GeForce RTX 5090 (32 GB)
Driver: 577.00 (CUDA driver 12.9)
OS: Windows 11 Pro
Python: 3.11.11 (Conda)
PyTorch: 2.8.0+cu128
Ultralytics: 8.3.127 (CLI suggests updating to 8.3.206)

Problem: Training via the Python API with device=[0,1,2,3] always runs on a single GPU (cuda:0); DDP never initializes.

What I’m looking for:

How do I start multi-GPU (DDP) training from the Python API on Windows in this version?
Is this a known issue in 8.3.127, and does updating to 8.3.206 fix API-side DDP on Windows?
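For reference, a minimal sketch of the kind of script being described, assuming the standard `YOLO(...).train(...)` call. The weights file, dataset YAML, epoch count, and the `gpus_requested` helper are placeholders for illustration, not details from this post. Everything runs under an `if __name__ == "__main__":` guard because Windows starts child processes with spawn, which re-imports the launching script. Whether the guard alone gets DDP to initialize on this setup is not confirmed here.

```python
import importlib.util

# Placeholders (assumptions, not from the post): use your own weights and dataset.
WEIGHTS = "yolo11n.pt"
TRAIN_ARGS = {
    "data": "coco8.yaml",
    "epochs": 1,
    "imgsz": 640,
    "device": [0, 1, 2, 3],  # the device list quoted in the post
}


def gpus_requested(device):
    """Count the GPU indices in an int, list/tuple, or comma-separated string."""
    if isinstance(device, (list, tuple)):
        return len(device)
    if isinstance(device, int):
        return 1
    return len([d for d in str(device).split(",") if d.strip().isdigit()])


def main():
    # Skip cleanly when the package is missing instead of raising ImportError.
    if importlib.util.find_spec("ultralytics") is None:
        print("ultralytics is not installed; nothing to run")
        return

    import torch
    from ultralytics import YOLO

    wanted = gpus_requested(TRAIN_ARGS["device"])
    visible = torch.cuda.device_count()
    print(f"requested {wanted} GPU(s), PyTorch sees {visible}")
    if visible < wanted:
        return  # fewer GPUs visible than requested, so do not start training

    model = YOLO(WEIGHTS)
    model.train(**TRAIN_ARGS)


if __name__ == "__main__":
    # Child processes re-import this file under spawn; the guard stops them
    # from re-running the training call at import time.
    main()
```

Saving this as a .py file and running it from a terminal, rather than from a notebook or interactive session, keeps the guard meaningful.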

Toxite · October 8, 2025, 12:57pm

Can you post the training logs?

Can you also post the output of running yolo checks in a terminal?

sneaky_advisor · October 8, 2025, 1:30pm

yolo checks output:

Ultralytics 8.3.127 Python-3.11.11 torch-2.8.0+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32607MiB)
Setup complete (32 CPUs, 127.8 GB RAM, 181.8/378.5 GB disk)

OS Windows-10-10.0.26100-SP0
Environment Windows
Python 3.11.11
Install pip
Path
RAM 127.79 GB
Disk 181.8/378.5 GB
CPU AMD EPYC 9124 16-Core Processor
CPU count 32
GPU NVIDIA GeForce RTX 5090, 32607MiB
GPU count 4
CUDA 12.8

numpy 1.26.4>=1.23.0
matplotlib 3.10.6>=3.3.0
opencv-python 4.11.0.86>=4.6.0
pillow 11.3.0>=7.1.2
pyyaml 6.0.3>=5.3.1
requests 2.32.5>=2.23.0
scipy 1.16.2>=1.4.1
torch 2.8.0+cu128>=1.8.0
torch 2.8.0+cu128!=2.4.0,>=1.8.0; sys_platform == "win32"
torchvision 0.23.0+cu128>=0.9.0
tqdm 4.67.1>=4.64.0
psutil 7.1.0
py-cpuinfo 9.0.0
pandas 2.3.3>=1.1.4
seaborn 0.13.2>=0.11.0
ultralytics-thop 2.0.17>=2.0.0

BurhanQ · October 8, 2025, 2:15pm

I wish I had a setup like that to test with!

Two quick questions.

If you try training with device=1 or device=2, does it train on that GPU, or does it still choose the first GPU?
What does nvidia-smi show?
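If it helps to capture that while training is running, here is a rough sketch that calls nvidia-smi through its `--query-gpu` CSV interface and prints one row per GPU. The helper names `parse_gpu_rows` and `query_gpus` are made up for this example, and it assumes nvidia-smi is on PATH.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi, in the order they come back in each CSV row.
QUERY = "index,name,memory.used,utilization.gpu"


def parse_gpu_rows(text):
    """Turn 'index, name, memory.used, utilization.gpu' CSV lines into dicts."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        index, name, mem, util = [field.strip() for field in line.split(",")]
        rows.append(
            {"index": int(index), "name": name, "memory_used": mem, "utilization": util}
        )
    return rows


def query_gpus():
    """Run nvidia-smi once; return [] if it is missing or exits with an error."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_rows(result.stdout)


if __name__ == "__main__":
    for row in query_gpus():
        print(row)
```

Running it in a second terminal during training shows whether memory use and utilization rise on all four cards or only on index 0.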

sneaky_advisor · October 9, 2025, 12:11pm

Yes, training runs on the selected GPU.

Toxite · October 9, 2025, 1:09pm

What do the training logs show when you start multi-GPU training? Can you post them? Do they mention DDP?
