How to export a custom YOLOv5 .pt model to TensorRT?

How can I convert my YOLOv5 model (exported as ONNX) to TensorRT? I ran the export.py script and the ONNX file was generated successfully (crowdhuman_yolov5m.onnx), but I’m unsure about the next steps for TensorRT integration. Are there specific tools or steps required for this?

Additional context:

  • ONNX export completed with warnings about tensor-to-boolean conversion (TracerWarning).
  • Need guidance on optimizing the ONNX model for TensorRT (e.g., using trtexec or Python API).
  • Any tips to avoid errors during TensorRT conversion?
(venv) shakir@laptop:~/Downloads/yolov5$ python export.py --weights ../traffic_analyzer/weights/crowdhuman_yolov5m.pt --include onnx
   
/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/numpy/core/getlimits.py:518: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.
  setattr(self, word, getattr(machar, word).flat[0])
/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.
  return self._float_to_str(self.smallest_subnormal)
/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/numpy/core/getlimits.py:518: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
  setattr(self, word, getattr(machar, word).flat[0])
/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
  return self._float_to_str(self.smallest_subnormal)
export: data=data/coco128.yaml, weights=../traffic_analyzer/weights/crowdhuman_yolov5m.pt, imgsz=[640, 640], batch_size=1, device=cpu, half=False, inplace=False, train=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=13, topk_per_class=100, topk_all=100, iou_thres=0.45, conf_thres=0.25, include=['onnx']
YOLOv5 🚀 v6.0-0-g956be8e6 torch 2.7.0+cu126 CPU

Fusing layers... 
Model Summary: 308 layers, 21041679 parameters, 0 gradients

PyTorch: starting from ../traffic_analyzer/weights/crowdhuman_yolov5m.pt (169.0 MB)

ONNX: starting export with onnx 1.18.0...
/home/shakir/Downloads/yolov5/models/yolo.py:124: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if augment:
/home/shakir/Downloads/yolov5/models/yolo.py:147: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if profile:
/home/shakir/Downloads/yolov5/models/yolo.py:151: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if visualize:
/home/shakir/Downloads/yolov5/models/yolo.py:151: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if visualize:
/home/shakir/Downloads/yolov5/models/yolo.py:147: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if profile:
/home/shakir/Downloads/yolov5/models/yolo.py:60: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if self.grid[i].shape[2:4] != x[i].shape[2:4] or self.onnx_dynamic:
ONNX: export success, saved as ../traffic_analyzer/weights/crowdhuman_yolov5m.onnx (84.6 MB)
ONNX: run --dynamic ONNX model inference with: 'python detect.py --weights ../traffic_analyzer/weights/crowdhuman_yolov5m.onnx'

Export complete (2.74s)
Results saved to /home/shakir/Downloads/traffic_analyzer/weights
Visualize with https://netron.app
(venv) shakir@laptop:~/Downloads/yolov5$

Hello! It’s great that you’ve successfully exported your YOLOv5 model to ONNX format.

For converting to TensorRT, the most straightforward approach is often to export directly from your .pt file to the .engine format using the ultralytics Python package. This handles the intermediate ONNX conversion and potential optimizations seamlessly.

You can do this with a simple script:

from ultralytics import YOLO

# Load your custom YOLOv5 .pt model
model = YOLO("../traffic_analyzer/weights/crowdhuman_yolov5m.pt") 

# Export the model to TensorRT format
# This will create a .engine file, e.g., crowdhuman_yolov5m.engine
model.export(format="engine") 

For optimization, you can include arguments like half=True for FP16 precision, int8=True for INT8 quantization (which requires a calibration dataset via the data argument), dynamic=True for dynamic input axes, or workspace to specify GPU memory for TensorRT. For example: model.export(format="engine", half=True, imgsz=640).
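Putting a few of these together, a minimal sketch might look like this (the values are illustrative only; device=0 targets the first GPU, which TensorRT export requires):

from ultralytics import YOLO

# Load the custom checkpoint (path taken from this thread)
model = YOLO("../traffic_analyzer/weights/crowdhuman_yolov5m.pt")

# Build an FP16 engine at 640x640 with a 4 GiB TensorRT workspace
model.export(format="engine", half=True, imgsz=640, workspace=4, device=0)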

The TracerWarnings you encountered during the ONNX export are common with PyTorch’s tracing mechanism and usually don’t prevent a successful conversion to TensorRT, especially when exporting directly to the engine format as the process is managed internally.

If you prefer to use the ONNX file you’ve already generated (crowdhuman_yolov5m.onnx) directly, NVIDIA’s trtexec command-line tool (part of the TensorRT toolkit) can be used to convert ONNX models to TensorRT engines and perform benchmarking.
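A typical invocation looks something like this (--onnx, --saveEngine, and --fp16 are standard trtexec flags; --fp16 is optional, and the paths assume the layout from this thread):

trtexec --onnx=../traffic_analyzer/weights/crowdhuman_yolov5m.onnx \
        --saveEngine=../traffic_analyzer/weights/crowdhuman_yolov5m.engine \
        --fp16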

For more comprehensive information on export options and TensorRT integration, please see our documentation on Model Export with Ultralytics YOLO and the specific TensorRT Export for YOLO11 Models guide.

Try using --include engine instead

The export completes instantly, which is strange, and the engine model doesn’t appear.

python export.py --weights ../traffic_analyzer/weights/crowdhuman_yolov5m.pt --include engine

/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/numpy/core/getlimits.py:518: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.
  setattr(self, word, getattr(machar, word).flat[0])
/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.
  return self._float_to_str(self.smallest_subnormal)
/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/numpy/core/getlimits.py:518: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
  setattr(self, word, getattr(machar, word).flat[0])
/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
  return self._float_to_str(self.smallest_subnormal)
export: data=data/coco128.yaml, weights=../traffic_analyzer/weights/crowdhuman_yolov5m.pt, imgsz=[640, 640], batch_size=1, device=cpu, half=False, inplace=False, train=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=13, topk_per_class=100, topk_all=100, iou_thres=0.45, conf_thres=0.25, include=['engine']
YOLOv5 🚀 v6.0-0-g956be8e6 torch 2.7.0+cu126 CPU

Fusing layers...
Model Summary: 308 layers, 21041679 parameters, 0 gradients

PyTorch: starting from ../traffic_analyzer/weights/crowdhuman_yolov5m.pt (169.0 MB)

Export complete (0.99s)
Results saved to /home/shakir/Downloads/traffic_analyzer/weights
Visualize with https://netron.app
(venv) shakir@laptop:~/Downloads/yolov5$ code ../traffic_analyzer/
(venv) shakir@laptop:~/Downloads/yolov5$ ls ../traffic_analyzer/weights/
convert_model.py crowdhuman_yolov5m.onnx model_data.py test.py yolov8m.engine yolov8_tensorrt_prepared_models_onnx2tensorrt.py
convert_onnx_to_trt.py crowdhuman_yolov5m.pt __pycache__ yolo_trt_infer.py yolov8m.pt

I can share the model with you so you can test it.

That’s a very old version of the repo, at least 4 years old. You should try with the latest repo.
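For example, with standard git commands (run the pull inside your existing yolov5 directory, or clone fresh into a new one):

git pull  # update an existing clone in place
# or:
git clone https://github.com/ultralytics/yolov5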

(venv) shakir@laptop:~/Downloads/yolov5$ git describe --tags
v7.0-419-gcd44191c
(venv) shakir@laptop:~/Downloads/yolov5$ python export.py --weights ../traffic_analyzer/weights/crowdhuman_yolov5m.pt --include engine
export: data=data/coco128.yaml, weights=['../traffic_analyzer/weights/crowdhuman_yolov5m.pt'], imgsz=[640, 640], batch_size=1, device=cpu, half=False, inplace=False, keras=False, optimize=False, int8=False, per_tensor=False, dynamic=False, cache=, simplify=False, mlmodel=False, opset=17, verbose=False, workspace=4, nms=False, agnostic_nms=False, topk_per_class=100, topk_all=100, iou_thres=0.45, conf_thres=0.25, include=['engine']
YOLOv5 🚀 v7.0-419-gcd44191c Python-3.10.12 torch-2.7.0+cu126 CPU

Fusing layers...
Model summary: 308 layers, 21041679 parameters, 0 gradients
Traceback (most recent call last):
  File "/home/shakir/Downloads/yolov5/export.py", line 1546, in <module>
    main(opt)
  File "/home/shakir/Downloads/yolov5/export.py", line 1541, in main
    run(**vars(opt))
  File "/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/shakir/Downloads/yolov5/export.py", line 1407, in run
    y = model(im)  # dry runs
  File "/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/shakir/Downloads/yolov5/models/yolo.py", line 270, in forward
    return self._forward_once(x, profile, visualize)  # single-scale inference, train
  File "/home/shakir/Downloads/yolov5/models/yolo.py", line 169, in _forward_once
    x = m(x)  # run
  File "/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/shakir/Downloads/yolov5/models/common.py", line 356, in forward
    return self.conv(torch.cat((x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]), 1))
  File "/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/shakir/Downloads/yolov5/models/common.py", line 91, in forward_fuse
    return self.act(self.conv(x))
  File "/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 554, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 549, in _conv_forward
    return F.conv2d(
RuntimeError: Given groups=1, weight of size [48, 12, 3, 3], expected input[1, 48, 320, 320] to have 12 channels, but got 48 channels instead

Seems like the model was trained with a modified version of the YOLOv5 repo, so it’s not compatible with the official YOLOv5 repo.
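If you want to confirm that, one diagnostic sketch is to print the architecture config stored inside the checkpoint (this assumes the usual YOLOv5 checkpoint layout; run it from the yolov5 repo root so the pickled model classes can be imported, and note that weights_only=False is required on torch >= 2.6 and should only be used with checkpoints you trust):

import torch

ckpt = torch.load("../traffic_analyzer/weights/crowdhuman_yolov5m.pt",
                  map_location="cpu", weights_only=False)
# YOLOv5 checkpoints usually store the model under the "model" key,
# with the architecture definition attached as a .yaml attribute
print(ckpt["model"].yaml)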

You can use trtexec to convert the ONNX to TensorRT engine manually.

Could you send me the command, please?

Hello!

The RuntimeError you’re encountering and the quick, unsuccessful export to TensorRT (engine) format are most likely because TensorRT engine creation requires an NVIDIA GPU, but your export command is running on the CPU (as indicated by device=cpu in your logs).

To export your YOLOv5 .pt model to TensorRT using the ultralytics package, please ensure you have an NVIDIA GPU available, along with the necessary CUDA and TensorRT libraries installed. You can then use the following command:

yolo export model=../traffic_analyzer/weights/crowdhuman_yolov5m.pt format=engine device=0

Please replace device=0 with the correct ID for your GPU if it’s different.
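If you want to double-check that PyTorch actually sees the GPU before exporting, a quick sanity check with standard PyTorch calls is:

import torch

print(torch.cuda.is_available())      # should print True
print(torch.cuda.get_device_name(0))  # should name your NVIDIA GPU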

Using the ultralytics package generally offers a more streamlined and up-to-date export process. Our Model Export documentation provides more details on this, and we also have a specific guide for TensorRT integration. The export functionalities within our package, such as the export_engine method described in the Ultralytics Exporter reference, are designed to run on a GPU for TensorRT conversion.

I hope this helps!

No, I always run with a GPU.

(venv) shakir@laptop:~/Downloads/yolov5$ yolo export model=../traffic_analyzer/weights/crowdhuman_yolov5m.pt format=engine device=0
Ultralytics 8.3.141 🚀 Python-3.10.12 torch-2.7.0+cu126 CUDA:0 (NVIDIA GeForce RTX 4070 Laptop GPU, 7817MiB)
Fusing layers...
Model summary: 308 layers, 21041679 parameters, 0 gradients
Traceback (most recent call last):
  File "/home/shakir/Downloads/yolov5/venv/bin/yolo", line 8, in <module>
    sys.exit(entrypoint())
  File "/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/ultralytics/cfg/__init__.py", line 981, in entrypoint
    getattr(model, mode)(**overrides)  # default args from model
  File "/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/ultralytics/engine/model.py", line 733, in export
    return Exporter(overrides=args, _callbacks=self.callbacks)(model=self.model)
  File "/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/ultralytics/engine/exporter.py", line 407, in __call__
    y = NMSModel(model, self.args)(im) if self.args.nms and not coreml else model(im)
  File "/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/shakir/Downloads/yolov5/models/yolo.py", line 270, in forward
    return self._forward_once(x, profile, visualize)  # single-scale inference, train
  File "/home/shakir/Downloads/yolov5/models/yolo.py", line 169, in _forward_once
    x = m(x)  # run
  File "/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/shakir/Downloads/yolov5/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/shakir/Downloads/yolov5/models/yolo.py", line 102, in forward
    self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)
RuntimeError: The expanded size of the tensor (1) must match the existing size (80) at non-singleton dimension 3. Target sizes: [1, 3, 1, 1, 2]. Tensor sizes: [3, 80, 80, 2]
(venv) shakir@laptop:~/Downloads/yolov5$ nvidia-smi
Sun May 25 17:34:09 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.144                Driver Version: 570.144        CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4070 ...    Off |   00000000:01:00.0 Off |                  N/A |
| N/A   43C    P8              2W / 115W  |      18MiB /  8188MiB  |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            2269      G   /usr/lib/xorg/Xorg                        4MiB |
+-----------------------------------------------------------------------------------------+

Try with the official YOLOv5 weights; you can download the YOLOv5m weights from https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5m.pt
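For example, something along these lines from the yolov5 repo root (--device 0 selects the first GPU, which engine export needs):

python export.py --weights yolov5m.pt --include engine --device 0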

I don’t need the new official YOLOv5; I need to export my retrained crowd_yolov5.pt to TensorRT.

I’m asking you to test with the official YOLOv5 weights as part of the troubleshooting process.