Hi everyone,
I’ve been reviewing the Ultralytics documentation on TensorRT integration for YOLOv11, and I’m trying to better understand what post-training quantization (PTQ) methods are actually supported when exporting YOLO models to TensorRT.
From what I’ve gathered, it seems that only static PTQ with calibration is supported, specifically for INT8 precision. This involves supplying a representative calibration dataset during export or conversion. Aside from that, FP16 mixed precision is available, but that doesn’t require calibration and isn’t technically a quantization method in the same sense.
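For reference, here's roughly the export flow I'm describing, based on my reading of the docs (a minimal sketch — the `yolo11n.pt` weights and `coco8.yaml` dataset file are just placeholders for whatever model and calibration data you'd actually use):

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")  # placeholder: any YOLO11 checkpoint

# Static INT8 PTQ: TensorRT calibrates activation ranges from the
# dataset referenced by `data`, then bakes the scales into the engine.
model.export(format="engine", int8=True, data="coco8.yaml")

# FP16 "mixed precision": no calibration data needed, since this is a
# straight cast to half precision rather than integer quantization.
model.export(format="engine", half=True)
```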
I’m really curious about the following:
- Is INT8 with calibration really the only PTQ option available for YOLO models in TensorRT?
- Are there any other quantization methods (e.g., dynamic quantization) that have been successfully used with YOLO and TensorRT?
Appreciate any insights or experiences you can share—thanks in advance!
The TensorRT INT8 quantization that's supported in Ultralytics is the same as what is supported via the TensorRT Python API. What would the interest be in using dynamic quantization? AFAIK, dynamic quantization would result in slower inference, since the quantization parameters for activations have to be computed on the fly at inference time, and some operations may still fall back to floating point.
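For anyone curious what that calibration looks like at the TensorRT Python API level, here's a rough sketch of the standard entropy-calibrator pattern (TensorRT 8.x-style API; the ONNX filename, batch shapes, and cache path are placeholders, and normally the Ultralytics exporter handles all of this for you):

```python
import os

import numpy as np
import pycuda.autoinit  # noqa: F401  # creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt


class EntropyCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds preprocessed batches to TensorRT while it collects activation statistics."""

    def __init__(self, batches, cache_file="calib.cache"):
        super().__init__()
        self.batches = batches        # list of float32 arrays, all the same shape
        self.index = 0
        self.cache_file = cache_file
        self.device_input = cuda.mem_alloc(batches[0].nbytes)

    def get_batch_size(self):
        return self.batches[0].shape[0]

    def get_batch(self, names):
        if self.index >= len(self.batches):
            return None               # None tells TensorRT calibration is done
        cuda.memcpy_htod(self.device_input, np.ascontiguousarray(self.batches[self.index]))
        self.index += 1
        return [int(self.device_input)]

    def read_calibration_cache(self):
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                return f.read()

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)


# Stand-in calibration batches; in practice these are real preprocessed images.
calib_batches = [np.random.rand(8, 3, 640, 640).astype(np.float32) for _ in range(4)]

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("yolo11n.onnx", "rb") as f:  # placeholder ONNX export of the model
    parser.parse(f.read())

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.INT8)
config.int8_calibrator = EntropyCalibrator(calib_batches)
engine_bytes = builder.build_serialized_network(network, config)
```

The key point is that all the calibration data is consumed at build time: the resulting engine has fixed INT8 scales, which is exactly why recalibrating with better data is the lever to pull rather than dynamic quantization.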
I think anyone seeking to use dynamic quantization is going to benefit more from collecting additional data to calibrate the exported model with. Dynamic quantization is supposed to be "more flexible" than static, but that flexibility pertains to the data seen at inference time versus at calibration. By monitoring inference performance and collecting data, one can always export the model again and recalibrate on updated examples, whereas incorporating dynamic quantization is likely to incur performance penalties that are undesirable.
Thank you so much for the clarification 👍
Hi Allan_K,
Glad we could help clarify things for you! Let us know if any other questions come up.