Export with int8 quantization gives 0 at the output

Hello,

I am trying to implement a YOLO model. I have customized the yolo26.yaml file to my needs: some of the operations are not supported by the TFLM library, so I changed some of the conv blocks used. However, whenever I export it to TFLite with int8 quantization using the export function, the detection output is always 0. When I test the model as .pt or as the float32 TFLite file, it performs as it should.

Could you please enlighten me on what could be the problem?

Thank you

end2end models don’t work with static quantization. You can only use the dynamically quantized file, which ends with _int8.tflite.

Thank you for your reply.

However, I already disabled end2end in the .yaml file. Wouldn’t this overcome the problem?

Is the output 0 with _int8.tflite file?

If so, there’s something incompatible with your custom layers, because the default non-end2end YOLO models work fine.

No, the output of the _int8.tflite file is normal. But I cannot use dynamic quantization, because my main goal is to deploy the model on an NPU-equipped MCU. Dynamic quantization keeps the input and output in float32, and my converter does not accept that.

Is there any way to obtain a working model with static quantization? For example, should I use YOLOv8 or YOLOv5 instead?

Static quantization works with YOLOv8 and YOLO11, and also with YOLO26 if you export with end2end=False.
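A minimal sketch of that export, assuming the Ultralytics Python API; "yolo26n.pt" and "coco8.yaml" are placeholder names for your own checkpoint and calibration dataset, and passing end2end through export() is an assumption based on this thread:

```python
# Hedged sketch: full static int8 TFLite export via the Ultralytics
# Python API. Checkpoint and dataset names below are placeholders.
export_args = {
    "format": "tflite",    # TFLite export target
    "int8": True,          # static int8 quantization (needs calibration data)
    "data": "coco8.yaml",  # dataset used to calibrate activation ranges
    "end2end": False,      # keep the classic detection head for YOLO26
}

def export_static_int8(weights="yolo26n.pt"):
    from ultralytics import YOLO  # lazy import; requires ultralytics installed
    model = YOLO(weights)
    # Alongside the dynamic _int8.tflite file, this should also write the
    # fully quantized *_full_integer_quant.tflite file for NPU deployment.
    return model.export(**export_args)
```

If your Ultralytics version rejects the end2end argument for non-YOLO26 models, drop it and use a YOLOv8 or YOLO11 checkpoint instead.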

Could you please tell me how to do it?

I tried with end2end=False and also with YOLOv8, but I still get 0 at the output of the model when I run the fully quantized file.

Thank you for your time.

Are you using the latest Ultralytics?

Yes, I believe so.

It works fine for me with the latest Ultralytics:

image 1/2 /ultralytics/ultralytics/assets/bus.jpg: 640x640 4 persons, 1 bus, 22.2ms
image 2/2 /ultralytics/ultralytics/assets/zidane.jpg: 640x640 3 persons, 21.0ms
Speed: 11.3ms preprocess, 21.6ms inference, 1.0ms postprocess per image at shape (1, 3, 640, 640)