Using ESP-IDF 5.3, espressif/esp-tflite-micro "^1.3.4", and an ESP32S3 (16MB flash / 8MB PSRAM) with no file system, I flashed the tflite file exported from YOLO11n training and quantization directly to the model partition of the partition table, read it back with esp_partition_read, and ran inference on a synthetically generated static image. Model information: Number of subgraphs: 1
Number of tensors: 817
Number of operators: 413
The results are basically correct, but inference is extremely slow: TensorFlow compute time: 13365554 us. Is the problem with my approach, or is this model simply too complex to run on an ESP32S3?
TFLite Micro prefers int8, not uint8. Ultralytics full_integer_quant is int8 by design (weights, activations, and I/O). You can still feed your uint8 image by quantizing it on-device with the input tensor's quant params (scale/zero_point; for an int8 input the zero_point is typically -128, so in the common case this reduces to subtracting 128 from each pixel) and casting the result to int8.
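As a minimal sketch (assuming the standard tflite-micro C++ API; image_u8 and image_len are placeholder names for your own image buffer), the on-device conversion can look like this:

#include <math.h>
#include <stddef.h>
#include <stdint.h>
#include "tensorflow/lite/micro/micro_interpreter.h"

// Hypothetical helper: write a uint8 image (0-255, already resized to the
// model's input shape) into the interpreter's int8 input tensor.
static void fill_int8_input(tflite::MicroInterpreter* interpreter,
                            const uint8_t* image_u8, size_t image_len) {
  TfLiteTensor* input = interpreter->input(0);
  const float scale = input->params.scale;              // e.g. ~1/255 for [0,1] inputs
  const int32_t zero_point = input->params.zero_point;  // typically -128 for int8
  int8_t* dst = input->data.int8;
  for (size_t i = 0; i < image_len; ++i) {
    // real value = pixel / 255.0, quantized as q = real / scale + zero_point
    int32_t q = (int32_t)lroundf((image_u8[i] / 255.0f) / scale) + zero_point;
    if (q < -128) q = -128;
    if (q > 127)  q = 127;
    dst[i] = (int8_t)q;
  }
  // With scale == 1/255 and zero_point == -128 this reduces to image_u8[i] - 128.
}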
Export the MCU-friendly model directly from Ultralytics with INT8 calibration and a much smaller, fixed input size:
from ultralytics import YOLO
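# int8=True enables full-integer quantization, data= supplies the calibration dataset,
# imgsz=160 fixes a small square input, nms=False keeps NMS out of the exported graph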
YOLO("yolo11n.pt").export(format="tflite", int8=True, imgsz=160, nms=False, data="your_dataset.yaml")
This produces yolo11n_full_integer_quant.tflite. Details are in the TFLite export guide at the Ultralytics docs.
That said, YOLO11n is still heavy for an ESP32S3; ~13 s/inference is in the expected range. To squeeze out more speed: keep imgsz at 128–160, build esp-tflite-micro with the esp-nn optimized kernels enabled, and use a Release/O3 build. Real-time performance typically requires an accelerator or a stronger edge device; see, for example, our Coral Edge TPU on Raspberry Pi guide for a practical path to real-time inference.
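To see how much each change (smaller imgsz, esp-nn kernels, optimization level) actually buys you, time Invoke() alone so image generation and tensor setup are excluded from the measurement. A rough sketch using ESP-IDF's esp_timer, assuming you already have a configured tflite::MicroInterpreter:

#include "esp_log.h"
#include "esp_timer.h"
#include "tensorflow/lite/micro/micro_interpreter.h"

static const char* TAG = "yolo_bench";

// Measures a single inference; call after the input tensor has been filled.
void run_timed_inference(tflite::MicroInterpreter* interpreter) {
  const int64_t t0 = esp_timer_get_time();   // microseconds since boot
  const TfLiteStatus status = interpreter->Invoke();
  const int64_t t1 = esp_timer_get_time();
  if (status != kTfLiteOk) {
    ESP_LOGE(TAG, "Invoke() failed: %d", (int)status);
    return;
  }
  ESP_LOGI(TAG, "Invoke() took %lld us", (long long)(t1 - t0));
}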