Speed up inference time for Live Inference with Streamlit Application using Ultralytics YOLO11

Suppose I have a custom model trained with YOLO12x, and I have applied it to the Live Inference with Streamlit Application using Ultralytics YOLO11 to detect certain kinds of objects in videos, but the inference time per frame is extremely slow. What should I do to improve it? Thank you very much.

YOLO12x is a very large model. If you want better speeds, you should use a smaller model like YOLO11m.

You can also export to TensorRT if you have an NVIDIA GPU, or to OpenVINO if you don't have a GPU.
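As a minimal sketch (assuming your trained weights are saved as `best.pt`), the export could look like this:

```python
from ultralytics import YOLO

# Load your custom-trained weights (path is an assumption; use your own).
model = YOLO("best.pt")

# NVIDIA GPU: export to TensorRT (produces a .engine file).
model.export(format="engine", half=True)

# CPU only: export to OpenVINO (produces a *_openvino_model/ directory).
model.export(format="openvino")
```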

@Toxite How can I load a model in formats like OpenVINO or ONNX into the Live Inference with Streamlit Application using Ultralytics YOLO11?

After you export it, you just use the exported model in place of best.pt when launching the Streamlit application. At the end of the export, it prints the name and location of the exported model that you need to use.
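For example, a minimal sketch using the `solutions.Inference` entry point from the Streamlit solution docs, and assuming the OpenVINO export produced a directory named `best_openvino_model/` (use whatever path the export step actually printed):

```python
from ultralytics import solutions

# Point the Streamlit solution at the exported model instead of best.pt.
# "best_openvino_model" is an assumed directory name; replace it with the
# path reported at the end of your export.
inf = solutions.Inference(model="best_openvino_model")
inf.inference()
```

Save this as a script and launch it the usual Streamlit way, e.g. `streamlit run app.py`.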

@Toxite Can you list all the model formats that support CPU-only inference?

You can find all the formats here:

You can click on each format to get details about it. Some are hardware-specific, others are general.
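If it helps, here is a minimal sketch of a CPU-friendly workflow, assuming ONNX as the export format and a hypothetical `video.mp4` input:

```python
from ultralytics import YOLO

# Export the trained weights to ONNX, which runs on CPU via ONNX Runtime.
YOLO("best.pt").export(format="onnx")

# Load the exported model and force CPU inference.
model = YOLO("best.onnx")
results = model.predict(source="video.mp4", device="cpu")
```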