Speed up inference time for Live Inference with Streamlit Application using Ultralytics YOLO11

I have a custom model trained with YOLO12x, and I am using it with the Live Inference with Streamlit Application using Ultralytics YOLO11 to detect objects in videos, but the inference time per frame is extremely slow. What should I do to improve it? Thank you very much.

YOLO12x is a very large model. If you want better speeds, you should use a smaller model like YOLO11m.
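For example, a smaller model loads and runs exactly the same way (a minimal sketch; the video path is just a placeholder):

from ultralytics import YOLO

# Smaller variants trade a little accuracy for much faster per-frame inference
model = YOLO("yolo11m.pt")  # yolo11s.pt or yolo11n.pt are faster still
results = model("path/to/video.mp4", stream=True)  # stream=True avoids holding all frames in memory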

You can also export to TensorRT if you have an NVIDIA GPU, or to OpenVINO if you don’t have a GPU.
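Exporting is a one-liner in Python (a sketch, assuming your custom weights are in best.pt):

from ultralytics import YOLO

model = YOLO("best.pt")  # your custom-trained weights

# TensorRT engine for NVIDIA GPUs (half=True exports FP16 for extra speed)
model.export(format="engine", half=True)

# OpenVINO for Intel CPUs / machines without a GPU
model.export(format="openvino")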

@Toxite How can I use a model exported to formats like OpenVINO or ONNX with the Live Inference with Streamlit Application using Ultralytics YOLO11?

After you export it, you just use the exported model in place of best.pt when launching the Streamlit application. At the end of the export, it shows you the name of the exported model and its location, which is what you need to use.
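If you run the export from Python, export() also returns that path, so you can capture it directly instead of reading it from the console (a sketch; the directory name shown is just an example):

from ultralytics import YOLO

# export() returns the path of the exported model
exported_path = YOLO("best.pt").export(format="openvino")
print(exported_path)  # e.g. best_openvino_model/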

@Toxite Can you list all the model formats that only support CPU inference?

You can find all the formats here:

You can click on each to get details about them. Some are hardware specific, others are general.
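As a rough sketch, the CPU-friendly exports (ONNX, OpenVINO, etc.) all load through the same YOLO() interface, and you can force CPU inference with device="cpu" (the paths below are placeholders):

from ultralytics import YOLO

# Exported models load through the same YOLO() call; device="cpu" forces CPU inference
model = YOLO("best.onnx")  # or "best_openvino_model/" for OpenVINO
results = model("path/to/video.mp4", device="cpu")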

@Toxite When I tried to launch the Live Inference with Streamlit Application using the command yolo solutions inference model="\path_to_model\best.onnx", I got the following error: "FileNotFoundError: [Errno 2] No such file or directory: '\path_to_model\best.onnx.pt'". What should I do to solve this problem?
(Note: "\path_to_model\best.onnx" is the absolute path to the ONNX model.)

Hello! Thanks for the detailed question.

The error FileNotFoundError: ...best.onnx.pt occurs because the solutions script is designed to work with PyTorch models and automatically appends a .pt extension to the model path you provide.

To use an exported model like ONNX for live inference, the most effective approach is to create a custom Python script for your Streamlit application. You can use the official Live Inference with Streamlit guide as a reference. The main difference will be how you load the model.

Here’s a minimal example to get you started. You can load your ONNX model directly using YOLO() and then build the Streamlit interface around it to process a live video feed.

from ultralytics import YOLO
import streamlit as st
import cv2

# Load your exported ONNX model
model = YOLO('path/to/your/best.onnx')

st.title('YOLO ONNX Live Inference')
run = st.checkbox('Run Inference')
FRAME_WINDOW = st.image([])

cap = cv2.VideoCapture(0)  # Use 0 for webcam

while run:
    ret, frame = cap.read()
    if not ret:
        st.write("Failed to capture image from camera.")
        break
    
    # Run inference on the frame
    results = model(frame)
    annotated_frame = results[0].plot()
    
    # Display the annotated frame
    FRAME_WINDOW.image(annotated_frame, channels="BGR")

cap.release()
st.write('Inference stopped.')

This script provides a basic framework for running your ONNX model. You can expand upon it to add features like video file uploads or other UI controls.
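For example, a video-upload variant could look like this (a sketch only; the temporary-file step is just one way to hand the uploaded bytes to OpenCV, and the model path is a placeholder):

import tempfile

import cv2
import streamlit as st
from ultralytics import YOLO

model = YOLO("path/to/your/best.onnx")

st.title("YOLO ONNX Video File Inference")
uploaded = st.file_uploader("Upload a video", type=["mp4", "avi", "mov"])

if uploaded is not None:
    # cv2.VideoCapture needs a file path, so write the uploaded bytes to a temporary file
    with tempfile.NamedTemporaryFile(delete=False, suffix=".mp4") as tmp:
        tmp.write(uploaded.read())
        video_path = tmp.name

    frame_window = st.image([])
    cap = cv2.VideoCapture(video_path)
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        results = model(frame)
        frame_window.image(results[0].plot(), channels="BGR")
    cap.release()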

I hope this helps!

@pderrenger Can I open a feature request to add the onnx model support for Live Inference with Streamlit Application on GitHub?

Hello! Thanks for reaching out.

The FileNotFoundError you’re seeing happens because the yolo solutions inference command is primarily designed for PyTorch (.pt) models and automatically appends the .pt extension to the model path. This behavior is part of the solutions.Inference class, which is not set up to handle other formats like ONNX directly from the command line.

As a workaround, you can create your own Python script to build the Streamlit application. This gives you direct control over loading your exported ONNX model.

Here’s a minimal example to get you started:

import streamlit as st
import cv2
from ultralytics import YOLO

# Load your ONNX model
# Use a raw string (r"...") or forward slashes for the path on Windows
try:
    model = YOLO(r"C:\path_to_model\best.onnx")
except Exception as e:
    st.error(f"Error loading the ONNX model: {e}")
    st.stop()

st.title("Live Inference with Streamlit and ONNX")

# Initialize webcam
cap = cv2.VideoCapture(0)
frame_placeholder = st.empty()
stop_button = st.button("Stop Inference")

while cap.isOpened() and not stop_button:
    success, frame = cap.read()
    if not success:
        st.warning("Failed to read frame from webcam.")
        break

    # Perform inference
    results = model(frame)
    annotated_frame = results[0].plot()

    # Display results
    frame_placeholder.image(annotated_frame, channels="BGR", use_column_width=True)

cap.release()

You can run this script from your terminal with streamlit run your_script_name.py. This approach will correctly load and use your best.onnx file for live inference.

I hope this helps you move forward with your project!