Speed up inference time for Live Inference with Streamlit Application using Ultralytics YOLO11

I have a custom model trained with YOLO12x, and I am using it with the Live Inference with Streamlit Application using Ultralytics YOLO11 to detect objects in videos, but the inference time per frame is extremely slow. What should I do to improve it? Thank you very much.

YOLO12x is a very large model. If you want better speeds, you should use a smaller model like YOLO11m.
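For example, a smaller model loads and runs exactly the same way (a minimal sketch; the video path is just a placeholder):

from ultralytics import YOLO

# Smaller variants trade a little accuracy for much faster per-frame inference
model = YOLO("yolo11m.pt")  # yolo11s.pt or yolo11n.pt are faster still
results = model("path/to/video.mp4", stream=True)  # stream=True avoids holding all frames in memory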

You can also export to TensorRT if you have an NVIDIA GPU, or to OpenVINO if you don’t have a GPU.
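Exporting is a one-liner in Python (a sketch, assuming your custom weights are in best.pt):

from ultralytics import YOLO

model = YOLO("best.pt")  # your custom-trained weights

# TensorRT engine for NVIDIA GPUs (half=True exports FP16 for extra speed)
model.export(format="engine", half=True)

# OpenVINO for Intel CPUs / machines without a GPU
model.export(format="openvino")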

@Toxite How can I use a model exported to formats like OpenVINO or ONNX with the Live Inference with Streamlit Application using Ultralytics YOLO11?

After you export it, you just use the exported model in place of best.pt when launching the Streamlit application. At the end of the export, it shows you the name of the exported model and its location, which is what you need to use.
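If you run the export from Python, export() also returns that path, so you can capture it directly instead of reading it from the console (a sketch; the directory name shown is just an example):

from ultralytics import YOLO

# export() returns the path of the exported model
exported_path = YOLO("best.pt").export(format="openvino")
print(exported_path)  # e.g. best_openvino_model/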

@Toxite Can you list all the model formats that only support CPU inference?

You can find all the formats here:

You can click on each to get details about them. Some are hardware specific, others are general.
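As a rough sketch, the CPU-friendly exports (ONNX, OpenVINO, etc.) all load through the same YOLO() interface, and you can force CPU inference with device="cpu" (the paths below are placeholders):

from ultralytics import YOLO

# Exported models load through the same YOLO() call; device="cpu" forces CPU inference
model = YOLO("best.onnx")  # or "best_openvino_model/" for OpenVINO
results = model("path/to/video.mp4", device="cpu")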

@Toxite When I tried to launch the Live Inference with Streamlit Application using the command yolo solutions inference model="\path_to_model\best.onnx", I got the following error: "FileNotFoundError: [Errno 2] No such file or directory: '\path_to_model\best.onnx.pt'". What should I do to solve this problem?
(Note: "\path_to_model\best.onnx" is the absolute path to the ONNX model.)

Hello! Thanks for the detailed question.

The error FileNotFoundError: ...best.onnx.pt occurs because the solutions script is designed to work with PyTorch models and automatically appends a .pt extension to the model path you provide.

To use an exported model like ONNX for live inference, the most effective approach is to create a custom Python script for your Streamlit application. You can use the official Live Inference with Streamlit guide as a reference. The main difference will be how you load the model.

Here’s a minimal example to get you started. You can load your ONNX model directly using YOLO() and then build the Streamlit interface around it to process a live video feed.

from ultralytics import YOLO
import streamlit as st
import cv2

# Load your exported ONNX model
model = YOLO('path/to/your/best.onnx')

st.title('YOLO ONNX Live Inference')
run = st.checkbox('Run Inference')
FRAME_WINDOW = st.image([])

cap = cv2.VideoCapture(0)  # Use 0 for webcam

while run:
    ret, frame = cap.read()
    if not ret:
        st.write("Failed to capture image from camera.")
        break
    
    # Run inference on the frame
    results = model(frame)
    annotated_frame = results[0].plot()
    
    # Display the annotated frame
    FRAME_WINDOW.image(annotated_frame, channels="BGR")

cap.release()
st.write('Inference stopped.')

This script provides a basic framework for running your ONNX model. You can expand upon it to add features like video file uploads or other UI controls.
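For example, a video-upload variant could look like this (a sketch only; the temporary-file step is just one way to hand the uploaded bytes to OpenCV, and the model path is a placeholder):

import tempfile

import cv2
import streamlit as st
from ultralytics import YOLO

model = YOLO("path/to/your/best.onnx")

st.title("YOLO ONNX Video File Inference")
uploaded = st.file_uploader("Upload a video", type=["mp4", "avi", "mov"])

if uploaded is not None:
    # cv2.VideoCapture needs a file path, so write the uploaded bytes to a temporary file
    with tempfile.NamedTemporaryFile(delete=False, suffix=".mp4") as tmp:
        tmp.write(uploaded.read())
        video_path = tmp.name

    frame_window = st.image([])
    cap = cv2.VideoCapture(video_path)
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        results = model(frame)
        frame_window.image(results[0].plot(), channels="BGR")
    cap.release()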

I hope this helps!

@pderrenger Can I open a feature request to add the onnx model support for Live Inference with Streamlit Application on GitHub?

Hello! Thanks for reaching out.

The FileNotFoundError you’re seeing happens because the yolo solutions inference command is primarily designed for PyTorch (.pt) models and automatically appends the .pt extension to the model path. This behavior is part of the solutions.Inference class, which is not set up to handle other formats like ONNX directly from the command line.

As a workaround, you can create your own Python script to build the Streamlit application. This gives you direct control over loading your exported ONNX model.

Here’s a minimal example to get you started:

import streamlit as st
import cv2
from ultralytics import YOLO

# Load your ONNX model
# Use a raw string (r"...") or forward slashes for the path on Windows
try:
    model = YOLO(r"C:\path_to_model\best.onnx")
except Exception as e:
    st.error(f"Error loading the ONNX model: {e}")
    st.stop()

st.title("Live Inference with Streamlit and ONNX")

# Initialize webcam
cap = cv2.VideoCapture(0)
frame_placeholder = st.empty()
stop_button = st.button("Stop Inference")

while cap.isOpened() and not stop_button:
    success, frame = cap.read()
    if not success:
        st.warning("Failed to read frame from webcam.")
        break

    # Perform inference
    results = model(frame)
    annotated_frame = results[0].plot()

    # Display results
    frame_placeholder.image(annotated_frame, channels="BGR", use_column_width=True)

cap.release()

You can run this script from your terminal with streamlit run your_script_name.py. This approach will correctly load and use your best.onnx file for live inference.

I hope this helps you move forward with your project!