I haven’t been able to find examples that cover all the different visual inputs and the code for each. I understand that this is very basic for all of you. I tried following a few YouTube videos and have only gotten a photo and my webcam to work.
So how do you tell YOLO to look at a:
photo - this worked but is it correct? results = model("messy desk.jpg")
video - I believe I’ve seen that it needs to be an mp4 file
webcam - this worked but is it correct? results = model.track(0, save=True, show=True, conf=0.2)
RTSP stream -
YouTube video - can you do this?
I appreciate your time, and sorry for the simple question. I’m trying to do this on Ubuntu, in a virtual environment, using Python.
FYI, just below the table on the page that Toxite linked to, you will find code examples for all inference sources. That’s the best way to check for how to use “x” as a source.
photo - this worked but is it correct? results = model("messy desk.jpg")
Yes, but as the Zen of Python states, “Explicit is better than implicit,” so it might be better to use:
results = model.predict("messy desk.jpg")
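For a slightly fuller picture, here is a minimal sketch of the photo case. The helper name predict_image and the weights file "yolo11n.pt" are assumptions for illustration, not something from the thread; swap in whatever model you already load. It checks that the file exists first, since a wrong path is an easy mistake to make.

```python
from pathlib import Path


def predict_image(image_path, weights="yolo11n.pt"):
    """Run prediction on a single image file and return the list of results."""
    path = Path(image_path)
    # Fail early with a clear message; a space in the name ("messy desk.jpg") is fine.
    if not path.is_file():
        raise FileNotFoundError(f"No image found at: {path.resolve()}")

    # Imported here so the path check above works even without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # assumed weights name; substitute your own
    return model.predict(str(path))


# Example usage (needs ultralytics installed and the image present):
# for result in predict_image("messy desk.jpg"):
#     print(len(result.boxes), "objects detected")
```

Calling model("messy desk.jpg") and model.predict("messy desk.jpg") should behave the same here; the explicit form just makes it easier to tell predict and track apart when reading the code later.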
video - I believe I’ve seen that it needs to be mp4 file type
All supported formats can be found on the Predict page of the docs that Toxite linked above, but if you want to jump directly to that section, it’s here.
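As a rough sketch of the video case: the suffix set below is a deliberately partial list that I am assuming for illustration (the Predict page has the full table), and predict_video and "yolo11n.pt" are hypothetical names. The point is that mp4 is not the only option, and that stream=True is a sensible default for video so frames are processed one at a time.

```python
from pathlib import Path

# Assumption: partial list for illustration only; check the docs for the full set.
KNOWN_VIDEO_SUFFIXES = {".mp4", ".avi", ".mov", ".mkv"}


def looks_like_video(path):
    """Return True if the file suffix is in the (partial) list above."""
    return Path(path).suffix.lower() in KNOWN_VIDEO_SUFFIXES


def predict_video(video_path, weights="yolo11n.pt"):
    """Return a generator that yields one result per video frame."""
    if not looks_like_video(video_path):
        raise ValueError(f"Suffix not in the known list: {video_path}")

    # Imported here so the suffix check works without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # assumed weights name; substitute your own
    # stream=True avoids holding the results for every frame in memory at once.
    return model.predict(str(video_path), stream=True)


# Example usage (needs ultralytics installed and a real video file):
# for frame_result in predict_video("desk_tour.mp4"):
#     print(len(frame_result.boxes))
```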
webcam - this worked but is it correct? results = model.track(0, save=True, show=True, conf=0.2)
Yes. Using model.track(0) or model.predict(0) will generally access your webcam; however, if there are multiple cameras connected, the index could be a different number. See this tutorial and related documentation from OpenCV (which is what’s used to access the webcam stream).
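If 0 does not open the camera you expect, one way to find the right number is to probe a few indices with OpenCV directly. This is a sketch, not anything Ultralytics provides: find_camera_indices and its open_capture parameter are hypothetical names, and the upper bound of 4 is an arbitrary assumption.

```python
def find_camera_indices(max_index=4, open_capture=None):
    """Return the device indices in 0..max_index that can be opened."""
    if open_capture is None:
        import cv2  # imported lazily; requires opencv-python

        open_capture = cv2.VideoCapture

    found = []
    for index in range(max_index + 1):
        capture = open_capture(index)
        try:
            if capture.isOpened():
                found.append(index)
        finally:
            capture.release()  # always free the device before trying the next one
    return found


# Example usage:
# print(find_camera_indices())
# then pass one of the printed numbers as the source, e.g. model.track(1, show=True)
```

On Linux, OpenCV may print warnings for indices that do not exist; those are harmless for this purpose.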
RTSP stream -
Here it would be the same as any other URL or input object (again, definitely check out the documentation); however, for a continuous stream you will want to pass stream=True and iterate over the results:
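from ultralytics import YOLO

model = YOLO("yolo11n.pt")  # assumed weights file; use whichever model you already load

# NOTE: not a real RTSP URL
stream = model.predict(
    "rtsp://example.com/sample-rtsp-stream",
    stream=True,  # returns a generator; must iterate to get results
)
results = []
for data in stream:
    results.append(data)  # keeps every frame's result in memory; not advised for long streams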
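Because appending every result consumes resources quickly, an alternative is to reduce each result to the numbers you care about as it arrives and then let it go. The sketch below assumes a detection model, where each result has a boxes attribute that supports len(); summarize_stream and max_frames are hypothetical names for illustration.

```python
def summarize_stream(stream, max_frames=None):
    """Consume results one at a time and keep only running totals.

    Returns (frames_seen, total_detections).
    """
    frames_seen = 0
    total_detections = 0
    for result in stream:
        frames_seen += 1
        boxes = result.boxes  # may be None for non-detection tasks
        if boxes is not None:
            total_detections += len(boxes)
        # An RTSP stream never ends on its own, so allow an optional stop condition.
        if max_frames is not None and frames_seen >= max_frames:
            break
    return frames_seen, total_detections


# Example usage with the generator returned by model.predict(..., stream=True):
# frames, detections = summarize_stream(stream, max_frames=300)
# print(frames, detections)
```

Only two integers are kept no matter how long the stream runs, which is the main difference from collecting every result in a list.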