YOLO model for detecting cats, foxes, and dogs in urban gardens

Hi All,

I am very new to Ultralytics and YOLO. I am aware there are many forks of the Ultralytics repository on GitHub. Has anyone worked on a model that can detect animals such as cats, foxes, and dogs in urban settings, for example a garden? Since there are so many forks, I was wondering if someone has already started such a model.

I am looking for a starting point, as the standard model does not always seem to detect every variation.

You can try YOLOE with prompts:

https://docs.ultralytics.com/models/yoloe/#__tabbed_2_1
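For reference, prompt-based prediction from that docs page looks roughly like the sketch below. The model filename, prompt list, and image path are placeholders; adapt them to whatever weights and test images you actually use.

```python
from ultralytics import YOLOE

# Load a pretrained YOLOE segmentation model (pick a size that suits your hardware)
model = YOLOE("yoloe-11s-seg.pt")

# Set the text prompts once after loading; the model then detects only these classes
names = ["cat", "dog", "fox"]
model.set_classes(names, model.get_text_pe(names))

# Run prediction on an example image and display the result
results = model.predict("path/to/garden_image.jpg")
results[0].show()
```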

But the prompts depend on the model first detecting the object?

If you mean that it detects all objects first and then filters based on prompt, then no. It detects based on the prompt.

But does the model need to have been trained on the original object first? For example, if I search for a fox but ‘fox’ wasn’t included in the model’s training data, then it won’t find it—even with prompts, right?

The YOLOE model uses text prompts, such as the words “fox” and/or “vulpes”, to detect objects whose activations are similar to the text embeddings of those words. It’s a bit complex if you’re new to the ideas of machine learning and computer vision, but I would recommend checking out the docs page on the YOLOE model for more details.

You can also just try it out with some example images to see if it works as you expect. The ultralytics library should be fairly simple to use for testing your use case, and the documentation pages have a lot of walkthroughs and info to help you out. When questions like “can YOLO do X” or “will this work for Y” are asked, the answer is very often “you have to test it out”, because it’s not common that someone else will have experience with your exact use case or situation. If you get stuck or run into issues, feel free to ask here or in any of our other communities.

@BurhanQ Thank you for your reply

I will read this document.

Can I try with a video stream from my Raspberry Pi?

So by typing a prompt, does this model sort of train itself, or are the prompts we type somehow already built into the model? For example, if we were to specify an object the model has never been trained on, how would it work?

YOLOE is an open-vocabulary model. It was designed and trained to detect objects based on prompts.

You will have to read about open-vocabulary models to understand how they work. There’s no “self-training”. You’re thinking about traditional closed set models which are trained on specific classes. Open-vocabulary models are designed to be able to detect classes they weren’t specifically trained on.

If you want to understand how they work, you will have to read the YOLOE paper.

@Toxite Thanks, I will read up on it, but there must be some limits on what can be detected with the prompts?

There are obviously limits, especially for entirely different domains than what the pretraining was for. But it’s not as limited as closed set models.

@Toxite @BurhanQ When running YOLOE on the Raspberry Pi, do we run the .pt file natively, or do we need to convert it to another format first and use that instead?

Also, I have the following camera connected to my Raspberry Pi:

Waveshare IMX290-83 IR-Cut Camera for Raspberry Pi boards and modules: IMX290 Starlight sensor, 2 MP, onboard IR-CUT switch to toggle between daytime and nighttime modes (Amazon.co.uk listing)

In my other project, I refer to the camera using input="libcamera". Is that compatible with the Ultralytics Python libraries?

You can either run the .pt file directly or export it to another format.
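As a rough sketch of the export route: Ultralytics models generally support model.export(), and the Ultralytics Raspberry Pi guide recommends NCNN for YOLO11. Whether YOLOE exports cleanly to NCNN (or another format) is something you would need to verify for your setup; the filenames below are placeholders.

```python
from ultralytics import YOLOE

# Load the PyTorch weights (these run natively on the Pi, just more slowly)
model = YOLOE("yoloe-11s-seg.pt")

# Optionally export to another format; NCNN is what the Raspberry Pi guide
# recommends for YOLO11, but verify that YOLOE exports cleanly on your side
model.export(format="ncnn")

# Exports typically produce a "*_ncnn_model" directory; loading it this way
# mirrors the YOLO11 docs, so confirm it works the same for YOLOE
ncnn_model = YOLOE("yoloe-11s-seg_ncnn_model")
```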

From searching online, it seems you can’t directly fetch frames from that camera using OpenCV.

You need to manually grab the frames from it and run model.predict() on them to run inference.
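A minimal sketch of that loop, assuming you use the picamera2 library (which talks to libcamera) to grab frames; the model file, prompt list, and resolution are placeholders:

```python
from picamera2 import Picamera2
from ultralytics import YOLOE

# Open the camera through picamera2, which uses libcamera under the hood
picam2 = Picamera2()
picam2.configure(picam2.create_preview_configuration(main={"format": "RGB888", "size": (640, 480)}))
picam2.start()

# Load YOLOE and set the text prompts once
model = YOLOE("yoloe-11s-seg.pt")
names = ["cat", "dog", "fox"]
model.set_classes(names, model.get_text_pe(names))

while True:
    frame = picam2.capture_array()   # numpy array; check the channel order against what the model expects
    results = model.predict(frame)   # run inference on the single frame
    print(results[0].boxes)          # or results[0].plot() to get an annotated frame
```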

I think you should attempt running the model first and then ask questions when you run into issues, instead of asking questions that could have been answered if you had just attempted it. Also, a lot of these general questions can be answered with a simple Google search or by reading the Ultralytics Docs.