Steps involved
No model fine-tuning is needed; I used the pretrained YOLOv8s-world model as is.
You can set class names manually after loading the model.
I used the video from the OpenAI Sora demo for testing.
classes = ["Elephant walking", "Elephant standing"]
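The steps above can be sketched roughly as follows. This is a minimal sketch, not the code from the comments: it assumes the `ultralytics` package is installed and uses its `YOLOWorld` class with `set_classes` and `predict`. The helper names (`load_detector`, `count_labels`, `run_on_video`), the `conf=0.25` threshold, and the video filename in the usage note are made up for illustration.

```python
from collections import Counter

# Prompts from the post; YOLO-World treats each string as a text prompt.
CLASSES = ["Elephant walking", "Elephant standing"]


def load_detector(weights="yolov8s-world.pt", classes=CLASSES):
    """Load pretrained YOLO-World weights and set a custom vocabulary."""
    # Imported lazily so the helpers below work without the package installed.
    from ultralytics import YOLOWorld

    model = YOLOWorld(weights)        # pretrained weights, no fine-tuning
    model.set_classes(list(classes))  # class names are set after loading
    return model


def count_labels(label_ids, names):
    """Count detections per class name from class ids and an id -> name map."""
    return Counter(names[int(i)] for i in label_ids)


def run_on_video(model, video_path, conf=0.25):
    """Stream the video through the model and tally detections per class.

    The conf threshold is an assumed value, not one taken from the post.
    """
    totals = Counter()
    # stream=True yields one result per frame instead of holding them all in memory.
    for result in model.predict(source=video_path, conf=conf, stream=True):
        totals += count_labels(result.boxes.cls.tolist(), result.names)
    return totals
```

To try it, call `run_on_video(load_detector(), "sora_elephant.mp4")`, where the filename is a placeholder for your local copy of the test video. The returned counter gives the number of detections per prompt across all frames.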
Learn more: YOLO-World (Real-Time Open-Vocabulary Object Detection) - Ultralytics YOLO Docs
The complete code is available in the comments.