I want to create a lightweight model, preferably one that can run in the browser, that detects a website's logo from screenshots of the site.
Here is what I have tried so far. It would be great to get some feedback on whether I am approaching this the right way.
I am exploring variants of YOLO for my use case. Since the pretrained models are trained on the COCO dataset, which has no logo class, I can't use them zero-shot, so I will have to fine-tune one. I am using the Ultralytics API to train the model.
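For reference, the fine-tuning step could look roughly like the sketch below. It is a minimal outline, not a tuned recipe: the checkpoint name `yolov8n.pt`, the hyperparameter values, the ONNX export target, and the helper names `build_train_args` and `fine_tune_and_export` are all assumptions chosen for illustration.

```python
def build_train_args(data_yaml: str, epochs: int = 50, imgsz: int = 640) -> dict:
    """Collect keyword arguments for Ultralytics' YOLO.train().

    The default values here are illustrative assumptions, not recommendations.
    """
    return {"data": data_yaml, "epochs": epochs, "imgsz": imgsz, "batch": 16}


def fine_tune_and_export(data_yaml: str, weights: str = "yolov8n.pt"):
    """Fine-tune a COCO-pretrained checkpoint on the logo dataset, then export it."""
    # Imported lazily so build_train_args() can be used without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)                       # assumed small pretrained checkpoint
    model.train(**build_train_args(data_yaml))  # fine-tune on the single "logo" class
    # ONNX is one possible format for running in a browser runtime (assumption).
    return model.export(format="onnx")
```

A call such as `fine_tune_and_export("logos/data.yaml")` would then train and export in one go, where `logos/data.yaml` is a hypothetical path to the dataset config.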
For the dataset, I could not find an existing one online that has website screenshots with annotated logos, so I am thinking of creating one myself from the top 100 websites. I am not sure whether this will be enough data, but I can get started and see how the performance looks.
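To see how the performance looks on sites the model has never seen, one option is to hold out whole websites for validation instead of individual screenshots. The sketch below is only one way to do that; the function name `split_by_site`, the 80/20 ratio, and the placeholder site names are assumptions.

```python
import random


def split_by_site(sites: list[str], val_fraction: float = 0.2, seed: int = 0):
    """Split site names so that each site lands in exactly one of train/val."""
    shuffled = sorted(set(sites))              # de-duplicate, fix a stable order
    random.Random(seed).shuffle(shuffled)      # reproducible shuffle
    n_val = max(1, round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]  # (train sites, validation sites)


# Placeholder names standing in for the top 100 websites.
train_sites, val_sites = split_by_site([f"site{i}.example" for i in range(100)])
print(len(train_sites), len(val_sites))  # 80 20
```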
I am using Roboflow to annotate the images and then download the dataset so that I can train my YOLO model.
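Assuming the dataset is downloaded in a YOLO text-label format (one `.txt` file per image, one line per box as `class x_center y_center width height`, with coordinates normalized to the range 0 to 1), a quick sanity check on the labels before training might look like this. The helper name `parse_yolo_label` is hypothetical.

```python
def parse_yolo_label(line: str) -> tuple[int, float, float, float, float]:
    """Parse one YOLO label line and check that the box is normalized."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    x, y, w, h = (float(p) for p in parts[1:])
    if not all(0.0 <= v <= 1.0 for v in (x, y, w, h)):
        raise ValueError(f"coordinates must be in [0, 1]: {line!r}")
    return class_id, x, y, w, h


# A logo box near the top-left of a screenshot, class 0 = "logo" (assumed).
print(parse_yolo_label("0 0.12 0.05 0.2 0.06"))
```

Running this over every label file catches boxes that were exported in pixel units or with missing fields before they silently hurt training.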
My question is: is this the right approach, or are there better approaches to this problem?