I’m trying to train a custom model with the following labels:
- Person (already in coco)
- Bird (already in coco)
- Squirrel (not in coco)
I was planning to use one of the squirrel datasets from Roboflow Universe to train on squirrels. I would use YOLOv5 v6.0, with weights originally trained on coco.
I don’t see a way to train on a custom dataset of squirrels and tell it to also remember the other 2 labels. Is the default behavior of fine-tuning a model to forget all existing labels and only consider new training labels?
Here are a few options I see - would appreciate any suggestions!
Option 1: fine-tune on squirrels + existing 2 coco categories
Extract the Bird and Person images + labels from the coco dataset, combine them with the squirrel images + labels, and then fine-tune starting from the existing pre-trained weights. Based on this advice, I might reduce the number of Bird and Person images taken from the original coco training set so that they are roughly balanced with the number of squirrel images.
Best option?
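To make Option 1 concrete, here is a rough sketch of the extraction step as I imagine it. It assumes the coco annotations are already loaded as a dict (the standard `instances_*.json` layout with `categories`, `images` and `annotations`) and that the new model uses the class order person=0, bird=1, squirrel=2. The function names and that ordering are my own choices, not anything from YOLOv5.

```python
import random
from collections import defaultdict

# Assumed class order for the new 3-class model (my own choice).
TARGET_CLASSES = {"person": 0, "bird": 1, "squirrel": 2}


def coco_to_yolo_subset(coco, keep=("person", "bird")):
    """Return {image file name: [YOLO label lines]} for the kept categories only."""
    # Look categories up by name so no coco category id is hard-coded.
    cat_id_to_new = {
        c["id"]: TARGET_CLASSES[c["name"]]
        for c in coco["categories"]
        if c["name"] in keep
    }
    images = {img["id"]: img for img in coco["images"]}
    labels = defaultdict(list)
    for ann in coco["annotations"]:
        new_id = cat_id_to_new.get(ann["category_id"])
        if new_id is None or ann.get("iscrowd", 0):
            continue  # skip other categories and crowd regions
        img = images[ann["image_id"]]
        x, y, w, h = ann["bbox"]  # coco boxes: top-left x, y, width, height in pixels
        # YOLO labels: class, box centre x/y and width/height, normalised to 0..1
        xc = (x + w / 2) / img["width"]
        yc = (y + h / 2) / img["height"]
        wn = w / img["width"]
        hn = h / img["height"]
        labels[img["file_name"]].append(
            f"{new_id} {xc:.6f} {yc:.6f} {wn:.6f} {hn:.6f}"
        )
    return dict(labels)


def subsample(labels, max_images, seed=0):
    """Keep at most max_images images, chosen reproducibly, to balance classes."""
    names = sorted(labels)
    random.Random(seed).shuffle(names)
    return {name: labels[name] for name in names[:max_images]}
```

I would then call `subsample(coco_to_yolo_subset(coco), max_images=n)` with `n` close to the number of squirrel images, and write each list of lines to a `.txt` file next to the matching image.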
Option 2: fine-tune purely on custom datasets
Ignore the existing coco dataset and instead find datasets in Roboflow Universe for Birds, Persons and Squirrels, combine them into a single custom dataset, and fine-tune starting from the existing pre-trained weights. This might have some advantages if I can find custom datasets for Birds and Persons that match my use case better than the coco images.
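One thing I expect to run into with Option 2 is that each downloaded dataset numbers its classes independently, so class 0 in one export may be "Squirrel" and class 0 in another may be "bird". Below is a minimal sketch of how I would rewrite the YOLO label lines into one shared index. It assumes each dataset comes with a list of class names in index order (as in the `names` entry of a YOLOv5-format `data.yaml`); `remap_label_lines` and `TARGET_INDEX` are hypothetical names of mine.

```python
# Shared class order for the merged dataset (my own choice).
TARGET_INDEX = {"person": 0, "bird": 1, "squirrel": 2}


def remap_label_lines(lines, source_names, target_index=TARGET_INDEX):
    """Rewrite YOLO label lines from a source dataset's class ids to the shared ids.

    Boxes whose class name is not in target_index are dropped.
    """
    remapped = []
    for line in lines:
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        name = source_names[int(parts[0])].strip().lower()
        if name not in target_index:
            continue  # class I don't care about
        remapped.append(" ".join([str(target_index[name])] + parts[1:]))
    return remapped
```

For example, with `source_names = ["Squirrel", "Chipmunk"]`, a line starting with `0` would be rewritten to start with `2`, and any line starting with `1` would be dropped.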
Option 3: fine-tune on squirrel custom dataset, combine with existing other 80 coco labels
I think it’s possible to train on both the original coco and the custom dataset (example), while keeping all the other 80 coco labels. However, wouldn’t this reduce the accuracy compared to a model that was only trained to detect 3 labels?
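For Option 3, my understanding is that the dataset config would need the squirrel class appended after the existing coco names, so it would get index 80 and the class count would become 81. Here is a small sketch of how I would build that config. The keys (`train`, `val`, `nc`, `names`) follow the YOLOv5 dataset YAML layout as I understand it, and `extend_dataset_config` is a hypothetical helper of mine.

```python
def extend_dataset_config(base_names, new_class, train_dirs, val_dirs):
    """Append one class to an existing class list and build a dataset config dict.

    Returns (config, index of the new class).
    """
    if new_class in base_names:
        raise ValueError(f"{new_class!r} is already a class")
    names = list(base_names) + [new_class]
    config = {
        "train": list(train_dirs),  # image folders for coco and the custom set
        "val": list(val_dirs),
        "nc": len(names),           # number of classes
        "names": names,             # class names in index order
    }
    return config, len(names) - 1
```

With the full list of 80 coco names as `base_names`, the returned index would be 80, and the squirrel label files would need their class id rewritten from whatever the custom dataset uses to that value.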
Option 4: retrain from scratch
Since I only care about these 3 labels, should I train from scratch using the subset of coco data for those labels plus the new squirrel labels + custom data? This seems like it would take a long time, so fine-tuning looks like the better approach.
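As far as I can tell, the practical difference between Option 4 and the fine-tuning options is just which weights `train.py` starts from. The sketch below builds the two command lines I have in mind. It assumes the `--data`, `--img`, `--epochs`, `--weights` and `--cfg` flags of the YOLOv5 `train.py` script; the `yolov5s` model size, the epoch count and the `build_train_command` helper are placeholders of mine.

```python
def build_train_command(data_yaml, from_scratch=False, model="yolov5s",
                        epochs=100, img_size=640):
    """Build the argument list for a YOLOv5 train.py run."""
    cmd = ["python", "train.py",
           "--data", data_yaml,
           "--img", str(img_size),
           "--epochs", str(epochs)]
    if from_scratch:
        # Empty weights plus a model config: random initialisation.
        cmd += ["--weights", "", "--cfg", f"{model}.yaml"]
    else:
        # Start from the coco-pretrained checkpoint: fine-tuning.
        cmd += ["--weights", f"{model}.pt"]
    return cmd
```

The returned list could be passed to `subprocess.run(cmd, check=True)` from inside the YOLOv5 repository folder.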