I am training my YOLO model to detect airplanes and drones. In some of the pictures it is impossible to tell that the object is indeed an airplane, and it even looks like a drone (the pictures are taken from very far away), but I know from the context that it is. Should I still label it as an airplane?
Thank you!
If you are certain it’s an airplane, then you should absolutely label it. During training, you should monitor the model’s performance on these images. The contextual knowledge you have about the “true” object is only really helpful if there is sufficient detail in the image for the model to make a distinction about the object.
As an example: if your image is 640 x 640 (natively) and the object occupies an area of 2 x 3 pixels with effectively uniform values, the model will likely struggle to identify it correctly, because there is not enough information to distinguish it from other objects with a high level of certainty. Of course, there will likely be some differences in the pixel values, which might (with non-zero probability) make identification possible, but you will only know from testing.
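To make that concrete, here is a minimal sketch (not from the original answer) that estimates an annotation’s pixel area from YOLO-format normalized box dimensions, so you can flag objects that may be too small to learn. The 64 px threshold and the 640 x 640 default size are assumptions for illustration.

```python
# Minimal sketch: flag YOLO annotations whose pixel area is very small.
# The 64 px threshold is an assumption; tune it for your own dataset.

def box_pixel_area(w_norm: float, h_norm: float,
                   img_w: int = 640, img_h: int = 640) -> float:
    """YOLO labels store box width/height normalized to [0, 1];
    convert back to an area in pixels."""
    return (w_norm * img_w) * (h_norm * img_h)

def is_tiny(w_norm: float, h_norm: float,
            img_w: int = 640, img_h: int = 640,
            min_area_px: float = 64.0) -> bool:
    """Flag boxes below a minimum pixel area for manual review."""
    return box_pixel_area(w_norm, h_norm, img_w, img_h) < min_area_px

# The 2 x 3 px object from the example above, in a 640 x 640 image
# (~6 px of area), gets flagged:
print(is_tiny(2 / 640, 3 / 640))
```

Running a check like this over your label files is a quick way to build the list of “difficult” candidates automatically before reviewing them by eye.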
In your case, I would do the following:
- Label all instances you know 100% are airplanes
- While labeling, if it would be difficult for a person to tell whether the object is an airplane, record the image name.
- If there are multiple objects in any of these images, also note which line of the label file the “difficult” annotation is on.
- Split your dataset for training and validation, but try to ensure there is a good balance of “difficult” samples in both sets.
- If you have 100 total “difficult” object annotations, try to put ~80 in training and ~20 in validation.
- Train the model on all images and review the performance.
- Train the model again, with all the same settings as before, but temporarily remove the “difficult” annotations.
- Compare the results.
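The “temporarily remove” step above can be sketched as a small script that writes a second copy of the label files with the “difficult” annotations dropped, so the two training runs differ only in those labels. The directory layout and the list format (image stem, optionally followed by the 0-based label-line index) are my assumptions, not part of the original answer.

```python
# Minimal sketch, assuming a list file where each line is an image stem,
# optionally followed by the 0-based line index of the "difficult"
# annotation in that image's YOLO label file (no index = drop them all).
from pathlib import Path

def load_difficult(list_file: str) -> dict:
    """Map image stem -> set of label-line indices to drop (-1 = all)."""
    difficult: dict = {}
    for raw in Path(list_file).read_text().splitlines():
        parts = raw.split()
        if not parts:
            continue
        idxs = difficult.setdefault(parts[0], set())
        idxs.add(int(parts[1]) if len(parts) > 1 else -1)
    return difficult

def write_filtered_labels(src_dir: str, dst_dir: str, difficult: dict) -> None:
    """Copy label files, dropping the annotations marked as difficult."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for label_file in src.glob("*.txt"):
        lines = label_file.read_text().splitlines()
        drop = difficult.get(label_file.stem, set())
        kept = [] if -1 in drop else [
            line for i, line in enumerate(lines) if i not in drop
        ]
        (dst / label_file.name).write_text(
            "\n".join(kept) + ("\n" if kept else "")
        )
```

Then train once against the original labels directory and once against the filtered copy, keeping every other setting (and the random seed) identical, and compare the validation metrics.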
If the model performs better when you exclude the “difficult” annotations, then you should train without them, as they are too challenging for the model to learn. Ultimately, the fastest way to know whether you should annotate these will be based on the requirements of your project. Consider the goal of the project and how relevant these “difficult” samples are to that end goal. Objects that appear very small in an image usually imply one of two things: either the object is actually small, or it is far away.
For instance, suppose the (hypothetical) goal of your project is to identify whether an object captured by a camera at an airport is a drone or an airplane. If an airplane resolves at a distance of 1 km but is only easily identifiable at 0.7 km, while a 200 x 200 mm drone is only easily identifiable at 50 m or closer, then the question becomes: is that sufficient for whatever actions will be taken from such a detection? I have found that having answers to these types of questions is the most important aspect of a computer vision project. If you can clearly define the overall aim and requirements for a project, it will help you understand how to best answer questions such as, “do I need to include these samples?”
One final thought: I find it’s always best to annotate everything up front, since adding a few “extra” labels during the annotation process and filtering them out during training is much easier (and faster) than trying to add annotations later, when something changes and you find out you need them.
This is exactly what I needed. Thank you for your detailed answer.
I really appreciate it!