Best practice for dataset creation

Hello all,

I’m creating a dataset for detecting elephants and rhinos in thermal images. In my first training run, the model detected elephant-like signboards and other elephant-like patterns as elephants. To avoid this, I took those images and explicitly marked them as null during annotation, but after training again I’m getting the same issue. What am I doing wrong?

I have also read the blog post below about the differences.

Should I explicitly mark these images as null while annotating, or just leave them as they are so that my model learns to avoid them?

1. I’m training YOLO11 models.
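For reference, here is a minimal sketch of what "explicitly null" can look like on disk. It assumes the dataset uses YOLO-format `.txt` labels in a `labels` folder parallel to `images`, and that the trainer treats an image whose label file is empty (or missing) as a background image. The function name `ensure_empty_labels` and the folder layout are hypothetical, not something from this thread.

```python
from pathlib import Path

# Image extensions to consider; an assumption, adjust to the dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def ensure_empty_labels(images_dir, labels_dir):
    """Create an empty .txt label file for every image that has none.

    Existing label files are left untouched. Returns the sorted list of
    image stems that received a new empty label file.
    """
    images_dir, labels_dir = Path(images_dir), Path(labels_dir)
    labels_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for image_path in sorted(images_dir.iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files
        label_path = labels_dir / f"{image_path.stem}.txt"
        if not label_path.exists():
            label_path.touch()  # empty file: no objects in this image
            created.append(image_path.stem)
    return created
```

This should only be run on a folder of images that are known to contain no elephants or rhinos. Otherwise, images that were simply never annotated would be turned into background examples.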

Can you post the logs from training?

This is the log from Kaggle.

Box F1 score, all classes: 0.96 at 0.41 confidence
Box PR curve, all classes: 0.98 mAP@0.5
Box recall-confidence curve: 1.00 at 0.0 confidence
Box precision-confidence curve, all classes: 1.00 at 0.93 confidence

The label distribution is a bit imbalanced.

  • Rhinos (~2500 instances) vs Elephants (~1000 instances).
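The instance counts above can be double-checked directly from the label files. This is a rough sketch that assumes YOLO-format labels, where each line starts with an integer class id. The function name `count_instances` is hypothetical, and which id maps to which animal depends on the dataset YAML, which is not shown in this thread.

```python
from collections import Counter
from pathlib import Path


def count_instances(labels_dir):
    """Count boxes per class id across all YOLO-format .txt files in labels_dir."""
    counts = Counter()
    for label_path in Path(labels_dir).glob("*.txt"):
        for line in label_path.read_text().splitlines():
            fields = line.split()
            if fields:  # skip blank lines
                counts[int(fields[0])] += 1  # first field is the class id
    return counts
```

Calling `count_instances("labels/train")` (a hypothetical path) returns a `Counter` keyed by class id, which makes it easy to see how far apart the two classes are before deciding whether to rebalance.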

Can you show the start of the logs, where it shows the number of images?

If you added null images, it should show their number as “background”.
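As an independent check on the number the log reports, the background images can also be counted from the files themselves. The sketch below assumes YOLO-format labels in a folder parallel to the images, and counts an image as background when its label file is missing or contains only whitespace. The function name `count_backgrounds` and the extension list are assumptions for illustration.

```python
from pathlib import Path

# Image extensions to consider; an assumption, adjust to the dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def count_backgrounds(images_dir, labels_dir):
    """Return (labeled, background) image counts for one dataset split.

    An image counts as background when its label file is missing or
    contains nothing but whitespace.
    """
    images_dir, labels_dir = Path(images_dir), Path(labels_dir)
    labeled = background = 0
    for image_path in images_dir.iterdir():
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files
        label_path = labels_dir / f"{image_path.stem}.txt"
        if label_path.exists() and label_path.read_text().strip():
            labeled += 1
        else:
            background += 1
    return labeled, background
```

If the background count here is much larger than what the training log shows, the null images may not be reaching the trainer, for example because they sit outside the folders listed in the dataset config.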

@Toxite Thanks for your reply. Here’s the start of my training log; please check.

You mentioned that you’re attempting to detect objects in thermal images. Are all the images in the training and validation sets thermal images? …including the background images?

The metrics you’re showing appear to be quite decent, so it makes me think that something else is awry with the data. It might be helpful to share some example images from training and testing (where you observe the false positives).

@BurhanQ Thanks for your reply. My dataset consists entirely of thermal images, including the background ones.
The pattern below is being detected as an elephant.

  1. The image below is included in training (among the null images).

That’s quite strange. I presume that the image with an elephant appears significantly different?