Hi,
I have recently tried training 20 different versions of YOLOv8s on the exact same dataset. My training parameters included `deterministic=False` and `dropout=0.5`, with `freeze=10`. Despite these settings, each training run yielded exactly the same performance at each epoch, and the final precision/confidence and recall/confidence curves were identical for each model (to within 10 significant figures). How is this possible without deterministic training and with random dropout?
Cheers,
Kirby
Hi Kirby,
Thanks for reaching out! It’s intriguing that you’re seeing identical results across multiple training runs, especially with `deterministic=False` and `dropout=0.5`. Here are a few potential reasons and suggestions to investigate further:
- Random Seed: Even with `deterministic=False`, if a random seed is set (e.g., `seed=0`), it can lead to reproducible results. Ensure that no seed is being set explicitly or implicitly in your training script.
- Data Loading: If your dataset is being loaded in a deterministic manner (e.g., shuffling is disabled or the data loader is not randomized), this could lead to identical training batches across runs. Verify that your data loader is configured to shuffle the data.
- Dropout Implementation: While dropout is intended to introduce randomness, ensure that it is correctly implemented and active during training. You might want to double-check the dropout layers in your model configuration.
- Freezing Layers: With `freeze=10`, the first 10 layers of your model are not updated during training. If these layers are responsible for significant feature extraction, the remaining layers might not introduce enough variability. Consider reducing the number of frozen layers to see if it affects the outcome.
- Environment and Dependencies: Ensure that your training environment (e.g., hardware, software versions) is not inadvertently causing deterministic behavior. Sometimes, specific versions of libraries or hardware configurations can lead to unexpected reproducibility.
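The random-seed point above is the most common culprit and is worth making concrete: once an RNG is seeded with a fixed value, every "random" draw (dropout masks, shuffles, augmentations) is reproduced exactly on each run. A minimal sketch using Python's standard `random` module as a stand-in for the framework's RNG:

```python
import random

def dropout_mask(seed, n=16, p=0.5):
    """Simulate a dropout keep-mask drawn from a seeded RNG."""
    rng = random.Random(seed)  # seeding fixes the entire sequence of draws
    return [rng.random() >= p for _ in range(n)]

# Same seed on every "run" -> identical masks, hence identical trajectories
run_a = dropout_mask(seed=0)
run_b = dropout_mask(seed=0)
assert run_a == run_b

# A different seed produces a different mask sequence
run_c = dropout_mask(seed=1)
```

So dropout being "random" is not a contradiction: with the same seed, the same dropout decisions are replayed in the same order every run.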
To further diagnose, you can try the following:
- Explicitly set `deterministic=True` and observe if there’s any change.
- Introduce additional randomness in data augmentation or initialization.
- Check the randomness in dropout by inspecting intermediate outputs during training.
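The last check can be done without touching the model: draw two dropout masks in a row from the same (unreseeded) RNG stream and compare them. A quick sketch of the idea, again simulated with Python's `random` module rather than any particular framework:

```python
import random

p = 0.5   # dropout probability
n = 1000  # number of units

# Two consecutive masks from the SAME RNG stream, with no reseeding in between.
# If dropout is genuinely random, consecutive masks should essentially never match.
mask1 = [random.random() < p for _ in range(n)]
mask2 = [random.random() < p for _ in range(n)]

print("masks identical:", mask1 == mask2)  # expected: False
```

In your actual training, capturing the output of a dropout layer on two successive forward passes (in training mode) and seeing identical activations would confirm that dropout is effectively disabled or replayed from a fixed seed.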
For more detailed guidance, you can refer to the Ultralytics YOLOv8 documentation.
If the issue persists, please ensure you’re using the latest version of the Ultralytics package. If the problem continues, feel free to open a bug report on our GitHub Issues page.
Hope this helps!
The `dropout` hyperparameter is only used for the `classify` task, which you can see in `cfg/default.yaml` (view on GitHub), and this is also reflected in the `train` settings table in the documentation. Additionally, if there are any “random” values, the `seed` for the randomness would need to be changed to another value to get a “different” random outcome.
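In other words, to get 20 genuinely different runs you would pass a different `seed` to each one. A hedged sketch, assuming the standard Ultralytics `YOLO.train` API with the `seed`, `deterministic`, and `freeze` arguments from the train settings table; `yolov8s.pt` and `dataset.yaml` are placeholders for your own weights and data config:

```python
def train_one_run(run_seed):
    # Import inside the function so the sketch can be read without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO("yolov8s.pt")  # placeholder weights file
    return model.train(
        data="dataset.yaml",     # placeholder dataset config
        deterministic=False,
        freeze=10,
        seed=run_seed,           # the key change: a distinct seed per run
    )

seeds = list(range(20))  # one distinct seed per training run
# for s in seeds:
#     train_one_run(s)   # commented out: actual training is long-running
```

With a distinct seed per run, weight initialization of the unfrozen layers, data shuffling, and augmentation all diverge between runs, so the per-epoch metrics and final curves should no longer be identical.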