I’m training YOLO11x on a few thousand JPG images. During each training run, I see warnings about some files being corrupt and then being fixed. The “corrupt” files open just fine so I am not sure what is being “fixed”. However, this in-place fixing breaks experiment reproducibility.
val: /mnt/devel/project/data/yolo/internal/1/images/val/I63273_I542557_039m40r3164.jpg: corrupt JPEG restored and saved
The “corrupt” files open just fine so I am not sure what is being “fixed”.
It would have replaced the original file with the fixed version, so it ought to open fine after that. The corruption being fixed is a truncated or partially downloaded JPEG file.
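To see which files would be flagged without modifying anything, here is a minimal read-only sketch. It assumes that "truncated" means the file does not end with the JPEG end-of-image marker (bytes `FF D9`); the function names `is_truncated_jpeg` and `find_truncated` are made up for illustration and are not part of any library. Many viewers tolerate a missing marker or trailing bytes, which may be why such files still open fine.

```python
from pathlib import Path

JPEG_EOI = b"\xff\xd9"  # JPEG end-of-image marker


def is_truncated_jpeg(path):
    """Return True if the file does not end with the JPEG EOI marker.

    Assumption: this mirrors the kind of check that triggers the warning;
    it only reads the file and never rewrites it.
    """
    with open(path, "rb") as f:
        f.seek(0, 2)           # jump to the end to learn the file size
        if f.tell() < 2:       # too short to contain a marker at all
            return True
        f.seek(-2, 2)          # last two bytes of the file
        return f.read(2) != JPEG_EOI


def find_truncated(root):
    """List every .jpg/.jpeg under root that fails the EOI check."""
    exts = {".jpg", ".jpeg"}
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in exts and is_truncated_jpeg(p)
    )
```

Running `find_truncated` on the dataset directory before training gives a list you can compare against the warnings in the log.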
However, this in-place fixing breaks experiment reproducibility.
Are you sure that it’s due to this? It shouldn’t affect reproducibility because the fix occurs before the image is passed for training. Is your dataset being redownloaded every time? Or is it on a remote mounted drive? It shouldn’t be getting different images corrupted every training run, unless you’re redownloading it or it’s on a network drive.
To me, it seems unlikely that it’s the same image being repaired each time. The repair would fix the image and save it, so the same image isn’t going to appear corrupted again, unless DVC is restoring it to the old version and corrupting it again. Also, you didn’t answer my questions.
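One way to settle whether the same files are being rewritten on every run is to hash the dataset before and after training and compare. The following is a rough sketch; `snapshot` and `changed_files` are hypothetical helper names, and the directory path in the usage note is only an example.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(before, after):
    """Return paths that were added, removed, or modified between snapshots."""
    return sorted(
        name for name in before.keys() | after.keys()
        if before.get(name) != after.get(name)
    )
```

Take `before = snapshot("images/val")` prior to a run and `after = snapshot("images/val")` once it finishes; `changed_files(before, after)` then lists exactly which images were rewritten. If that list is non-empty on a second consecutive run, something (a re-download, a network mount, or a DVC checkout) is restoring the original files between runs.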