I’m trying to train a YOLOv8 model for my chatbot project. The purpose of the model is to detect the stock code of the products whose photos customers send.
But there are 1622 different stock codes and only approximately 10 photos for each product. That is the hard part. The easy part is that customers are already sending the same kind of photos I use for training. For example, they take a screenshot of the product from our Instagram profile and send it to us, and that is the kind of photo we train the model with anyway.
When I train the model with 5 classes for testing purposes, the results are excellent. But when I train with all 1622 classes, the results are almost 0.
I am open to any suggestions and support; please, I am waiting for your help.
hyp.yml
# Learning rate and momentum settings
lr0: 0.001 # Initial learning rate
lrf: 0.01 # Final learning rate (multiplied by lr0)
momentum: 0.85 # SGD momentum
weight_decay: 0.0005 # L2 regularization (weight decay)
warmup_epochs: 5.0 # Number of warmup epochs
warmup_momentum: 0.8 # Initial momentum during warmup
warmup_bias_lr: 0.1 # Bias learning rate during warmup
batch: 10
epochs: 150
imgsz: 1280
# Loss function settings
box: 0.05 # Box loss gain (GIoU/DIoU/CIoU)
cls: 1.0 # Class loss gain
iou: 0.2 # IoU threshold (for labeling)
kobj: 1.0 # Objectness loss gain
# Augmentation settings
hsv_h: 0.005 # Image HSV-Hue augmentation (fraction) - very small changes
hsv_s: 0.1 # Image HSV-Saturation augmentation (fraction) - very small changes
hsv_v: 0.1 # Image HSV-Value augmentation (fraction) - very small changes
degrees: 2.0 # Image rotation (+/- degrees)
translate: 0.1 # Image translation (+/- fraction)
scale: 0.5 # Image scaling (+/- gain)
shear: 2.0 # Image shear (+/- degrees)
perspective: 0.0 # Image perspective (+/- fraction), range 0-0.001
flipud: 0.0 # Flip image up-down (probability)
fliplr: 0.0 # Flip image left-right (probability)
mosaic: 1.0 # Mosaic augmentation (probability)
mixup: 1.0 # Mixup augmentation (probability)
copy_paste: 0.0 # Copy-paste augmentation (probability) - disabled here
train.py
from ultralytics import YOLO
import yaml
import wandb
from wandb.integration.ultralytics import add_wandb_callback
# Start the WandB session
if __name__ == "__main__":
    wandb.login()

    with open('hyp.yaml', 'r') as file:
        hyperparameters = yaml.safe_load(file)

    group = "yolov8l"
    deneme = "Alpha"
    project = "Alpha"
    wandb.init(project=project, job_type="training", group=group, name=f"{group}{deneme}", config=hyperparameters)

    # Load the model
    model = YOLO(f"{group}.pt")

    # Add the WandB callback
    add_wandb_callback(model, enable_model_checkpointing=False)

    # Train the model
    model.train(
        data='y.yaml',                    # Dataset configuration file
        epochs=wandb.config['epochs'],    # Number of training epochs
        batch=wandb.config['batch'],      # Batch size
        lr0=wandb.config['lr0'],
        momentum=wandb.config['momentum'],
        weight_decay=wandb.config['weight_decay'],
        project=f'/workspace/{project}',  # Project name (default: runs/train)
        name=f"{group}{deneme}",          # Run name (default: exp)
        cfg='hyp.yaml',                   # Hyperparameter settings
        imgsz=wandb.config['imgsz'],
        rect=True,
        plots=True,
    )

    model.val()

    # End the WandB session
    wandb.finish()
Ten images per class is quite low. Even if there’s unlikely to be significant variation in the class instances themselves, the image capture conditions will vary enough that more images will likely be needed. You mention that there are sometimes screenshots of the product taken from Instagram, which might be a “special” case that you need to include in your training data. One of the main reasons that 10 images is too low is that it doesn’t leave a sufficient number of samples for validation. Generally, many will use an 80/20 split for training and validation, which in your case would be 8/2 images per class. You will probably need to increase your per-class image count to somewhere around 100 images at the absolute minimum, but generally you’ll want even more than that; see the guidance on this page.
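If you want to check where you stand, one quick way is to count the labeled instances per class straight from the YOLO-format label files. This is just a rough sketch; the dataset path and the 125-instance threshold below are my own placeholders, not anything from your setup:

from collections import Counter
from pathlib import Path

# Count how many labeled instances each class has across all label files.
counts = Counter()
for label_file in Path("datasets/products/labels").rglob("*.txt"):
    for line in label_file.read_text().splitlines():
        if line.strip():
            counts[int(line.split()[0])] += 1  # first column is the class index

# With an 80/20 train/val split, ~125 instances per class leaves roughly
# 100 for training and 25 for validation; flag anything below that.
low = {c: n for c, n in counts.items() if n < 125}
print(f"{len(low)} of {len(counts)} classes have fewer than 125 instances")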
That said, you can get some robustness against variation by using augmentations. I think your values for shear and degrees could be increased (assuming it’s likely there would be more than +/- 2 degrees of either during image capture), but you know best what the images will look like. You could also enable perspective augmentation.
Additionally, in most (if not all) cases you should use rect=False, as it’s best to train with square images. There is no distortion to the images when they are made square, so there’s almost no reason to use rect=True, and I would highly recommend disabling it.
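For example, here’s roughly what those overrides look like through the Python API; the augmentation values below are just illustrative starting points I’m assuming, not tuned recommendations:

from ultralytics import YOLO

# Sketch: widen the rotation/shear ranges, add a small perspective warp,
# and train on letterboxed square images instead of rectangular batches.
model = YOLO("yolov8l.pt")
model.train(
    data="y.yaml",
    imgsz=1280,
    rect=False,          # train on square (letterboxed) images
    degrees=10.0,        # +/- rotation in degrees (was 2.0)
    shear=5.0,           # +/- shear in degrees (was 2.0)
    perspective=0.0005,  # perspective warp, valid range 0-0.001
)

The same keys can of course be changed directly in your hyp.yaml instead.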
It would be helpful to see your results plots, as it’s difficult to give much more advice here without them; either share your W&B link or the plots output by Ultralytics. Not knowing what kind of performance you’re observing over the course of training makes it hard to tell where the issues could be. It would also help to see the number of epochs you’re training with, as I have seen some users set this value too low, which will result in poor performance as well.
Oh, I see you also posted on the subreddit with one screenshot. From the looks of that screenshot, you’re training for 100 epochs, which is okay, but for a robust model that will generalize you’ll likely need more.
I don’t see the val/loss plots, but from the screenshot on Reddit it appears the model is overfitting. This is likely due to the small number of samples per class. Again, increasing the image count per class is going to be your best bet. These should be unique and distinct images, not copies of the same image that have been slightly modified (if you want the best result).
I’m getting a 404 error for the W&B project, and as for HUB datasets, I don’t think (not 100% certain) that datasets are shareable, so I won’t be able to view that page either.
I think the problem was with my parameters. When I uploaded my dataset to Ultralytics HUB and trained from there, the model performed well. Most likely the problem was with the rect parameter or the image size, as you said. Here are the new parameters:
hyp_last.yaml
Glad to hear it @Sezer_Karatas! Looks like your model training is going a fair bit better now. Yeah, the rect argument can cause issues, and whenever I see it enabled I try to make sure it gets disabled. Another good thing to keep in mind if you see something similar is to try using all default arguments, as they tend to work well for most datasets.
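For reference, a defaults-only baseline run is about as simple as this (reusing the same model and data file names from your script):

from ultralytics import YOLO

# Sanity-check baseline: only the dataset is specified; everything else
# (learning rate, augmentation, rect, loss gains, epochs) stays at the
# Ultralytics defaults.
model = YOLO("yolov8l.pt")
model.train(data="y.yaml")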
Also, I removed it for you (your last post before the recent screenshot), but generally you don’t want to include your API keys online (your HUB key was shown originally). You can go to your HUB → Settings → API Keys and delete the old one (just to be safe). That’s mostly to prevent someone from using your account (not a big deal at the moment, but it could be something that costs you money in the future or on other platforms).