Need advice

Hi there!

I need some advice.

My task is to count the passengers on a bus. I have a camera looking vertically down at the door area. (Pictures attached.)

I have trained my model with a single class, so, in my opinion, I can neglect metrics such as the box loss, class loss, and DFL loss. (Am I right?)

My training arguments are:

task: detect
mode: train
model: /content/drive/MyDrive/YOLO_detection/trainCfg-1000-16_DayPlusNight_mozaic_Y11s_runs/train/weights/last.pt
data: /content/data.yaml
epochs: 1000
time: null
patience: 50
batch: 30
imgsz: 640
save: true
save_period: 50
cache: disk
device: null
workers: 32
project: /content/drive/MyDrive/YOLO_detection/trainCfg-1000-16_DayPlusNight_mozaic_Y11s_runs
name: train
exist_ok: false
pretrained: yolo11s.pt
optimizer: auto
verbose: true
seed: 0
deterministic: true
single_cls: true
rect: false
cos_lr: true
close_mosaic: 10
resume: /content/drive/MyDrive/YOLO_detection/trainCfg-1000-16_DayPlusNight_mozaic_Y11s_runs/train/weights/last.pt
amp: true
fraction: 1.0
profile: false
freeze: null
multi_scale: false
compile: false
overlap_mask: true
mask_ratio: 4
dropout: 0.0
val: true
split: val
save_json: false
conf: null
iou: 0.7
max_det: 300
half: false
dnn: false
plots: true
source: null
vid_stride: 1
stream_buffer: false
visualize: false
augment: true
agnostic_nms: false
classes: null
retina_masks: false
embed: null
show: false
save_frames: false
save_txt: false
save_conf: false
save_crop: false
show_labels: true
show_conf: true
show_boxes: true
line_width: null
format: torchscript
keras: false
optimize: false
int8: false
dynamic: false
simplify: true
opset: null
workspace: null
nms: false
lr0: 0.01
lrf: 0.01
momentum: 0.937
weight_decay: 0.0005
warmup_epochs: 3.0
warmup_momentum: 0.8
warmup_bias_lr: 0.0
box: 0.1
cls: 0.1
dfl: 0.1
pose: 12.0
kobj: 1.0
nbs: 64
hsv_h: 0.015
hsv_s: 0.7
hsv_v: 0.4
degrees: 15.0
translate: 0.5
scale: 0.5
shear: 0.0
perspective: 0.0
flipud: 0.0
fliplr: 0.5
bgr: 0.0
mosaic: 1.0
mixup: 0.15
cutmix: 0.0
copy_paste: 0.3
copy_paste_mode: flip
auto_augment: randaugment
erasing: 0.0
cfg: null
tracker: botsort.yaml
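
A minimal sketch of resuming this run with the Ultralytics Python API, using the checkpoint path from the dump above (resume=True restores the remaining settings from the saved run):

from ultralytics import YOLO

# Checkpoint path taken from the args dump above.
ckpt = ("/content/drive/MyDrive/YOLO_detection/"
        "trainCfg-1000-16_DayPlusNight_mozaic_Y11s_runs/train/weights/last.pt")

model = YOLO(ckpt)

# resume=True picks up the optimizer state, epoch counter and the original
# training arguments saved alongside the checkpoint.
model.train(resume=True)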

After 500 epochs I have these metrics:

Precision - 0.934
Recall - 0.941
mAP50 - 0.973
mAP50-95 - 0.77

The accuracy on a real video is about 92%.

I want to know:

Are these metrics already at their limit, or can they be improved?

If they can be, which arguments should I change, and how?

Thank you in advance!

It’s not useful to dump all the training arguments, because it’s hard to tell which were changed and which were kept at their defaults. You should post only the arguments you updated.
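
As an illustration, here is roughly what a launch with only the explicitly changed arguments from the dump above might look like. Which values count as defaults depends on the Ultralytics version, so treat this set as an assumption to verify against your version's defaults, not a confirmed diff:

from ultralytics import YOLO

model = YOLO("yolo11s.pt")

# Only the arguments that appear to differ from stock defaults in the dump
# above; verify this set against the defaults of your Ultralytics version.
model.train(
    data="/content/data.yaml",
    epochs=1000,
    patience=50,
    batch=30,
    save_period=50,
    cache="disk",
    workers=32,
    single_cls=True,
    cos_lr=True,
    box=0.1, cls=0.1, dfl=0.1,    # loss weights lowered well below the defaults
    degrees=15.0, translate=0.5,  # stronger geometric augmentation
    mixup=0.15, copy_paste=0.3,   # extra mixing augmentations
)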

Do you have just a single camera? You should undistort the images and train the model on those; it will probably work better. During inference, run the model on frames after undistorting them as well.
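
A minimal sketch of that undistortion step with OpenCV, assuming the camera intrinsics have already been estimated (the camera matrix K and distortion coefficients dist below are placeholders; in practice they come from a chessboard calibration with cv2.calibrateCamera):

import cv2
import numpy as np

# Placeholder intrinsics -- replace with the values from your own calibration.
K = np.array([[800.0,   0.0, 640.0],
              [  0.0, 800.0, 360.0],
              [  0.0,   0.0,   1.0]])
dist = np.array([-0.30, 0.10, 0.0, 0.0, 0.0])  # k1, k2, p1, p2, k3

cap = cv2.VideoCapture("door_camera.mp4")  # hypothetical source
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Undistort every frame before it is annotated for training
    # or fed to the detector at inference time.
    undistorted = cv2.undistort(frame, K, dist)
    # ... run detection / save the frame for the dataset here ...
cap.release()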

Thank you!

And YES, I have a single camera.