While it is a short read, it does promise to show how Ultralytics YOLO11 could help. I have looked around but can't seem to find how exactly this would work. Are there any examples of doing exactly this?
It's doable if the camera is fixed. You can train a YOLO11 pose model to get the keypoints of the needle, work out the direction it's pointing, and then combine that with predefined markers to get the value.
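Very roughly, once such a pose model exists, reading out the value could look something like the sketch below. Everything here is a placeholder assumption: the weights file name, the two-keypoint (tail, tip) order, and the calibration angles/values, which you would measure once for your fixed camera.

```python
# Minimal sketch: read a gauge value from a custom-trained needle pose model.
# "gauge-pose.pt", the keypoint order and the calibration constants are
# placeholders, not anything from the article.
import math
from ultralytics import YOLO

model = YOLO("gauge-pose.pt")        # hypothetical custom pose weights
result = model("gauge.jpg")[0]       # first (and only) image

# keypoints.xy has shape (num_detections, num_keypoints, 2)
tail, tip = result.keypoints.xy[0]   # first detected needle, two keypoints
angle = math.degrees(math.atan2(tip[1] - tail[1], tip[0] - tail[0]))

# Map the angle to a reading using two known (angle, value) markers on the
# dial, e.g. the min and max graduations measured once for the fixed camera.
ANGLE_MIN, VALUE_MIN = -135.0, 0.0   # placeholder calibration
ANGLE_MAX, VALUE_MAX = 135.0, 100.0  # placeholder calibration
reading = VALUE_MIN + (angle - ANGLE_MIN) / (ANGLE_MAX - ANGLE_MIN) * (VALUE_MAX - VALUE_MIN)
print(f"needle angle: {angle:.1f} deg, estimated value: {reading:.1f}")
```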
From the article I had hoped for a concrete example. After reading it again I now see it in a different light: it's plausible, but the author has not tried it nor provided links to examples. It seems like pure marketing.
Anyway, the idea of first training a model to recognise the gauges seems easy enough. Actually, the kind of dials shown in the image I referred to above are already recognised as "clocks" with a confidence close to 1.
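For reference, a minimal sketch of that first check with the pretrained yolo11n.pt weights; the image path is a placeholder:

```python
# Minimal sketch: the pretrained COCO model already finds the dial,
# reporting it as class "clock".
from ultralytics import YOLO

model = YOLO("yolo11n.pt")       # pretrained detection weights
result = model("gauge.jpg")[0]

for box in result.boxes:
    cls_name = model.names[int(box.cls)]
    print(cls_name, float(box.conf), box.xyxy[0].tolist())
    # e.g. "clock 0.98 [x1, y1, x2, y2]" for the dial face
```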
The next step in the article is instance segmentation. However, as far as I can understand from this video, Instance Segmentation - Ultralytics YOLO Docs, it not only detects the object and draws the surrounding box and label, but also masks out the area the object actually occupies and separates it from the surroundings. While this might be a useful step, it is unclear to me how to split the dial object up into smaller items.
Now, more to your point @Toxite, a pose model could probably be used to figure out where the hand points, and then OCR could read the number at the end of that line. While I have a good idea of how to OCR the number once it is found, I have no idea how to train a pose model for this kind of use.
If anyone has more insight and examples, please advise.
You need to label the keypoints on the images using a labeling tool, e.g. marking the tail and the tip of the needle. Look for a keypoint labeling tool; CVAT, for example, supports keypoint labeling. Once you have the dataset, you just train the model.
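A minimal sketch of the training step, assuming the CVAT annotations have been exported to the YOLO pose format and described in a dataset YAML (the file names here are placeholders):

```python
# Minimal sketch: fine-tune a pretrained YOLO11 pose model on the labeled
# needle keypoints. "gauge-pose.yaml" is a hypothetical dataset config with
# kpt_shape: [2, 2] (two keypoints, x/y each) and a single "needle" class.
from ultralytics import YOLO

model = YOLO("yolo11n-pose.pt")      # start from pretrained pose weights
model.train(data="gauge-pose.yaml",  # hypothetical dataset config
            epochs=100, imgsz=640)
```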