Understanding Keypoint Decode

Daan_Seuntjens · October 16, 2024, 1:03pm

In keypoint decode, we apply:
kpts_out * 2 + anchor - 0.5

I assume the -0.5 is there to center the point in the middle of a pixel. However, I do not understand why a factor of 2 is applied in all of the keypoint decoding functions. It seems like an unnecessary operation.
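For readers following along, here is a minimal sketch of the formula as quoted above, written for a single keypoint. The function name and the `stride` argument are assumptions made for illustration (stride scaling is not part of the quoted formula, so it defaults to 1.0); this is not the library's actual code.

```python
def decode_keypoint_xy(raw_x, raw_y, anchor_x, anchor_y, stride=1.0):
    """Apply raw * 2 + anchor - 0.5 to each coordinate.

    `stride` is a hypothetical extra scale from grid units to pixels;
    with the default of 1.0 this is exactly the formula quoted in the post.
    """
    x = (raw_x * 2.0 + anchor_x - 0.5) * stride
    y = (raw_y * 2.0 + anchor_y - 0.5) * stride
    return x, y


# A raw output of zero decodes to the anchor shifted by -0.5.
print(decode_keypoint_xy(0.0, 0.0, 3.5, 7.5))
# A non-zero raw output moves the point by twice its value, in grid units.
print(decode_keypoint_xy(0.25, -0.25, 3.5, 7.5, stride=8.0))
```

The second call shows what the question is about: the raw value 0.25 contributes 0.5 grid units, so the factor of 2 only rescales how far a given raw output moves the point.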

pderrenger · October 16, 2024, 4:28pm

Hello!

Great question! The factor of *2 in the keypoint decoding process is indeed related to scaling the keypoint coordinates. This operation helps adjust the coordinates from a normalized range to the actual pixel space of the image.

The -0.5 adjustment centers the keypoint to the middle of a pixel, as you correctly assumed. This is a common practice in computer vision to ensure more precise localization.

For a deeper dive into how keypoints are handled, you might find the Ultralytics Pose Estimation documentation helpful. It provides insights into the model architecture and processing steps.

If you have more questions or need further clarification, feel free to ask!

Daan_Seuntjens · October 17, 2024, 9:24am

Hi

Thanks for the quick reply!
If you’re open to it, I’d like to dive deeper into the following:

Since the input values are outputs from a nn.Conv2d layer without further activation or processing, why couldn’t the *2 factor be learned by the network and thus avoid an additional operation? I don’t fully understand why the keypoints require a *2 factor for denormalization before decoding.
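To make the point concrete, here is a small sketch in plain Python, assuming the last layer acts as a linear map per output location (as a convolution without activation does). The weights, bias, and feature values are made up for illustration. It only shows that a fixed multiplier can be absorbed into the weights and bias in terms of what the layer can represent; it says nothing about whether training behaves the same.

```python
def linear(weights, bias, features):
    """One output channel of a linear layer at a single location."""
    return sum(w * f for w, f in zip(weights, features)) + bias


# Hypothetical values, chosen only for the demonstration.
weights = [0.3, -1.2, 0.7]
bias = 0.1
features = [1.5, 0.4, -2.0]

# Option A: keep the layer as is and multiply its output by 2 afterwards.
scaled_after = 2.0 * linear(weights, bias, features)

# Option B: fold the factor into the parameters and skip the extra multiply.
folded = linear([2.0 * w for w in weights], 2.0 * bias, features)

print(scaled_after, folded)
```

Both options print the same number, which is why the extra multiplication looks redundant when no bounding activation sits between the layer and the decode step.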

Any insights are welcome, and feel free to get technical!

BurhanQ · October 17, 2024, 1:59pm

@Daan_Seuntjens I don’t know exactly why the points are scaled by 2, but suspect it’s similar to what was done in YOLOv5 for the bounding box centerpoint prediction, which was to help address a few issues.

The point offset range is adjusted from (0, 1) to (-0.5, 1.5), so the offset can easily reach 0 or 1.

So instead of being bound to the range (0, 1), the values are scaled and shifted to (-0.5, 1.5), which was supposed to help reduce grid sensitivity for predictions. This doesn’t directly answer your question, but you can see some of the discussion around it in the issue “Want to figure out critical algorithm of Detect layer” (Issue #471, ultralytics/yolov5, GitHub), which might give you a bit more context about the formula and how it was decided on.
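A quick sketch of that range argument, assuming the offset is computed as sigmoid(t) * 2 - 0.5 as described above. The function names are made up for illustration and this is not code from either repository.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def scaled_offset(t):
    """Sigmoid output stretched from (0, 1) to (-0.5, 1.5)."""
    return sigmoid(t) * 2.0 - 0.5


# A plain sigmoid only approaches 0 and 1 for very large logits.
# After scaling, offsets of 0 and 1 are hit at modest logits of -ln(3) and ln(3).
for t in (-10.0, -math.log(3.0), 0.0, math.log(3.0), 10.0):
    print(f"t={t:+.3f}  sigmoid={sigmoid(t):.4f}  scaled={scaled_offset(t):.4f}")
```

With the scaling, a point sitting exactly on a grid cell boundary no longer requires a saturated sigmoid, which is the grid sensitivity issue mentioned above.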

Daan_Seuntjens · October 18, 2024, 8:33am

This appears correct. YOLOv5 normalizes the keypoint outputs to the range 0–1 using a sigmoid, so applying *2 - 0.5 makes sense. However, in YOLOv8, the keypoint model output is not normalized. This seems like a remnant of YOLOv5’s legacy code. Therefore, omitting the *2 - 0.5 step in YOLOv8 could save some computational resources, albeit insignificantly.

BurhanQ · October 18, 2024, 2:29pm

@Daan_Seuntjens you may be correct that it could be removed, and if you have the time/interest, it would be good to test whether removing it has any impact on model performance. If it doesn’t, you could open a PR, and as long as all the tests pass, I suspect the Team would be happy to merge it, since removing redundant computation is greatly valued.

From a quick look at the code and the formula, I’m guessing that the strides and anchors in this section of the decode_keypoints() method could be part of why the keypoints are multiplied by the constant 2.0. The easiest way to know for certain would be to change it to 1.0, or remove it completely, and see whether prediction, validation, and/or training performance changes.

Daan_Seuntjens · October 22, 2024, 1:29pm

Will do some experiments and open a PR if relevant. It may be a week or two before this is finished. Many thanks for the insights!
