Greetings,
I have been working on a custom pose estimation task where I defined a custom skeleton for human keypoints. My goal is to train an auto-annotator model to help label the rest of my dataset more efficiently.
To ensure high-quality annotations, I manually labeled a subset of images using CVAT.ai and followed the standard visibility convention:
- 0 = Out-of-view (keypoint is not visible and not labeled)
- 1 = Occluded (keypoint is present but not visible)
- 2 = Fully visible
During this process, I was very careful to switch the visibility flag to 1 for occluded keypoints: for example, if a person's shoulder, elbow, or hand was blocked by another object or by their own body but was still present in the frame, I marked it as occluded (1) rather than visible (2).
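For context, a line in my YOLO-format pose labels looks roughly like the example below (a hypothetical 3-keypoint skeleton with made-up values, all normalized; the visibility flag is the third value of each keypoint triplet):

```
# class  cx     cy     w      h      x1    y1    v1  x2    y2    v2  x3    y3    v3
0 0.512 0.431 0.220 0.610 0.480 0.210 2 0.455 0.305 1 0.430 0.398 0
```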
After training my YOLOv8-Pose model on this annotated subset, I used it to auto-annotate the rest of the dataset. However, when inspecting the output, I noticed something odd:
- The model does not seem to predict any occluded keypoints (1).
- Instead, it only outputs fully visible keypoints (2) or out-of-view keypoints (0).
- I checked the `results.keypoints.data` object (see the snippet below) and confirmed that every keypoint was either marked as visible with a score of 0.999… or out-of-view (coordinates 0,0 and a score between 0.02 and 0.5), with no predictions of 1 (occluded) at all.
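This is roughly how I inspected the predictions (the weights path and image are placeholders):

```python
from ultralytics import YOLO

model = YOLO("my_pose_model.pt")  # my fine-tuned YOLOv8-Pose weights (placeholder path)
results = model("sample.jpg")     # placeholder image

for r in results:
    # keypoints.data has shape (num_instances, num_keypoints, 3): x, y, confidence
    for instance in r.keypoints.data:
        for x, y, conf in instance.tolist():
            print(f"x={x:.1f}  y={y:.1f}  conf={conf:.3f}")
```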
I did some thorough research on whether the occlusion flag (1) has any effect on training in YOLOv8-Pose, but I found conflicting information:
- Some sources indicate that visibility flags (1 vs. 2) do not affect training, meaning all labeled keypoints are treated equally regardless of visibility.
- Other sources state that occluded keypoints (1) do contribute to training and loss calculations (OKS-based loss).
This leads to my question:
Does setting a keypoint’s visibility to 1 (occluded) have any effect on the training process in YOLOv8-Pose?
Or does the model treat occluded keypoints the same as visible keypoints (2) during optimization?
This is extremely important for me because:
- If occlusion labels matter, I will continue manually switching occluded keypoints from 2 to 1 while fine-tuning my auto-annotated dataset.
- However, if occlusion labels do not impact training, I can skip this tedious manual process, saving a huge amount of time.
Can you confirm how YOLOv8-Pose actually handles occluded keypoints (1) during training? Does it use an OKS-based loss with different weighting for occlusions, or does it simply treat all labeled keypoints equally? In other words, if I skip manually switching the occluded flag on my auto-annotated dataset where needed, would that cause problems?
I am aware that this question has been asked before in the Ultralytics GitHub repo's Discussions section by other users under #6945 and #3409, but the responses from Ultralytics to those inquiries seemed somewhat contradictory, or I misinterpreted them. While one says "no", the other seems to say "yes".
As far as I understand, if the architecture uses the OKS metric during training, then occlusion should come into play. OKS is mentioned in the Ultralytics documentation, but it is still not clear whether the loss actually applies a different weighting to occluded keypoints.
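To make my reading concrete: I assume the keypoint loss reduces to something like the sketch below, where the visibility flag is only used as a binary mask (v > 0), so that 1 and 2 would be weighted identically. This is just my guess at the mechanism, not the actual Ultralytics code; please correct it if it is wrong:

```python
import torch

def oks_style_keypoint_loss(pred_kpts, gt_kpts, sigmas, area):
    """My assumed OKS-style loss. pred_kpts/gt_kpts: (N, K, 3) as (x, y, v),
    sigmas: (K,) per-keypoint scales, area: (N,) object areas."""
    # Assumed binary mask: any labeled keypoint (v == 1 or v == 2) contributes
    # equally; only v == 0 (not labeled) is excluded from the loss.
    kpt_mask = (gt_kpts[..., 2] > 0).float()
    d2 = (pred_kpts[..., 0] - gt_kpts[..., 0]) ** 2 + (pred_kpts[..., 1] - gt_kpts[..., 1]) ** 2
    # OKS-style normalization by per-keypoint sigma and object area
    e = d2 / ((2 * sigmas) ** 2 * (area[:, None] + 1e-9) * 2)
    return ((1 - torch.exp(-e)) * kpt_mask).sum() / (kpt_mask.sum() + 1e-9)
```

If, on top of this, the keypoint confidence is supervised with a binary visible/not-labeled target, that would also explain why I never see an explicit "occluded" state in the output; but again, that is speculation on my part.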
Thank you in advance for the help.