Difference in Keypoint Coordinate Formats Between YOLOv8 and YOLO11 TFLite Models

When a Pose model trained with YOLOv8 is converted to TFLite and inference is run in the TFLite environment, the keypoint coordinates are returned as absolute pixel values. However, when the same process is done with YOLO11, the keypoint coordinates are output as relative (normalized) values. Why does this happen, especially since no arguments were passed during the PyTorch-to-TFLite export in either case?
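
For example (hypothetical values for illustration), with a 640x640 input, an absolute keypoint might be (412.3, 287.9) in pixels, whereas the same keypoint expressed as relative values would be (0.644, 0.450), i.e. divided by the input width and height.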

Can you provide code and output examples?

Can you provide the output after running this command in terminal: yolo checks?

The x and y coordinates for the keypoints are at indices 08 and 09 of the output, respectively. The image on the left is from the YOLOv8 model, while the image on the right is from the YOLO11 model. Why are only the keypoints output in a different format?

Please share your inference code. Also, are you using a pretrained model or a custom-trained model?

I am using a custom-trained model. My inference code is below:

import tensorflow as tf
import numpy as np
from PIL import Image
import cv2

# 1. Load the TFLite model
tflite_model_path = "model_path"
interpreter = tf.lite.Interpreter(model_path=tflite_model_path)
interpreter.allocate_tensors()

# Get the input and output details of the model
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
print("Input details:", input_details)
print("Output details:", output_details)

# 2. Read and preprocess the input image
image_path = "image_path"
image = Image.open(image_path).convert('RGB')

# Get the input size expected by the model
input_shape = input_details[0]['shape']
input_height = input_shape[1]
input_width = input_shape[2]

# Resize the image
image_resized = image.resize((input_width, input_height))
image_data = np.array(image_resized, dtype=np.float32)

# Normalize the image
image_data /= 255.0

# Add the batch dimension
input_data = np.expand_dims(image_data, axis=0)

# 3. Perform inference with the model
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()

# 4. Get the model output
output_data = interpreter.get_tensor(output_details[0]['index'])
output_data = np.squeeze(output_data, axis=0)  # Remove the batch dimension
output_data = output_data.transpose(1, 0)  # Transpose to (num_detections, num_values)

# Create 4 lists to store detection results for each position
position_points = [[] for _ in range(4)]
confidence_threshold = 0.7
for detection in output_data:
    # In this model's output layout, indices 8 and 9 hold a keypoint's
    # x and y (pixel values with the YOLOv8 export, normalized 0-1
    # values with the YOLO11 export)
    point_x = detection[8]
    point_y = detection[9]

Try using:

from ultralytics import YOLO

image_path = "image_path"
tflite_model_path = "model_path"
model = YOLO(tflite_model_path, task="pose")

results = model.predict(image_path)
for result in results:
    print(result.keypoints.xy)

See the working with keypoint results section in the docs for additional information.
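
As an illustration, the Keypoints object on each result exposes both pixel and normalized coordinates, so downstream code can use whichever format it expects. A minimal sketch, reusing the placeholder paths from the snippet above:

from ultralytics import YOLO

model = YOLO("model_path", task="pose")  # placeholder path, as above
results = model.predict("image_path")    # placeholder path, as above

for result in results:
    print(result.keypoints.xy)    # keypoints in pixels of the original image
    print(result.keypoints.xyn)   # keypoints normalized to the 0-1 range
    print(result.keypoints.conf)  # per-keypoint confidence, if the model predicts it

This keeps the format handling inside the Ultralytics post-processing rather than in custom index-based parsing.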

The important thing to remember is that you can (and always should) test exported models with the YOLO class.

The normalization was probably added in this PR to fix bugs with EdgeTPU export.

So if you export the YOLOv8 model to TFLite using the latest version, you should see relative/normalized pose coordinates for it as well, even with custom inference code.
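
If you need to keep a raw TensorFlow Lite inference path that works with both older and newer exports, one option is to rescale normalized keypoints back to pixels in your own post-processing. This is a minimal sketch with a hypothetical helper, assuming point_x and point_y come from the detection rows as in the code above; it is a workaround, not an official TFLite or Ultralytics setting:

def to_input_pixels(point_x, point_y, input_width, input_height):
    # Heuristic: normalized exports keep coordinates in the 0-1 range,
    # while absolute exports emit values in input-image pixels
    if 0.0 <= point_x <= 1.0 and 0.0 <= point_y <= 1.0:
        return point_x * input_width, point_y * input_height
    return point_x, point_y

point_x, point_y = to_input_pixels(detection[8], detection[9], input_width, input_height)

The heuristic can misfire if a keypoint legitimately falls inside the top-left pixel, so re-exporting both models with the same Ultralytics version remains the cleaner fix.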

Thank you for your response. With the YOLO class, relative coordinates are indeed returned, but when I implement inference directly in a TensorFlow environment, the output format still differs depending on the model version. Are there any settings on the TensorFlow side to standardize the output? If you know of any, I would appreciate your guidance.

Got it. I will update to the latest version and verify. Thank you for your detailed answer.
