False positives after converting YOLOv8 to .tflite

Hi everybody! New to ultralytics, but so far it’s been an amazing tool. Thank you to the team behind the YOLO models!

Some context: we are trying to improve the object detection in our react-native app, which is using react-native-fast-tflite to load and run our model. My question isn’t about react-native specifically, but using a tflite file is a requirement.

To convert a YOLO model to .tflite, I have a simple Python script that downloads yolov8s and converts it to .tflite format, which we then just drop into our react-native app:

from ultralytics import YOLO
model = YOLO("yolov8s")
model.export(format="tflite")

When running our model, we just get completely random detection boxes. I’ve reproduced what we’re seeing in a codepen here, using tfjs-tflite.

I believe we’re doing everything correctly… I’ve pored over every piece of documentation, relevant GitHub issues, and discussion forums. Our process of interpreting the output seems correct, but I must be missing something. The codepen is full of comments explaining what we’re doing and why, but at a high level, this is what we do in the codepen (which mimics what we do in the react-native javascript logic):

  1. Load model using tflite.loadTFLiteModel
  2. Prepare input tensor data using tf.browser.fromPixels(imageElement)
  3. Normalize pixel values to [-1, 1] using const input = tf.sub(tf.div(tf.expandDims(tensor), 127.5), 1) (note that we have tried [0, 1] with similarly inaccurate results)
  4. Run model prediction using tfliteModel.predict(input), and grab result using await outputTensor.data()
  5. Parse the results and store in a “boxes” array (this might be where we messed things up…?)
  6. Run NMS on the “boxes” array, and draw the results to a canvas

Any help would be hugely appreciated. Our team is building a tool to help tradespeople identify tools and materials on a job site, and find the nearest distributor that supplies these tools. Thank you so much!

Drew

Hi Drew! :blush:

It’s great to hear you’re finding Ultralytics helpful! Let’s see if we can resolve the issue with your .tflite model.

Firstly, ensure that the model conversion process is correctly configured. When exporting to .tflite, make sure the image size and input normalization you use match the training configuration. Here’s a quick checklist:

  1. Image Size: Ensure the input image size during inference matches the size used during training. You can specify this during export with imgsz.

  2. Normalization: YOLO models typically expect pixel values in the range [0, 1]. If you’re normalizing to [-1, 1], it might cause issues. Try sticking with [0, 1].

  3. Output Parsing: Double-check the parsing logic for the model’s output. Ensure the dimensions and indices align with the expected output format of YOLO models (see the parsing sketch after this list).

  4. Non-Maximum Suppression (NMS): Ensure your NMS implementation is correctly configured to filter out overlapping boxes.
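
To illustrate point 3, here is a rough NumPy sketch of how the raw [1, 84, 8400] output of an exported yolov8s detection model is typically laid out and decoded: for each of the 8400 candidate boxes there are 4 box values (cx, cy, w, h) followed by 80 class scores, with no separate objectness score. The placeholder array and the 0.25 confidence threshold below are illustrative, and it’s worth printing a few raw values to confirm whether the coordinates are normalized to [0, 1] or already in pixels.

import numpy as np

# Placeholder for the model's raw output, shape (1, 84, 8400):
# 84 = 4 box values (cx, cy, w, h) + 80 class scores, 8400 candidate boxes.
output = np.zeros((1, 84, 8400), dtype=np.float32)

preds = output[0].T            # (8400, 84): one row per candidate box
boxes_cxcywh = preds[:, :4]    # centre-x, centre-y, width, height
class_scores = preds[:, 4:]    # one score per class (no objectness column)

scores = class_scores.max(axis=1)        # best score per box
class_ids = class_scores.argmax(axis=1)  # index of that class

# Convert to corner format for NMS/drawing; scale by the input size (e.g. 640)
# afterwards if the coordinates turn out to be normalized.
x1 = boxes_cxcywh[:, 0] - boxes_cxcywh[:, 2] / 2
y1 = boxes_cxcywh[:, 1] - boxes_cxcywh[:, 3] / 2
x2 = boxes_cxcywh[:, 0] + boxes_cxcywh[:, 2] / 2
y2 = boxes_cxcywh[:, 1] + boxes_cxcywh[:, 3] / 2

keep = scores > 0.25  # confidence threshold before running NMS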

Here’s a refined export example:

from ultralytics import YOLO

model = YOLO("yolov8s")
model.export(format="tflite", imgsz=640)  # Ensure this matches your training image size

For more detailed guidance, you might find the YOLOv8 Export Documentation helpful.

If the issue persists, consider testing the .tflite model in a simple Python environment using TensorFlow Lite to isolate whether the problem is with the model or the integration with React Native.
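
For example, something along these lines (a minimal sketch: the model path and test image are placeholders, the export usually writes the TFLite file into a yolov8s_saved_model/ folder, and the decoding/NMS steps are omitted):

import numpy as np
import tensorflow as tf
from PIL import Image

# Paths are placeholders; point them at your exported model and a test image.
interpreter = tf.lite.Interpreter(model_path="yolov8s_saved_model/yolov8s_float32.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Match the export size (imgsz=640), NHWC float32, pixel values scaled to [0, 1].
img = Image.open("test.jpg").convert("RGB").resize((640, 640))
x = np.asarray(img, dtype=np.float32)[None] / 255.0  # shape (1, 640, 640, 3)

interpreter.set_tensor(input_details[0]["index"], x)
interpreter.invoke()
out = interpreter.get_tensor(output_details[0]["index"])
print(out.shape)  # expect (1, 84, 8400) for the standard 80-class yolov8s export

# If the raw output decodes to sensible boxes here, the .tflite itself is fine and
# the problem lies in the JavaScript / React Native integration.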

Feel free to reach out if you have more questions. Best of luck with your project—it’s a fantastic initiative! :rocket:

@drewandre Did you find the solution and the fix here? I am having a similar problem right now. While using YOLO directly it detects everything OK, but after converting to tflite and using it with Python or React Native it’s not detecting correctly anymore. It looks like the same problem you had. If you found the solution, please share it with me. Thank you very much.

The OP updated the codepen link they shared with a working solution.

I’ve pasted the JavaScript code from the URL (in case it’s unavailable for any reason); expand the details to see the code.

const labels = [
  // list of class names (removed for brevity)
];

const numClasses = labels.length;

const IMAGE_SIZE = 640;

tflite.setWasmPath(
  "https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-tflite@0.0.1-alpha.10/wasm/"
);

// Helper function to draw bounding boxes on the canvas.
function drawBoxes(boxes_data, scores_data, classes_data) {
  const canvas1 = document.getElementById("canvas-overlay");
  const ctx = canvas1.getContext("2d");
  ctx.clearRect(0, 0, canvas1.width, canvas1.height);

  // font configs
  const font = `${Math.max(
    Math.round(Math.max(ctx.canvas.width, ctx.canvas.height) / 40),
    14
  )}px Arial`;
  ctx.font = font;
  ctx.textBaseline = "top";

  for (let i = 0; i < scores_data.length; ++i) {
    // filter based on class threshold
    const klass = labels[classes_data[i]];
    const color = "#f00";
    const score = (scores_data[i] * 100).toFixed(1);

    let [y1, x1, y2, x2] = boxes_data.slice(i * 4, (i + 1) * 4);
    x1 *= IMAGE_SIZE;
    x2 *= IMAGE_SIZE;
    y1 *= IMAGE_SIZE;
    y2 *= IMAGE_SIZE;
    const width = x2 - x1;
    const height = y2 - y1;

    // draw box.
    ctx.strokeStyle = "#f00";
    ctx.fillStyle = "transparent";
    ctx.fillRect(x1, y1, width, height);

    // draw border box.
    ctx.strokeStyle = color;
    ctx.lineWidth = 2;
    ctx.strokeRect(x1, y1, width, height);

    // Draw the label background.
    const textWidth = ctx.measureText(klass + " - " + score + "%").width;
    const textHeight = parseInt(font, 10); // base 10
    const yText = y1 - (textHeight + ctx.lineWidth);
    ctx.fillRect(
      x1 - 1,
      yText < 0 ? 0 : yText, // handle overflow label box
      textWidth + ctx.lineWidth,
      textHeight + ctx.lineWidth
    );

    // Draw labels
    ctx.fillStyle = "#ffffff";
    ctx.fillText(klass + " - " + score + "%", x1 - 1, yText < 0 ? 0 : yText);
  }
}

async function start() {
  try {
    // Load .tflite model from personal portfolio, mimicking how we import the model
    // from our react-native app's assets folder.
    // According to netron, this model has an input tensor shape of [1, 640, 640, 3],
    // and an output tensor shape of [1, 84, 8400]
    const tfliteModel = await tflite.loadTFLiteModel(
      "https://drewjamesandre.com/.well-known/yolov8s-oiv7_float32.tflite"
    );
    console.log("Model loaded!");

    // Prepare input tensor data from image
    const imageElement = document.getElementById("original-image");
    const tensor = tf.browser.fromPixels(imageElement);

    // The following block of code simply draws the model's input image
    // just for debugging purposes, really (in case our issue is due to
    // some malformed image that is passed to the model)
    const canvas = document.getElementById("canvas-overlay");
    tf.browser.draw(tensor, canvas);
    const dataUrl = canvas.toDataURL();
    const img = document.getElementById("model-image");
    img.src = dataUrl;

    // Convert the pixel values to [0, 1].
    const input = tf.div(tf.expandDims(tensor), 255);

    // Run the model prediction, and save the resulting tensor output data
    const outputTensor = tfliteModel.predict(input);

    // transpose result [b, det, n] => [b, n, det]
    const transRes = outputTensor.transpose([0, 2, 1]);

    // All of the box processing is borrowed from https://github.com/Hyuto/yolov8-tfjs/blob/master/src/utils/detect.js
    const boxes = tf.tidy(() => {
      const w = transRes.slice([0, 0, 2], [-1, -1, 1]); // get width
      const h = transRes.slice([0, 0, 3], [-1, -1, 1]); // get height
      const x1 = tf.sub(transRes.slice([0, 0, 0], [-1, -1, 1]), tf.div(w, 2)); // x1
      const y1 = tf.sub(transRes.slice([0, 0, 1], [-1, -1, 1]), tf.div(h, 2)); // y1
      return tf
        .concat(
          [
            y1,
            x1,
            tf.add(y1, h), // y2
            tf.add(x1, w) // x2
          ],
          2
        )
        .squeeze();
    }); // process boxes [y1, x1, y2, x2]

    const [scores, classes] = tf.tidy(() => {
      // class scores
      const rawScores = transRes
        .slice([0, 0, 4], [-1, -1, numClasses])
        .squeeze(0); // #6 only squeeze axis 0 to handle only 1 class models
      return [rawScores.max(1), rawScores.argMax(1)];
    }); // get max scores and classes index

    const nms = await tf.image.nonMaxSuppressionAsync(
      boxes,
      scores,
      20,
      0.5,
      0.2
    ); // NMS to filter boxes

    const boxes_data = boxes.gather(nms, 0).dataSync(); // indexing boxes by nms index
    const scores_data = scores.gather(nms, 0).dataSync(); // indexing scores by nms index
    const classes_data = classes.gather(nms, 0).dataSync(); // indexing classes by nms index

    // Draw boxes to the existing canvas, which has a background image containing the
    // image supplied to the model
    drawBoxes(boxes_data, scores_data, classes_data);

    // Clean up tensors.
    tf.dispose([outputTensor, transRes, boxes, scores, classes, nms]); // clear memory

    // Remove loading state
    document.getElementById("loading-text").remove();
  } catch (error) {
    console.warn("Failed to predict:", error.message);
  }
}

tf.setBackend("cpu")
  .then(() => start())
  .catch(console.warn);

@BurhanQ Thank you for your quick answer. Using this solution I was able to make it work in the browser, but when using it inside react-native with react-native-fast-tflite the detections are still wrong. So this means the model itself is OK, since it works in the browser, but the react-native-fast-tflite implementation on my side must be wrong.

@drewandre If you have an example with your react native implementation that would help a lot. Thank you.

Here is the code I have now:

const objectDetection = useTensorflowModel(require('../../assets/ai/aimodel.tflite'));
const model = objectDetection.state === 'loaded' ? objectDetection.model : undefined;
const [foundObjects, setFoundObjects] = useState([]);

const convertSnapshotToTensor = async () => {
  const imageUri = Image.resolveAssetSource(exampleImage).uri;

  const convertedArray = await convertToRGB(imageUri);
  return new Float32Array(convertedArray);
};

const processImage = async () => {
  // 1. Resize the image if needed
  // 2. Convert the snapshot to the expected format
  console.log('Start convert snapshot to tensor');
  const inputTensor = await convertSnapshotToTensor();
  console.log('AI inputTensor:', inputTensor.length); // 1228800

  // 3. Run the model and take the first output tensor
  const outputs = await model.run([inputTensor]);
  const detectedOut = outputs[0];
  console.log('AI detectedOut:', typeof detectedOut, detectedOut.length); // 470400

  // 4. Parse the raw output into boxes/scores/classes
  const parsedYoloModels = parseYOLOOutput(detectedOut, 0.5, 0.4);
};

Testing the model in the Netron app, these are the input and output:

Input:

name: **images**
tensor: **`float32[1,640,640,3]`**
denotation: **`Image(RGB)`**
Input image to be detected.
identifier: **0**

Output:

name: **Identity**
tensor: **`float32[1,56,8400]`**
Coordinates of detected objects, class labels, and confidence score
identifier: **407**

I don’t have any errors, just the results are wrong. What am I doing wrong? I have been debugging this for a few days and I still cannot find the issue.

Thank you.

Apologies, I’m not really familiar with JavaScript, so I won’t be much help here. That said, the code snippet you shared seems to reference objects/functions that aren’t defined, so it would be a bit challenging to diagnose (assuming those aren’t well-known objects/functions; I just really don’t know :sweat_smile:).

If you haven’t searched on GitHub already, I would recommend looking for other projects that have used react-native -fast-tflite with an Ultralytics YOLO converted model. It might also be worth trying to debug as many steps in your code to view the image/tensor-shape and compare against the in-browser code to find where they diverge. I’m also wondering if there might be a difference in compatibility or functionality when using react-native versus the in-browser JS.