YOLO Annotator Program

Dear Ultralytics community,

I would like to share a project I have been developing, called YOLO Annotator Pro. It is a desktop application intended to support the creation, annotation, and organization of datasets for YOLO-based object detection models. Although the tool is still evolving, my aim is to provide a practical and reliable environment for the full data preparation workflow, from importing images to exporting a structured dataset ready for training.

Purpose of the Tool

The motivation behind YOLO Annotator Pro is to simplify the process of building YOLO datasets by offering:

  • A straightforward interface for annotating images

  • Tools for managing classes, dataset splits, and metadata

  • Integrated frame extraction from video sources, including optional YouTube support

  • Export options compatible with YOLOv8, YOLOv5, and similar frameworks

  • Persistent project files (.yannotator) to ensure reproducibility

I am sharing this here in case it may be useful to others, and to gather feedback from people with more experience in dataset creation and annotation workflows.

Main Features

1. Image Annotation

The annotation system is designed to be simple and efficient:

  • Bounding box creation via click-and-drag

  • Cursor-centered zoom and panning for large images

  • Undo support (Ctrl+Z) and deletion via double-click

  • Class selection through numeric keys (1–9)

2. Class Management

  • Add or remove classes with customizable names and colors

  • Color‑coded bounding boxes

  • Class IDs aligned with YOLO’s standard format

3. Dataset Organization

  • Manual or automatic assignment of images to train, validation, and test splits

  • Configurable auto‑split percentages

  • Verification marking for quality control

  • Real‑time statistics on dataset composition
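For readers wondering what a percentage-based auto-split can look like, here is a minimal sketch. It is not taken from YOLO Annotator Pro; the function name `auto_split`, the default percentages, and the fixed seed are assumptions made for illustration.

```python
import random


def auto_split(image_names, train_pct=70, valid_pct=20, test_pct=10, seed=0):
    """Assign image names to train/valid/test by percentage (hypothetical helper)."""
    if train_pct + valid_pct + test_pct != 100:
        raise ValueError("split percentages must sum to 100")
    names = sorted(image_names)            # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(names)     # fixed seed gives the same split every run
    n_train = len(names) * train_pct // 100
    n_valid = len(names) * valid_pct // 100
    return {
        "train": names[:n_train],
        "valid": names[n_train:n_train + n_valid],
        "test": names[n_train + n_valid:],  # the remainder absorbs rounding
    }
```

Integer percentages avoid floating-point rounding surprises, and seeding the shuffle keeps the split stable between exports, which helps when comparing training runs.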

4. YOLO Export

The tool generates the standard YOLO directory structure:

dataset/
├── train/images/   train/labels/
├── valid/images/   valid/labels/
├── test/images/    test/labels/
└── data.yaml

It also exports:

  • data.yaml ready for Ultralytics

  • classes.txt compatible with other annotation tools

  • Optional copying of images or label‑only export
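As a rough illustration of what a `data.yaml` for the layout above might contain, the sketch below builds one as plain text. The keys `path`, `train`, `val`, `test`, and `names` follow the Ultralytics dataset YAML convention; the helper name `build_data_yaml` and the `dataset` root are assumptions, and the file the tool actually writes may differ.

```python
def build_data_yaml(class_names, root="dataset"):
    """Return data.yaml text for a train/valid/test layout (hypothetical helper)."""
    lines = [
        f"path: {root}",
        "train: train/images",
        "val: valid/images",   # the YAML key is `val` even though the folder is `valid`
        "test: test/images",
        "names:",
    ]
    # Class IDs are the zero-based positions in the class list.
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


def build_classes_txt(class_names):
    """Return classes.txt text: one class name per line, ordered by class ID."""
    return "\n".join(class_names) + "\n"
```

This sketch does no YAML quoting, so class names containing characters such as `:` or `#` would need a proper YAML writer.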

5. Integrated Frame Extractor

  • Extract frames from local video files

  • Optional YouTube support via yt-dlp

  • Control over FPS, output format, JPEG quality, and minimum width

  • Automatic addition of extracted frames to the current project
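For anyone curious how FPS-controlled extraction can work with OpenCV, here is a minimal sketch. It is an illustration under assumptions, not the tool's implementation: the function names and the 30 FPS fallback are mine, and "minimum width" is interpreted here as skipping frames narrower than the threshold.

```python
from pathlib import Path


def frame_step(source_fps, target_fps):
    """Return N such that keeping every N-th frame approximates target_fps."""
    if source_fps <= 0 or target_fps <= 0:
        raise ValueError("fps values must be positive")
    return max(1, round(source_fps / target_fps))


def extract_frames(video_path, out_dir, target_fps=1.0, jpeg_quality=90, min_width=0):
    """Save sampled frames as JPEG files and return how many were written."""
    import cv2  # imported lazily so frame_step works without OpenCV installed

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    cap = cv2.VideoCapture(str(video_path))
    if not cap.isOpened():
        raise IOError(f"cannot open video: {video_path}")
    # Some containers report 0 FPS; fall back to an assumed 30.
    source_fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    step = frame_step(source_fps, target_fps)
    index = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # frame.shape is (height, width, channels)
        if index % step == 0 and frame.shape[1] >= min_width:
            name = out / f"frame_{index:06d}.jpg"
            cv2.imwrite(str(name), frame, [cv2.IMWRITE_JPEG_QUALITY, jpeg_quality])
            saved += 1
        index += 1
    cap.release()
    return saved
```

Sampling by frame index is simple but only approximate for variable-frame-rate video; sampling by timestamp would be more precise.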

6. Persistent Projects

  • Save and load projects in .yannotator format

  • Preservation of annotations, classes, splits, and progress

Recommended Workflow

1. Create a new project
2. Import images or extract frames from video
3. Define classes
4. Annotate bounding boxes
5. Assign dataset splits
6. Export the dataset in YOLO format

Annotation Format

Labels follow the normalized YOLO format:

<class_id> <cx> <cy> <width> <height>

All values are relative to the image dimensions.
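To make the normalization concrete, here is a small sketch that converts a pixel-space box (corner coordinates) into one label line. The function name `to_yolo_line` and the six-decimal formatting are assumptions for illustration.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space corner box into a normalized YOLO label line."""
    cx = (x_min + x_max) / 2 / img_w   # box center x, as a fraction of image width
    cy = (y_min + y_max) / 2 / img_h   # box center y, as a fraction of image height
    w = (x_max - x_min) / img_w        # box width, as a fraction of image width
    h = (y_max - y_min) / img_h        # box height, as a fraction of image height
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"
```

For example, a box from (100, 50) to (300, 250) in a 400×500 image with class 0 becomes `0 0.500000 0.300000 0.500000 0.400000`.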

Technologies and Dependencies

  • PyQt5 for the graphical interface

  • OpenCV for image and video processing

  • yt-dlp (optional) for YouTube support

  • PyInstaller for executable generation

Compatible with Windows, Linux, and macOS.
Requires Python 3.10 or higher.

Invitation to the Community

This project is still under active development, and I am aware that there is much room for improvement. I would be grateful for any feedback regarding usability, missing features, or potential issues. Suggestions from experienced users would be especially valuable, as the goal is to make the tool genuinely helpful for those working with YOLO datasets.

Thank you for taking the time to read this, and for the knowledge shared within this community, which has been instrumental in shaping this project.

Here is the link to the GitHub repository

I hope to develop the next version with support for creating COCO-format datasets.

Thanks for sharing this. It looks like a solid and practical tool for dataset prep.

A few things would make it even more useful for Ultralytics YOLO users: COCO import/export, basic dataset sanity checks before export, and support for tasks beyond bounding boxes, such as segmentation, pose, or OBB (oriented bounding boxes). I’d also suggest using val instead of valid as the default split folder name, since val is the more widely used convention. If you want ideas for your roadmap, the Ultralytics Platform annotation workflow and the Platform overview are good references for smart annotation, review, and dataset management.

Really nice start overall, and COCO support sounds like a great next step.

Nice work. The local project file and frame extraction flow are especially useful because many YOLO dataset issues start before training, when people accidentally mix frames, labels and splits.

A few features I would consider adding if you keep developing it:

  1. A label consistency check before export that flags, for example, empty labels, boxes outside image bounds, duplicate boxes, invalid class IDs, and missing images.
  2. A way to visually review one class at a time across the dataset.
  3. A quick “show me the smallest boxes” view, because small objects often fail after resizing.
  4. Optional segmentation masks, even if the final export is boxes only, because masks make review much easier for irregular objects.
  5. Augmentation preview before export, so users can see whether rotations, blur or colour changes are becoming unrealistic.
  6. A small validation preview that renders a random sample of exported YOLO labels back onto images.
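As a starting point for the consistency check in item 1, here is a minimal sketch that validates the lines of a single YOLO label file. The function name, the message strings, and the small tolerance are assumptions; checking for missing images would need file-system access and is left out.

```python
def check_label_lines(lines, num_classes, eps=1e-6):
    """Return (line_number, message) problems found in one YOLO label file."""
    problems = []
    seen = set()
    for n, raw in enumerate(lines, start=1):
        parts = raw.split()
        if not parts:
            continue  # ignore blank lines
        if len(parts) != 5:
            problems.append((n, "expected 5 fields"))
            continue
        try:
            class_id = int(parts[0])
            cx, cy, w, h = map(float, parts[1:])
        except ValueError:
            problems.append((n, "could not parse values"))
            continue
        if not 0 <= class_id < num_classes:
            problems.append((n, "invalid class id"))
        if w <= 0 or h <= 0:
            problems.append((n, "non-positive box size"))
        elif (cx - w / 2 < -eps or cx + w / 2 > 1 + eps
              or cy - h / 2 < -eps or cy + h / 2 > 1 + eps):
            problems.append((n, "box outside image bounds"))
        key = (class_id, cx, cy, w, h)
        if key in seen:
            problems.append((n, "duplicate box"))
        seen.add(key)
    if not seen:
        # Line number 0 marks a file-level finding.
        problems.append((0, "empty label file"))
    return problems
```

An empty label file is legitimate for background images, so that finding is better treated as a prompt for review than as an error. Sorting the parsed boxes by `w * h` would also give the "smallest boxes" view from item 3.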