zrruziev/coco2yolo.md

## coco2yolo.md

      
coco_to_yolo converter

```
WARNING ⚠️ /path/to/imgname*.jpg: 1 duplicate labels removed
WARNING ⚠️ /path/to/imgname*.jpg: ignoring corrupt image/label: negative label values [ -0.0069444]
WARNING ⚠️ /path/to/imgname*.jpg: ignoring corrupt image/label: non-normalized or out of bounds coordinates [     1.0104]
```

When I converted COCO-format labels (a JSON file) to YOLO format (`*.txt` files) using some libraries from GitHub and then tried to train YOLOv5, I got warnings like the ones shown above. They slowed training down and caused some of the data to be ignored.
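
For context, the three warnings point to three conditions in a label file: repeated rows, negative values, and box values greater than 1. The sketch below reproduces those conditions on plain Python lists. `check_yolo_labels` is a hypothetical helper written for illustration, not YOLOv5's own code, and it assumes each row is `[class_index, x_center, y_center, width, height]` with normalized box values.

```python
def check_yolo_labels(rows):
    """Return a list of problems found in YOLO label rows.

    Each row is assumed to be [class_index, x_center, y_center, width, height],
    with the four box values normalized by the image width and height.
    """
    problems = []

    # Duplicate rows: the same class and box listed more than once
    unique_rows = {tuple(row) for row in rows}
    n_duplicates = len(rows) - len(unique_rows)
    if n_duplicates:
        problems.append(f"{n_duplicates} duplicate labels")

    # Negative values anywhere in a row
    negatives = [v for row in rows for v in row if v < 0]
    if negatives:
        problems.append(f"negative label values {negatives}")

    # Box values (everything after the class index) must not exceed 1
    too_large = [v for row in rows for v in row[1:] if v > 1]
    if too_large:
        problems.append(f"non-normalized or out of bounds coordinates {too_large}")

    return problems


# Made-up rows that trigger each condition once
rows = [
    [0, 0.5, 0.5, 0.2, 0.2],
    [0, 0.5, 0.5, 0.2, 0.2],  # exact duplicate of the first row
    [1, -0.0069444, 0.4, 0.1, 0.1],
    [2, 1.0104, 0.4, 0.1, 0.1],
]
for problem in check_yolo_labels(rows):
    print(problem)
```

Running a check like this on converted labels before training is one way to see whether a converter produces values that would be rejected.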

I then wrote `coco2yolo.py`, which fixes these problems during conversion by clamping the normalized box values to [0, 1] and writing each label line only once:

```python
import argparse
import json
import os
from collections import defaultdict

from tqdm import tqdm


def correct_bbox(bbox):
    # Clamp each normalized value (x_center, y_center, width, height) to [0, 1]
    x_center, y_center, width, height = bbox
    x_center_corr = max(min(x_center, 1), 0)
    y_center_corr = max(min(y_center, 1), 0)
    width_corr = max(min(width, 1), 0)
    height_corr = max(min(height, 1), 0)
    return x_center_corr, y_center_corr, width_corr, height_corr


def coco2yolo(coco_path, output_path):
    # Load COCO annotations
    with open(coco_path, 'r') as f:
        coco = json.load(f)

    # Group annotations by image ID once, so the full annotation list
    # is not rescanned for every image
    annotations_by_image = defaultdict(list)
    for annotation in coco['annotations']:
        annotations_by_image[annotation['image_id']].append(annotation)

    # Create directory to store YOLO format annotations
    os.makedirs(output_path, exist_ok=True)

    # Convert COCO annotations to YOLO format
    for image in tqdm(coco['images'], desc='converting...'):
        # Create YOLO format annotation file
        filename = os.path.splitext(image['file_name'])[0] + '.txt'
        filepath = os.path.join(output_path, filename)
        # Use a set to store unique lines
        unique_lines = set()
        for annotation in annotations_by_image.get(image['id'], []):
            # The COCO category ID is written unchanged as the class index
            class_index = int(annotation['category_id'])
            # Convert bounding box coordinates to YOLO format
            x, y, w, h = annotation['bbox']
            x_center = x + w / 2
            y_center = y + h / 2
            x_center /= image['width']
            y_center /= image['height']
            w /= image['width']
            h /= image['height']
            bbox = [x_center, y_center, w, h]
            x_center_corr, y_center_corr, width_corr, height_corr = correct_bbox(bbox)
            line = '{} {:.6f} {:.6f} {:.6f} {:.6f}'.format(
                class_index, x_center_corr, y_center_corr, width_corr, height_corr)
            # Report every label that had to be corrected
            if bbox != [x_center_corr, y_center_corr, width_corr, height_corr]:
                org_label = (class_index, x_center, y_center, w, h)
                print(f"\nFile: {filename}, Original label: {org_label}")
                print(f"File: {filename}, Corrected label: {line}")
            unique_lines.add(line)
        # Write the unique lines in sorted order so the output is reproducible
        # (an empty file is still created for images without annotations)
        with open(filepath, 'w') as f:
            for line in sorted(unique_lines):
                f.write(line + '\n')


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--coco_path', required=True,
                        help='path to COCO annotations file')
    parser.add_argument('--output_path', required=True,
                        help='path to output directory for YOLO annotations')
    args = parser.parse_args()

    coco2yolo(args.coco_path, args.output_path)
```
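
One thing to watch: the script writes `category_id` straight into the label file as the class index. YOLOv5 expects class indices that start at 0, while COCO category IDs often start at 1 and can have gaps. If your dataset is like that, a remapping along the following lines could be applied. This is only a sketch: `build_class_index` is a hypothetical helper that is not part of the script, and sorting by ID is just one possible ordering convention.

```python
def build_class_index(categories):
    """Map each COCO category ID to a contiguous, zero-based class index.

    `categories` is the list stored under the 'categories' key of a COCO JSON file.
    IDs are sorted so that the mapping is the same on every run.
    """
    sorted_ids = sorted(category['id'] for category in categories)
    return {category_id: index for index, category_id in enumerate(sorted_ids)}


# Made-up categories with one-based IDs and a gap (there is no ID 3)
categories = [
    {'id': 1, 'name': 'person'},
    {'id': 4, 'name': 'motorcycle'},
    {'id': 2, 'name': 'bicycle'},
]
class_index = build_class_index(categories)
print(class_index)  # {1: 0, 2: 1, 4: 2}
```

In `coco2yolo`, the lookup `class_index[annotation['category_id']]` would then take the place of `int(annotation['category_id'])`, and the class names in the training config would need to follow the same order.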

Example usage of `coco2yolo.py`:

```bash
python coco2yolo.py --coco_path /path/to/coco.json --output_path /path/to/output/folder/
```
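
Clamping the four normalized values independently silences the warnings, but it does not look at the box corners: a box whose center and size are each within [0, 1] can still extend past the image edge, and a clamped center no longer matches the original box. An alternative, shown here only as a sketch and not as part of the script, is to clip the box corners to the image in pixel coordinates first and normalize afterwards. `coco_bbox_to_yolo` is a hypothetical name; it assumes the COCO `[x_min, y_min, width, height]` pixel layout and returns `None` when no area is left after clipping.

```python
def coco_bbox_to_yolo(bbox, image_width, image_height):
    """Clip a COCO pixel box to the image, then convert it to normalized YOLO format.

    Returns (x_center, y_center, width, height), each in [0, 1],
    or None if the clipped box has no area.
    """
    x, y, w, h = bbox

    # Clip both corners to the image rectangle
    x_min = min(max(x, 0), image_width)
    y_min = min(max(y, 0), image_height)
    x_max = min(max(x + w, 0), image_width)
    y_max = min(max(y + h, 0), image_height)

    clipped_w = x_max - x_min
    clipped_h = y_max - y_min
    if clipped_w <= 0 or clipped_h <= 0:
        return None  # the box lies entirely outside the image or is degenerate

    # Normalize the clipped box by the image size
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    return x_center, y_center, clipped_w / image_width, clipped_h / image_height


# Made-up example: a box that starts 5 px to the left of a 720x480 image
print(coco_bbox_to_yolo([-5, 100, 105, 50], 720, 480))
# Made-up example: a box completely to the right of the image
print(coco_bbox_to_yolo([800, 10, 20, 20], 720, 480))  # None
```

With this approach the center is recomputed from the clipped corners, so the label describes the visible part of the object rather than a shifted version of the original box.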