@zrruziev
Last active March 14, 2023 10:27
`yolov5` WARNING ⚠️ : duplicate labels removed; ignoring corrupt image/label: negative label values; ignoring corrupt image/label: non-normalized or out of bounds coordinates

Yolo label correction

While training yolov5 on my custom dataset, I ran into these warnings ⚠️:

WARNING ⚠️ /path/to/imgname*.jpg: 1 duplicate labels removed
WARNING ⚠️ /path/to/imgname*.jpg: ignoring corrupt image/label: negative label values [ -0.0069444]
WARNING ⚠️ /path/to/imgname*.jpg: ignoring corrupt image/label: non-normalized or out of bounds coordinates [     1.0104]

It happened because of ultralytics/yolov5#857 (comment) and ultralytics/yolov5#10149
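
Before changing any files, it can help to see which labels would trigger each warning. The sketch below is my own checker, not YOLOv5's code: it only mirrors the three warning messages above. The function name `find_label_issues` and the sample lines are made up for illustration, and duplicates are detected by exact match of the parsed values.

```python
def find_label_issues(lines):
    """Return a list of human-readable issues found in one label file's lines."""
    issues = []
    seen = set()
    for lineno, raw in enumerate(lines, start=1):
        parts = raw.split()
        if not parts:
            continue  # skip blank lines
        values = tuple(float(p) for p in parts)
        # A line whose parsed values were already seen is a duplicate label.
        if values in seen:
            issues.append(f"line {lineno}: duplicate label")
        seen.add(values)
        # No value (class index included) should be negative.
        if any(v < 0 for v in values):
            issues.append(f"line {lineno}: negative label values")
        # Coordinates (everything after the class index) should not exceed 1.
        if any(v > 1 for v in values[1:]):
            issues.append(f"line {lineno}: non-normalized or out of bounds coordinates")
    return issues


# Hypothetical sample lines, not taken from a real dataset.
sample = [
    "4 0.465278 -0.003472 0.090278 0.243056",
    "4 0.184028 0.354167 0.256944 0.180556",
    "4 0.184028 0.354167 0.256944 0.180556",
    "4 0.796875 1.010400 0.190972 0.142361",
]
for issue in find_label_issues(sample):
    print(issue)
```

Running it over every label*.txt file first gives a rough idea of how many labels the correction script is going to touch.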


So I wrote a Python script, label_corrector.py, that fixes the label*.txt files of a custom dataset: it clamps every coordinate to the [0, 1] range and removes duplicate lines.

import os
import argparse
from tqdm import tqdm

def correct_bbox(bbox):
    """Clamp each normalized bbox value (x_center, y_center, width, height) to [0, 1]."""
    x_center, y_center, width, height = bbox
    x_center_corr = max(min(x_center, 1), 0)
    y_center_corr = max(min(y_center, 1), 0)
    width_corr = max(min(width, 1), 0)
    height_corr = max(min(height, 1), 0)

    return x_center_corr, y_center_corr, width_corr, height_corr

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--path', required=True, help='path to folder containing label files')
    args = parser.parse_args()

    root_path = os.path.abspath(args.path)
    label_files = os.listdir(root_path)

    for filename in tqdm(label_files, desc='label_correction'):
        if not filename.endswith('.txt'):
            continue

        filepath = os.path.join(root_path, filename)

        with open(filepath, 'r') as f:
            lines = f.readlines()

        # Use a dict as an ordered set: it drops duplicates but keeps the original label order
        unique_lines = {}

        # Add each corrected line, ignoring empty lines
        for line in lines:
            line = line.strip()
            if line:
                class_index, x_center, y_center, width, height = map(float, line.split())
                bbox = [x_center, y_center, width, height]
                x_center_corr, y_center_corr, width_corr, height_corr = correct_bbox(bbox)
                corrected = f"{int(class_index)} {x_center_corr:.6f} {y_center_corr:.6f} {width_corr:.6f} {height_corr:.6f}"
                if bbox != [x_center_corr, y_center_corr, width_corr, height_corr]:
                    print(f"\nFile: {filename}, Original label: {line}")
                    print(f"File: {filename}, Corrected label: {corrected}")
                unique_lines[corrected] = None

        with open(filepath, 'w') as f:
            # Write the unique lines back to the file
            for line in unique_lines:
                f.write(line + '\n')

if __name__ == '__main__':
    main()
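
Note that correct_bbox clamps each of the four values independently, so a box whose center is pulled back to 0 keeps its full width and height and can still extend past the image edge. If you would rather keep only the part of the box that lies inside the image, one alternative is to convert the box to corner coordinates, clip the corners, and convert back. This is a sketch of that idea, not what the script above does; `clip_bbox_to_image` is a name I made up for it.

```python
def clip_bbox_to_image(bbox):
    """Clip a normalized (x_center, y_center, width, height) box to the unit square."""
    x_center, y_center, width, height = bbox
    # Convert center/size to corner coordinates.
    x_min = x_center - width / 2
    x_max = x_center + width / 2
    y_min = y_center - height / 2
    y_max = y_center + height / 2
    # Clamp each corner into [0, 1].
    x_min, x_max = (min(max(v, 0.0), 1.0) for v in (x_min, x_max))
    y_min, y_max = (min(max(v, 0.0), 1.0) for v in (y_min, y_max))
    # Convert the clipped corners back to center/size.
    return (
        (x_min + x_max) / 2,
        (y_min + y_max) / 2,
        x_max - x_min,
        y_max - y_min,
    )


# Hypothetical box centered on the top edge: half of it lies above the image.
box = clip_bbox_to_image((0.5, 0.0, 0.2, 0.4))
print(tuple(round(v, 6) for v in box))
```

A box that lies entirely outside the image ends up with zero width or height, so you would probably want to drop such labels instead of writing them back.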

You can run that script like this:

python label_corrector.py --path /path/to/yolo/labels/

The labels folder looks like this:

├──labels
   ├──label1.txt
   ├──label2.txt
   |  ...

Each label*.txt file looks like this:

4 0.184028 0.354167 0.256944 0.180556
4 0.121528 0.930556 0.354167 0.125000
4 0.796875 0.123264 0.190972 0.142361
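
Each line holds the same five fields the script parses: class index, x_center, y_center, width, and height, with the four coordinates expressed as fractions of the image size. As a rough illustration, here is how one such line maps to a pixel box. The helper name `label_to_pixel_box` and the 640x480 image size are assumptions for the example only.

```python
def label_to_pixel_box(line, img_width, img_height):
    """Convert one normalized label line to (class_index, left, top, right, bottom) in pixels."""
    class_index, x_center, y_center, width, height = line.split()
    # Scale the normalized values by the image size.
    x_center = float(x_center) * img_width
    y_center = float(y_center) * img_height
    width = float(width) * img_width
    height = float(height) * img_height
    # Move from center/size to the top-left and bottom-right corners.
    left = x_center - width / 2
    top = y_center - height / 2
    return int(class_index), left, top, left + width, top + height


# Hypothetical label on an assumed 640x480 image.
print(label_to_pixel_box("4 0.5 0.5 0.25 0.5", 640, 480))
```

Any value below 0 or above 1 in the file therefore points outside the image, which is the kind of value the warnings above report.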


The terminal output will look like this:

...
File: bar205.txt, Original label: 4 0.46527777777777773 -0.003472222222222222 0.09027777777777778 0.24305555555555555
File: bar205.txt, Corrected label: 4 0.465278 0.000000 0.090278 0.243056
label_correction:   2%|██▎                                                                                                  | 340/19035 [00:06<05:37, 55.40it/s]
File: dark_bg_153.txt, Original label: 4 0.2795138888888889 -0.03819444444444444 0.17013888888888887 0.08333333333333333
File: dark_bg_153.txt, Corrected label: 4 0.279514 0.000000 0.170139 0.083333
label_correction:   4%|█████▍                                                                                               | 811/19035 [00:14<05:10, 58.61it/s]
File: deli60.txt, Original label: 4 0.8628472222222222 -0.001736111111111111 0.15625 0.19791666666666666
File: deli60.txt, Corrected label: 4 0.862847 0.000000 0.156250 0.197917
...