@vdalv
Created June 23, 2018 21:52
Export Class Images from PascalVOC Annotations

This script reads PascalVOC XML files and crops each annotated class instance into a separate image file.

Note: This script relies on ImageMagick for the crop (convert) functionality. It comes pre-installed on most recent versions of Ubuntu.
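
Because the script shells out to `convert`, it can help to confirm that the executable is on the PATH before processing any folders. A minimal sketch, assuming Python 3.3+ for `shutil.which`; the helper name `imagemagick_available` is hypothetical and not part of the original script:

```python
import shutil


def imagemagick_available(binary='convert'):
    """Return True if the given executable name is found on the PATH."""
    return shutil.which(binary) is not None


if __name__ == '__main__':
    if not imagemagick_available():
        # Assumption: an apt-based system such as Ubuntu.
        print('ImageMagick "convert" was not found; try: sudo apt install imagemagick')
```

Calling this at the top of `main()` would let the script exit early with a clear message instead of failing on the first crop.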

Disclaimer: This code is a modified version of Dat Tran's xml_to_csv.py

Example:

Let's say we have this image in our train folder:

And we annotate it like so:

Upon running the script, our train folder now looks like this:

And the class folders (head, leg) contain the following:
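
To make the mapping from an annotation to a crop concrete, here is a small sketch that computes the ImageMagick geometry string for each object in an annotation. The XML content, the file name `dog.jpg`, and the function name `crop_geometries` are made up for illustration; the layout is the one LabelImg typically writes, which is an assumption about how the files were annotated.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation for illustration only.
SAMPLE_XML = """
<annotation>
    <folder>train</folder>
    <filename>dog.jpg</filename>
    <object>
        <name>head</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>40</xmin>
            <ymin>30</ymin>
            <xmax>140</xmax>
            <ymax>110</ymax>
        </bndbox>
    </object>
</annotation>
"""


def crop_geometries(xml_text):
    """Return (class_name, geometry) pairs, one per annotated object."""
    root = ET.fromstring(xml_text)
    results = []
    for obj in root.findall('object'):
        box = obj.find('bndbox')
        xmin, ymin, xmax, ymax = (
            int(box.find(tag).text) for tag in ('xmin', 'ymin', 'xmax', 'ymax')
        )
        # ImageMagick geometry: {width}x{height}+{x offset}+{y offset}
        geometry = '{}x{}+{}+{}'.format(xmax - xmin, ymax - ymin, xmin, ymin)
        results.append((obj.find('name').text, geometry))
    return results


print(crop_geometries(SAMPLE_XML))  # [('head', '100x80+40+30')]
```

With the naming scheme used by the script, this first `head` crop would be written to `train/head/head-1-dog.jpg`.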

crop_classes.py:

import os
import glob
import subprocess
import xml.etree.ElementTree as ET

classes_dict = {}

def crop_classes(path):
    # Crop every object annotated in path/*.xml into path/<class_name>/.
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
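            # Look fields up by tag name; child order is not guaranteed across annotation tools.
            bndbox = member.find('bndbox')
            value = (root.find('filename').text,
                     member.find('name').text,
                     int(bndbox.find('xmin').text),
                     int(bndbox.find('ymin').text),
                     int(bndbox.find('xmax').text),
                     int(bndbox.find('ymax').text)
                     )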
            class_name = member.find('name').text
            save_folder = os.path.join(path, class_name)
            os.makedirs(save_folder, exist_ok=True)

            # Start the per-class crop counter at 1 the first time a class is seen.
            if class_name not in classes_dict:
                classes_dict[class_name] = 1

            # http://www.imagemagick.org/Usage/crop/
            # convert -crop {x2 - x1}x{y2 - y1}+{x1}+{y1} {folder_name}/{image_name} {folder_name}/{class_name}/{class_name}-{count}-{image_name}
            geometry = '{}x{}+{}+{}'.format(value[4] - value[2], value[5] - value[3], value[2], value[3])
            out_name = '{}-{}-{}'.format(value[1], classes_dict[class_name], value[0])
            call = ['convert', '-crop', geometry, os.path.join(path, value[0]), os.path.join(save_folder, out_name)]
            subprocess.call(call)

            classes_dict[class_name] += 1

def main():
    for folder_name in ['train', 'val']:
        image_path = os.path.join(os.getcwd(), folder_name)
        crop_classes(image_path)
        print('Successfully cropped classes in the ' + folder_name + ' folder.')

if __name__ == '__main__':
    main()
@vdalv

vdalv commented Jun 26, 2018

Yeah, no problem, man.
