@davidefiocco
Created February 16, 2019 23:30
Reorganize image files into an ImageNet-style folder hierarchy
import os
import shutil

from sklearn.model_selection import train_test_split

# Assumption: data_folder is not defined in the original snippet; point it at the
# directory that contains one subfolder per category.
data_folder = "data"

cats = ['negative', 'positives']

for cat in cats:
    print(cat)

    # Create train/<cat> and test/<cat> if they do not exist yet
    for subset in ("train", "test"):
        os.makedirs(os.path.join(data_folder, subset, cat), exist_ok=True)

    images = os.listdir(os.path.join(data_folder, cat))

    # Assume that the images have a name like 23431234_Figure2.jpg: split on the
    # ID prefix so that all images sharing an ID end up in the same subset
    train_ids, test_ids = train_test_split(list(set(name.split("_")[0] for name in images)))
    train_ids = set(train_ids)

    for image in images:
        # Every ID is in exactly one of the two lists, so a single test suffices
        subset = "train" if image.split("_")[0] in train_ids else "test"
        shutil.copy(os.path.join(data_folder, cat, image),
                    os.path.join(data_folder, subset, cat, image))
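
The split above is done on the ID prefix rather than on individual files, so that figures sharing an ID never end up on both sides. A minimal sketch of that idea on its own, without touching the filesystem, might look like the following. It swaps `train_test_split` for a seeded shuffle from the standard library so the result is reproducible; the helper name `split_by_prefix`, the `seed` parameter, and the sample file names are hypothetical, and the 0.25 test fraction mirrors the default of `train_test_split`.

```python
import random
from collections import defaultdict


def split_by_prefix(filenames, test_fraction=0.25, seed=0):
    """Split filenames into (train, test) so files sharing an ID prefix stay together."""
    # Group file names by the part before the first underscore
    groups = defaultdict(list)
    for name in filenames:
        groups[name.split("_")[0]].append(name)

    # Sort before shuffling so the outcome depends only on the seed
    ids = sorted(groups)
    random.Random(seed).shuffle(ids)

    # Reserve a fraction of the IDs (not of the files) for the test set
    n_test = round(len(ids) * test_fraction)
    test_ids = set(ids[:n_test])

    train = [n for n in filenames if n.split("_")[0] not in test_ids]
    test = [n for n in filenames if n.split("_")[0] in test_ids]
    return train, test


# Hypothetical file names following the 23431234_Figure2.jpg pattern
names = ["101_Figure1.jpg", "101_Figure2.jpg", "202_Figure1.jpg",
         "303_Figure1.jpg", "303_Figure2.jpg", "404_Figure1.jpg"]
train_files, test_files = split_by_prefix(names)
print(train_files)
print(test_files)
```

Because the fraction applies to IDs, the number of files in each subset can drift from the nominal ratio when some IDs have many more figures than others.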