@ivder
Last active April 11, 2019 05:28
Program to create a CSV file of Google AutoML training data
import os
import pandas as pd

# Each subfolder of the current directory is treated as one label.
data_folders = next(os.walk('.'))[1]
filenames = [os.listdir(f) for f in data_folders]
files_dict = dict(zip(data_folders, filenames))

# GCS prefix prepended to every "<label>/<filename>" path.
base_gcs_path = 'gs://sunlit-cove-237107-vcm/Damage/'

data_array = []
for (dict_key, files_list) in files_dict.items():
    for filename in files_list:
        if not filename.lower().endswith('.jpg'):
            continue  # don't include non-photos
        label = dict_key
        data_array.append((base_gcs_path + dict_key + '/' + filename, label))
# print(data_array)

# Write "gcs_uri,label" rows with no index column and no header row.
dataframe = pd.DataFrame(data_array)
dataframe.to_csv('damage_data.csv', index=False, header=False)
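The same folder-per-label idea can also be written as a reusable function. This is only a sketch under assumptions: the helper names build_automl_rows and write_automl_csv, the IMAGE_EXTENSIONS set, and the example bucket are hypothetical and not part of the original script. It uses only the standard library (csv and pathlib) instead of pandas.

```python
import csv
from pathlib import Path

# Assumption: extensions to treat as photos; the original script checks only '.jpg'.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def build_automl_rows(root, base_gcs_path):
    """Return (gcs_uri, label) tuples, one per image found in root/<label>/."""
    rows = []
    for label_dir in sorted(Path(root).iterdir()):
        if not label_dir.is_dir():
            continue  # files sitting directly in root have no label folder
        for image in sorted(label_dir.iterdir()):
            if image.is_file() and image.suffix.lower() in IMAGE_EXTENSIONS:
                uri = f"{base_gcs_path}{label_dir.name}/{image.name}"
                rows.append((uri, label_dir.name))
    return rows


def write_automl_csv(rows, csv_path):
    """Write the rows as 'gcs_uri,label' lines with no header."""
    with open(csv_path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)


if __name__ == "__main__":
    # Hypothetical bucket path; replace it with your own.
    rows = build_automl_rows(".", "gs://example-bucket/Damage/")
    write_automl_csv(rows, "damage_data.csv")
```

Sorting the folders and files keeps the CSV row order stable between runs, which makes diffs of the generated file easier to read.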