@ZackAkil
Created July 12, 2021 11:43
Script to generate the labels CSV file for AutoML Image Classification (Vertex AI) based on the folder structure within Google Cloud Storage
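from google.cloud import storage

folder_location = 'YOUR GCS FOLDER LOCATION'  # e.g. automl-ui-dataset/x-ray-dataset

# Split "bucket/path/to/folder" into the bucket name and the object prefix.
parts = folder_location.strip('/').split('/')
bucket_name = parts[0]
prefix = '/'.join(parts[1:])

storage_client = storage.Client()
blobs = storage_client.list_blobs(bucket_name, prefix=prefix)

with open('labels.csv', 'w') as f:
    for blob in blobs:
        # Path of the object relative to the dataset folder, e.g. "label/image.png".
        relative = blob.name[len(prefix):].lstrip('/')
        # Skip folder placeholder objects and files that are not inside a label folder.
        if blob.name.endswith('/') or '/' not in relative:
            continue
        path = 'gs://' + bucket_name + '/' + blob.name
        # The label is the name of the first folder below the prefix.
        label = relative.split('/')[0]
        print(path, label)
        f.write(path + ',' + label + '\n')

print(parts, prefix)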
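
The label-derivation step can be tried without a GCS connection. The following is a minimal sketch, not part of the original gist: `label_rows` is a hypothetical helper, and the object names are made-up examples that assume a layout of `<prefix>/<label>/<image>`. In this sketch, folder placeholder objects (names ending in `/`) and files that sit directly under the prefix are skipped.

```python
def label_rows(bucket_name, prefix, blob_names):
    """Return (gs_uri, label) pairs for objects stored one folder below prefix.

    Hypothetical helper for illustration; it takes plain object names
    instead of calling the Cloud Storage API.
    """
    prefix = prefix.strip('/')
    rows = []
    for name in blob_names:
        # Path relative to the dataset folder, e.g. "normal/img1.png".
        relative = name[len(prefix):].lstrip('/')
        # Skip folder placeholders and files that are not inside a label folder.
        if name.endswith('/') or '/' not in relative:
            continue
        rows.append(('gs://' + bucket_name + '/' + name, relative.split('/')[0]))
    return rows


# Made-up object names, assumed layout: <prefix>/<label>/<image>
names = [
    'x-ray-dataset/',
    'x-ray-dataset/normal/img1.png',
    'x-ray-dataset/pneumonia/img2.png',
    'x-ray-dataset/readme.txt',
]

rows = label_rows('automl-ui-dataset', 'x-ray-dataset', names)
for path, label in rows:
    print(path + ',' + label)
```

Each printed line has the same `path,label` shape that the script writes to labels.csv.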