@antonyharfield
Created November 23, 2019 14:25
Google Cloud Storage: list the directories (folders) within a given bucket and path, not including the objects themselves
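Cloud Storage has a flat namespace, so "directories" here are really shared name prefixes that the listing API groups when it is given a `prefix` and a `delimiter`. The sketch below is a purely local simulation of that grouping, with no call to GCS; the helper name `simulate_prefixes` and the object names are hypothetical and exist only for illustration.

```python
def simulate_prefixes(object_names, prefix, delimiter="/"):
    """Return the sub-prefixes a prefix/delimiter listing would group objects under."""
    found = set()
    for name in object_names:
        if not name.startswith(prefix):
            continue  # object is outside the requested path
        rest = name[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # A delimiter follows the prefix, so the object sits in a "subdirectory".
            found.add(prefix + head + delimiter)
        # Otherwise the object lives directly under the prefix and is not a directory.
    return sorted(found)


# Hypothetical object names, for illustration only.
names = [
    "datasets/v2/train_set/dog/1.wav",
    "datasets/v2/train_set/dog/2.wav",
    "datasets/v2/train_set/cat/1.wav",
    "datasets/v2/train_set/README.txt",
    "datasets/v2/test_set/dog/9.wav",
]
print(simulate_prefixes(names, "datasets/v2/train_set/"))
# ['datasets/v2/train_set/cat/', 'datasets/v2/train_set/dog/']
```

Note that `README.txt` is not reported: it is an object directly under the path, not a directory.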
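from google.api_core import page_iterator
from google.cloud import storage


def _item_to_value(iterator, item):
    # Each item under the 'prefixes' key is already a plain string.
    return item


def list_directories(bucket_name, path):
    """Return the directory prefixes directly under `path` in `bucket_name`."""
    # The prefix must end with the delimiter, otherwise `path` itself is
    # returned as the only prefix. An empty path lists the bucket root.
    if path and not path.endswith('/'):
        path += '/'

    extra_params = {
        "projection": "noAcl",
        "prefix": path,
        "delimiter": '/'
    }

    gcs = storage.Client()

    # JSON API endpoint for listing the objects of a bucket.
    api_path = "/b/" + bucket_name + "/o"

    # Note: this relies on the private `_connection` attribute of the client.
    iterator = page_iterator.HTTPIterator(
        client=gcs,
        api_request=gcs._connection.api_request,
        path=api_path,
        items_key='prefixes',  # read directory prefixes instead of 'items'
        item_to_value=_item_to_value,
        extra_params=extra_params,
    )

    return list(iterator)


bucket_name = 'rfcx-models-dev'
path = 'dog-bark/datasets/v2/train_set/'
print(list_directories(bucket_name, path))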
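The snippet above reaches into the client's private `_connection` attribute. A possible alternative, sketched here and not part of the original gist, uses the public `Client.list_blobs` method with `prefix` and `delimiter` and then reads the iterator's `prefixes` attribute, which is filled in as pages are consumed. The function name `list_directories_public` is hypothetical, and `client` is assumed to be a `google.cloud.storage.Client` created by the caller.

```python
def list_directories_public(client, bucket_name, path):
    """Return the directory prefixes directly under `path`, sorted.

    Assumption: `client` is a google.cloud.storage.Client (or any object
    exposing a compatible `list_blobs` method).
    """
    if path and not path.endswith("/"):
        path += "/"

    blobs = client.list_blobs(bucket_name, prefix=path, delimiter="/")

    # `prefixes` is only populated while pages are fetched, so the iterator
    # has to be consumed before reading it. The blobs themselves are ignored.
    for _ in blobs:
        pass

    return sorted(blobs.prefixes)
```

With the `google-cloud-storage` package installed and credentials configured, it could be called as `list_directories_public(storage.Client(), 'rfcx-models-dev', 'dog-bark/datasets/v2/train_set/')`. Unlike the original, this version walks over the objects directly under the path as well, which may be slower when that level holds many objects.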