@simonw
Created April 3, 2019 23:49
Fetch metadata from Google Drive API for a list of doc_ids (because their batch API is extremely difficult to figure out)
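import json

import requests


def fetch_metadata_for_doc_ids(doc_ids, oauth_token):
    """Return {doc_id: metadata} using a single Drive v3 batch request."""
    boundary = 'batch_boundary'
    headers = {
        'Authorization': 'Bearer {}'.format(oauth_token),
        'Content-Type': 'multipart/mixed; boundary=%s' % boundary,
    }
    # Each part of the multipart body wraps one complete HTTP GET request
    body = ''
    for doc_id in doc_ids:
        req = 'GET https://www.googleapis.com/drive/v3/files/{}?fields=*'.format(doc_id)
        body += '--%s\n' % boundary
        body += 'Content-Type: application/http\n\n'
        body += '%s\n\n' % req
    body += '--%s--' % boundary
    response = requests.post(
        'https://www.googleapis.com/batch/drive/v3',
        data=body.encode(encoding='utf-8'),
        headers=headers
    )
    # Fail early on an HTTP error instead of trying to parse an error page
    response.raise_for_status()
    response_boundary = response.headers["Content-Type"].split(" boundary=")[1]
    # Drop the pieces before the first boundary and after the closing one
    chunks = response.content.split(response_boundary.encode("utf8"))[1:-1]
    # Each chunk should correspond to an incoming doc_id
    # (this assumes responses come back in request order)
    metadata_by_id = {}
    for doc_id, chunk in zip(doc_ids, chunks):
        # Strip the trailing "\r\n--", then split into the part headers, the
        # inner HTTP status line and headers, and the JSON payload. maxsplit=2
        # keeps any blank line inside the payload from producing a fourth piece.
        _, http_headers, payload = chunk.rsplit(b"\r\n", 1)[0].split(b"\r\n\r\n", 2)
        metadata = json.loads(payload.decode("utf8"))
        metadata_by_id[doc_id] = metadata
    return metadata_by_id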
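
The response parsing is the hard-to-follow part, so here is a minimal offline sketch of just that step. The helper name `parse_batch_response` and the sample bytes are hypothetical: the sample is hand-written to match the shape the function above expects (outer part headers, an inner HTTP response, then a JSON payload), not captured from the real API.

```python
import json


def parse_batch_response(content_type, content):
    """Split a multipart/mixed batch reply into a list of decoded JSON payloads."""
    boundary = content_type.split("boundary=")[1].strip('"').encode("utf8")
    payloads = []
    # Pieces before the first boundary and after the closing one are discarded
    for chunk in content.split(boundary)[1:-1]:
        # Remove the trailing "\r\n--" that precedes the next boundary
        part = chunk.rsplit(b"\r\n", 1)[0]
        # part headers / inner HTTP status and headers / JSON payload
        _part_headers, _http_headers, payload = part.split(b"\r\n\r\n", 2)
        payloads.append(json.loads(payload.decode("utf8")))
    return payloads


# Hand-written sample (assumed shape, hypothetical values)
SAMPLE_CONTENT_TYPE = "multipart/mixed; boundary=batch_abc123"
SAMPLE_CONTENT = (
    b"--batch_abc123\r\n"
    b"Content-Type: application/http\r\n"
    b"\r\n"
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: application/json; charset=UTF-8\r\n"
    b"\r\n"
    b'{"id": "doc1", "name": "First"}\r\n'
    b"--batch_abc123\r\n"
    b"Content-Type: application/http\r\n"
    b"\r\n"
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: application/json; charset=UTF-8\r\n"
    b"\r\n"
    b'{"id": "doc2", "name": "Second"}\r\n'
    b"--batch_abc123--"
)

for item in parse_batch_response(SAMPLE_CONTENT_TYPE, SAMPLE_CONTENT):
    print(item["id"], item["name"])
```

Because the parser only looks at bytes, it can be exercised without an OAuth token or a network call. It still skips the inner HTTP status line, so a part that came back as an error would be returned as its JSON error body rather than raising.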