@tanaikech
Created February 28, 2023 03:07
Resumable Download of File from Google Drive using Drive API with Python

This is a sample script for achieving a resumable download of a file from Google Drive using the Drive API with Python.

There may be cases where you want to resume a download of a file from Google Drive using the Drive API with Python. For example, the download of a large file might be interrupted partway through, and you then want to continue from where it stopped instead of starting over. In this post, I would like to introduce a sample Python script for this.

To download only part of a file from Google Drive, a Range header such as Range: bytes=500-999 must be included in the request. Unfortunately, at the current stage, MediaIoBaseDownload cannot use this header. When MediaIoBaseDownload is used, all data is downloaded.
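To make the Range header more concrete, here is a minimal sketch of how an open-ended range request could be built and how the reply could be checked. The helper names build_range_header, parse_content_range and is_valid_resume are hypothetical and not part of any library. The 206 status code and the "Content-Range: bytes start-end/total" format are taken from the HTTP specification, and the sketch assumes the server follows it.

```python
import re


def build_range_header(start_byte):
    """Return a Range header asking for everything from start_byte to the end."""
    if start_byte < 0:
        raise ValueError("start_byte must be non-negative")
    return {"Range": f"bytes={start_byte}-"}


def parse_content_range(value):
    """Parse 'bytes 500-999/1234' into (start, end, total).

    total is None when the server reports it as '*'.
    """
    m = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if m is None:
        raise ValueError(f"Unexpected Content-Range: {value!r}")
    start, end = int(m.group(1)), int(m.group(2))
    total = None if m.group(3) == "*" else int(m.group(3))
    return start, end, total


def is_valid_resume(status_code, content_range, start_byte):
    """True only if the server honored the range and starts where we asked."""
    if status_code != 206 or content_range is None:
        return False
    start, _, _ = parse_content_range(content_range)
    return start == start_byte
```

With requests, the two values could be read as r.status_code and r.headers.get("Content-Range"). Checking them before appending guards against the case where a server ignores the Range header and sends the whole file again.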

So a workaround is required to achieve this goal. In this workaround, I propose the following flow.

  1. Retrieve the filename and file size of the file on Google Drive that you want to download.
  2. Check whether a local file with that filename already exists.
    • When there is no existing file, the file is downloaded as a new file.
    • When there is an existing file, the download is resumed from the end of that file.
  3. Download the file content using requests.
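Step 2 of this flow can be sketched as a small function that needs no network access. This is only an illustration under assumptions: plan_download is a hypothetical name, and the handling of a local file that is larger than the remote one (restart from zero) is a policy chosen for this sketch, not something the flow above prescribes.

```python
import os


def plan_download(file_path, remote_size):
    """Decide how to open the local file and where to start downloading.

    Returns None when the local file already has the full remote size.
    """
    start_byte = os.path.getsize(file_path) if os.path.exists(file_path) else 0
    if start_byte == remote_size:
        return None  # nothing left to download
    if start_byte > remote_size:
        # The local file is larger than the remote one, so it cannot be a
        # partial copy of it. Restarting from zero is one possible policy.
        return {"start_byte": 0, "mode": "wb"}
    # Append when some bytes already exist, otherwise create a new file.
    mode = "ab" if start_byte > 0 else "wb"
    return {"start_byte": start_byte, "mode": mode}
```

The returned start_byte would go into the Range header, and mode would be passed to open() when writing the downloaded chunks.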

When this flow is implemented as a Python script, it becomes as follows.

Sample script

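import os
import sys

import requests
from googleapiclient.discovery import build

# creds must be valid Google credentials that you have obtained beforehand.
service = build("drive", "v3", credentials=creds) # Here, please use your client.
file_id = "###" # Please set the file ID of the file you want to download.

access_token = creds.token # The access token is retrieved from the creds used for service = build("drive", "v3", credentials=creds)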

# Get the filename and file size.
obj = service.files().get(fileId=file_id, fields="name,size").execute()
filename = obj.get("name", "sampleName")
size = obj.get("size", None)
if not size:
    sys.exit("No file size.")
else:
    size = int(size)

# Check existing file.
file_path = os.path.join("./", filename) # Please set your path.
o = {}
if os.path.exists(file_path):
    o["start_byte"] = os.path.getsize(file_path)
    o["mode"] = "ab"
    o["download"] = "As resume"
else:
    o["start_byte"] = 0
    o["mode"] = "wb"
    o["download"] = "As a new file"
if o["start_byte"] == size:
    sys.exit("The download of this file has already been finished.")

# Download process
print(o["download"])
headers = {
    "Authorization": f"Bearer {access_token}",
    "Range": f'bytes={o["start_byte"]}-',
}
url = f"https://www.googleapis.com/drive/v3/files/{file_id}?alt=media"
with requests.get(url, headers=headers, stream=True) as r:
    r.raise_for_status()
    with open(file_path, o["mode"]) as f:
        for chunk in r.iter_content(chunk_size=10240):
            f.write(chunk)
  • When this script is run, the file specified by file_id is downloaded. If the download stops partway and you run the script again, the download is resumed: the remaining content is appended to the existing file. I think that this is the expected behavior.
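Because the resumed content is appended to an existing file, it can be worth confirming afterwards that the result matches the file on Google Drive. The following is a minimal sketch, assuming the expected checksum is taken from the md5Checksum field that the Drive API reports for files with binary content. The function names local_md5 and is_download_intact are hypothetical.

```python
import hashlib


def local_md5(file_path, chunk_size=1024 * 1024):
    """Compute the hex MD5 digest of a file without loading it all into memory."""
    digest = hashlib.md5()
    with open(file_path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_download_intact(file_path, expected_md5):
    """Compare the local digest with the checksum reported for the remote file."""
    return local_md5(file_path) == expected_md5.lower()
```

For example, the expected value might be retrieved with service.files().get(fileId=file_id, fields="md5Checksum").execute() and passed as expected_md5 once the download has finished.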

Note

  • This script assumes that the file to download is not a Google Docs file (Document, Spreadsheet, Slides, and so on). Please be careful about this.

  • This script assumes that your client, service = build("drive", "v3", credentials=creds), is authorized to download the file from Google Drive. Please be careful about this.
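As a possible guard for the first note, the MIME type of the file could be checked before downloading. This is a sketch under assumptions: is_google_apps_file is a hypothetical helper, and the MIME type is assumed to have been retrieved by adding mimeType to the fields parameter, for example fields="name,size,mimeType".

```python
# Google Docs files, as well as folders and shortcuts, use MIME types
# that start with this prefix.
GOOGLE_APPS_PREFIX = "application/vnd.google-apps."


def is_google_apps_file(mime_type):
    """Return True when the MIME type is a Google-native type, not binary content."""
    return bool(mime_type) and mime_type.startswith(GOOGLE_APPS_PREFIX)
```

When this returns True, the file cannot be downloaded with alt=media. Google Docs files have to be exported to another format with files.export instead, which is outside the scope of this script.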
