@t-eckert
Created January 4, 2022 04:28

Archivist: A script for downloading all of your GitHub repositories

Archivist

My buddy Mikhail wanted to download all of his GitHub repositories for safekeeping. I wrote this script for him to do that.

How to use it

Copy the archivist.py file to your computer.

Create a virtual environment and activate it.

python3 -m venv .venv
source .venv/bin/activate

Install the dependency. There is only one: httpx.

pip install httpx

Create a GitHub access token for the script to use (a personal access token with the repo scope covers private repositories).
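
If you want to confirm the token works before running the full script, you can call the authenticated user endpoint with httpx. This is a minimal sketch; TOKEN is a placeholder for your own access token.

# Minimal sketch: verify a GitHub access token against the authenticated /user endpoint.
# TOKEN is a placeholder; substitute your own token.
import httpx

TOKEN = "<GITHUB-ACCESS-TOKEN>"

response = httpx.get(
    "https://api.github.com/user",
    headers={"Authorization": f"token {TOKEN}"},
)
print(response.status_code, response.json().get("login"))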

Change directory to where you want to put your archives. Call the script from there, passing in the access token.

python3 archivist.py <GITHUB-ACCESS-TOKEN>
"""Gist to help Mikhail download all the files from a GitHub archive."""
import argparse
import httpx
parser = argparse.ArgumentParser(
description="Download all files from a GitHub archive."
)
parser.add_argument("token", type=str, help="GitHub access token")
args = parser.parse_args()
gh_token = args.token
def collect_urls(api: str) -> list[tuple[str, str]]:
# Call API and continue calling until all pages are exhausted
response = httpx.get(
api,
headers={"Authorization": f"token {gh_token}"},
)
if not response.is_success:
print(response.json())
exit(1)
archive_urls = [
(repo["name"], repo["archive_url"].replace("{archive_format}{/ref}", "tarball"))
for repo in response.json()
]
if response.links.get("next"):
archive_urls.extend(collect_urls(response.links["next"]["url"]))
return archive_urls
print("Collecting URLs...")
archive_urls = collect_urls("https://api.github.com/user/repos")
print(f"Let's download {len(archive_urls)} repos!")
for url in archive_urls:
print("Downloading", url[0])
with httpx.stream("GET", url[1], follow_redirects=True) as r:
with open(f"{url[0]}.tar.gz", "wb") as f:
for b in r.iter_bytes():
f.write(b)
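
If you later need to restore a repository from one of these archives, Python's standard tarfile module can unpack it. This is a minimal sketch; "my-repo.tar.gz" is a placeholder for any archive the script wrote.

# Minimal sketch: unpack one downloaded archive with the standard library.
# "my-repo.tar.gz" is a placeholder for an archive produced by the script.
import tarfile

with tarfile.open("my-repo.tar.gz", "r:gz") as archive:
    archive.extractall("my-repo")  # the repository contents land under this directory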