@scottnguyen
Last active August 29, 2015 14:17
scrape.py
#!/usr/bin/env python
"""Clone every repository that belongs to one or more GitHub users."""
import argparse

from github import Github
from sh import cd, git


def main(args):
    # Authenticate against the GitHub API with your own credentials.
    g = Github(args.username, args.password)
    # Collect the clone URL of every repo owned by each requested user.
    urls = []
    for username in args.usernames:
        urls.extend(repo.clone_url for repo in g.get_user(username).get_repos())
    # Switch to the target directory, then clone each repo into it.
    cd(args.directory)
    for url in urls:
        print('cloning: ' + url)
        git.clone(url)


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Scrape git repos from a specific user.')
    # -u may be repeated; each value is appended to args.usernames.
    parser.add_argument('--user', '-u', required=True, dest='usernames', action='append',
                        help='The user(s) you want to repo-scrape')
    parser.add_argument('username', help='Your github username')
    parser.add_argument('password', help='Your github password')
    parser.add_argument('--directory', '-d', dest='directory', default='.',
                        help='Directory you want to save the repos in. Default is the current directory')
    args = parser.parse_args()
    main(args)

scrape.py

Download public repos on GitHub

usage: scrape.py [-h] --user USERNAMES [--directory DIRECTORY] username password
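To see how those arguments end up on `args`, here is a minimal sketch that rebuilds the same parser and feeds it a hand-written argument list, so nothing touches the network. The `build_parser` name and the sample values (`me`, `secret`) are made up for illustration; the options themselves are copied from scrape.py.

```python
import argparse


def build_parser():
    # Hypothetical helper: same options as scrape.py, wrapped in a function
    # so the parser can be exercised without running the scraper.
    parser = argparse.ArgumentParser(description='Scrape git repos from a specific user.')
    parser.add_argument('--user', '-u', required=True, dest='usernames', action='append',
                        help='The user(s) you want to repo-scrape')
    parser.add_argument('username', help='Your github username')
    parser.add_argument('password', help='Your github password')
    parser.add_argument('--directory', '-d', dest='directory', default='.',
                        help='Directory you want to save the repos in')
    return parser


# Placeholder credentials and two -u flags, passed as a list instead of sys.argv.
args = build_parser().parse_args(['me', 'secret', '-u', 'zaxiyn', '-u', 'scottnguyen'])

print(args.usernames)   # every -u value, in the order given
print(args.directory)   # falls back to '.' because -d was not passed
```

Because `--user` uses `action='append'`, each `-u` adds one more entry to `args.usernames`, which is why the script can loop over several accounts in a single run.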

Install instructions

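Step 0. Install git, python, and pip from brew or apt-get or whatever you use. If you have them already, skip this step. Run only the line for your package manager. pip comes bundled with Homebrew's python; on apt and yum the package is typically called python-pip rather than pip.

brew install python git
sudo apt-get install git python python-pip -y
yum install git python python-pip -y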

Step 1. Create a virtual environment. If you already have virtualenv and the wrapper tools, skip this step. Add sudo to the beginning of the pip commands if your sysadmin is strict about permissions. mkvirtualenv only exists once virtualenvwrapper.sh has been sourced in your shell.

pip install virtualenv
pip install virtualenvwrapper
mkvirtualenv repo-scraper

Step 2. Install libs

printf "PyGithub==1.25.2\nsh==1.11\nwsgiref==0.1.2\n" > requirements.txt && pip install -r requirements.txt

Step 3. Add this shit to your $PATH. Consider putting the export in your .zshrc or .bashrc (with the script's absolute path instead of $(pwd)) if you wanna use this all the time because you like smacking the shit out of GitHub's bandwidth.

export PATH=$(pwd):$PATH

Step 4. Run it.

python scrape.py github_username github_password -u zaxiyn -u scottnguyen -u bellardia
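One thing to watch on a second run: `git clone` refuses to clone into a directory that already exists and is not empty, so the script will stop at the first repo you already have. A rough sketch of one way around that is to filter the URL list before cloning. `repo_dir_name` and `pending_clones` are hypothetical helpers, not part of scrape.py, and they assume each clone URL ends in `<name>.git` and is cloned into git's default directory name.

```python
import os


def repo_dir_name(clone_url):
    # Hypothetical helper: the directory `git clone` creates by default,
    # i.e. the last path segment of the URL without a trailing '.git'.
    name = clone_url.rstrip('/').rsplit('/', 1)[-1]
    if name.endswith('.git'):
        name = name[:-len('.git')]
    return name


def pending_clones(urls, directory='.'):
    # Keep only the URLs whose target directory is not there yet.
    return [url for url in urls
            if not os.path.exists(os.path.join(directory, repo_dir_name(url)))]
```

In `main`, the clone loop could then iterate over `pending_clones(urls, args.directory)` instead of `urls`. This only skips repos that are already on disk; it does not update them.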

Not tested on Windows
