A simple script to backup an organization's GitHub repositories, wikis and issues.
#!/bin/bash
# A simple script to backup an organization's GitHub repositories.
# NOTE: if you have more than 100 repositories, you'll need to step thru the list of repos
# returned by GitHub one page at a time, as described at

GHBU_BACKUP_DIR=${GHBU_BACKUP_DIR-"github-backups"} # where to place the backup files
GHBU_ORG=${GHBU_ORG-"<CHANGE-ME>"} # the GitHub organization whose repos will be backed up
# (if you're backing up a user's repos instead, this should be your GitHub username)
GHBU_UNAME=${GHBU_UNAME-"<CHANGE-ME>"} # the username of a GitHub account (to use with the GitHub API)
GHBU_PASSWD=${GHBU_PASSWD-"<CHANGE-ME>"} # the password for that account
GHBU_GITHOST=${GHBU_GITHOST-"github.com"} # the GitHub hostname (see comments)
GHBU_PRUNE_OLD=${GHBU_PRUNE_OLD-true} # when `true`, old backups will be deleted
GHBU_PRUNE_AFTER_N_DAYS=${GHBU_PRUNE_AFTER_N_DAYS-3} # the min age (in days) of backup files to delete
GHBU_SILENT=${GHBU_SILENT-false} # when `true`, only show error messages
GHBU_API=${GHBU_API-"https://api.github.com"} # base URI for the GitHub API
GHBU_GIT_CLONE_CMD="git clone --quiet --mirror git@${GHBU_GITHOST}:" # base command to use to clone GitHub repos
TSTAMP=`date "+%Y%m%d-%H%M"`

# The function `check` will run the given command and exit the script if the command fails.
function check {
  "$@"
  status=$?
  if [ $status -ne 0 ]; then
    echo "ERROR: Encountered error (${status}) while running the following:" >&2
    echo "           $@" >&2
    echo "       (at line ${BASH_LINENO[0]} of file $0.)" >&2
    echo "       Aborting." >&2
    exit $status
  fi
}

# The function `tgz` will create a gzipped tar archive of the specified file ($1) and then remove the original
function tgz {
  check tar zcf $1.tar.gz $1 && check rm -rf $1
}

$GHBU_SILENT || (echo "" && echo "=== INITIALIZING ===" && echo "")
$GHBU_SILENT || echo "Using backup directory $GHBU_BACKUP_DIR"
check mkdir -p $GHBU_BACKUP_DIR

$GHBU_SILENT || echo -n "Fetching list of repositories for ${GHBU_ORG}..."
REPOLIST=`check curl --silent -u $GHBU_UNAME:$GHBU_PASSWD ${GHBU_API}/orgs/${GHBU_ORG}/repos\?per_page=100 -q | check grep "\"name\"" | check awk -F': "' '{print $2}' | check sed -e 's/",//g'`
# NOTE: if you're backing up a *user's* repos, not an organization's, use this instead:
# REPOLIST=`check curl --silent -u $GHBU_UNAME:$GHBU_PASSWD ${GHBU_API}/user/repos -q | check grep "\"name\"" | check awk -F': "' '{print $2}' | check sed -e 's/",//g'`
$GHBU_SILENT || echo "found `echo $REPOLIST | wc -w` repositories."

$GHBU_SILENT || (echo "" && echo "=== BACKING UP ===" && echo "")
for REPO in $REPOLIST; do
  $GHBU_SILENT || echo "Backing up ${GHBU_ORG}/${REPO}"
  check ${GHBU_GIT_CLONE_CMD}${GHBU_ORG}/${REPO}.git ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}-${TSTAMP}.git && tgz ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}-${TSTAMP}.git

  $GHBU_SILENT || echo "Backing up ${GHBU_ORG}/${REPO}.wiki (if any)"
  ${GHBU_GIT_CLONE_CMD}${GHBU_ORG}/${REPO}.wiki.git ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}.wiki-${TSTAMP}.git 2>/dev/null && tgz ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}.wiki-${TSTAMP}.git

  $GHBU_SILENT || echo "Backing up ${GHBU_ORG}/${REPO} issues"
  check curl --silent -u $GHBU_UNAME:$GHBU_PASSWD ${GHBU_API}/repos/${GHBU_ORG}/${REPO}/issues -q > ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}.issues-${TSTAMP} && tgz ${GHBU_BACKUP_DIR}/${GHBU_ORG}-${REPO}.issues-${TSTAMP}
done

if $GHBU_PRUNE_OLD; then
  $GHBU_SILENT || (echo "" && echo "=== PRUNING ===" && echo "")
  $GHBU_SILENT || echo "Pruning backup files ${GHBU_PRUNE_AFTER_N_DAYS} days old or older."
  $GHBU_SILENT || echo "Found `find $GHBU_BACKUP_DIR -name '*.tar.gz' -mtime +$GHBU_PRUNE_AFTER_N_DAYS | wc -l` files to prune."
  find $GHBU_BACKUP_DIR -name '*.tar.gz' -mtime +$GHBU_PRUNE_AFTER_N_DAYS -exec rm -fv {} > /dev/null \;
fi

$GHBU_SILENT || (echo "" && echo "=== DONE ===" && echo "")
$GHBU_SILENT || (echo "GitHub backup completed." && echo "")
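Since every setting uses the `${VAR-default}` form, the script can be configured entirely from the environment. A typical invocation might look like the following sketch (the filename github-backup.sh, the org name, and the token value are all hypothetical):

```shell
# Override only what you need; everything else falls back to the defaults above.
GHBU_ORG=my-org GHBU_UNAME=backup-bot GHBU_PASSWD=my-api-token ./github-backup.sh

# Or silence it and run nightly from cron:
# 0 2 * * * GHBU_SILENT=true GHBU_ORG=my-org GHBU_UNAME=backup-bot GHBU_PASSWD=my-api-token /path/to/github-backup.sh
```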

marinho commented Sep 9, 2013

Hi, well done, this is going to be useful for me :)

Can you just tell me where the notes about GitHub's hostname are, please?

Thank you!


rodw commented Oct 10, 2013

Sorry @marinho. That's a little cut-and-paste error that referenced a private wiki.

In the general case you can just use github.com as the host name.

The note that is referenced describes a way to run this backup script under a different set of credentials than one's normal GitHub account. Here's the relevant snippet:

If you want to use more than one GitHub account (e.g., your own account as well as the read-only back-up account), add the following to ~/.ssh/config (creating that file if needed):

    Host <BACKUP>
        HostName github.com
        PreferredAuthentications publickey
        IdentityFile ~/.ssh/<THE-BACKUP-SSH-KEY>

(Where <BACKUP> is an arbitrary host name (the same as the value used in the script) and ~/.ssh/<THE-BACKUP-SSH-KEY> is an ssh key generated with ssh-keygen and uploaded to GitHub.)

You can then login via:

ssh-add ~/.ssh/<THE-BACKUP-SSH-KEY>

and run the backup script as a cron job.

Calrion commented Oct 14, 2013

Firstly, thanks for this, it's a great help!

For those who, like me, want to backup user repositories rather than organisation repositories, the following small changes are required:

  • Enter your GitHub username as the value of both GHBU_UNAME and GHBU_ORG.

  • Change line 41 to read:

    REPOLIST=`check curl --silent -u $GHBU_UNAME:$GHBU_PASSWD ${GHBU_API}/user/repos -q | check grep "\"name\"" | check awk -F': "' '{print $2}' | check sed -e 's/",//g'`

Even though you remove the GHBU_ORG reference from that line, it's used later on to compute the full repository path so it's still needed (and it needs to be your username, as above).

With those changes, this script grabbed all my repositories—public, private, and forked—and made backups of the repository, the wiki, and the issues. Great work! 😄

bjtitus commented Feb 6, 2014

I made a few changes to support the paginated repos list since we have more than 100 repositories (the maximum allowed in a single page of the API)

reggi commented Feb 18, 2014

Took me a long while to realize that I just wanted GHBU_GITHOST=${GHBU_GITHOST-"github.com"}, which should be the default!! >.<


rodw commented Mar 13, 2014

@reggi: Good call. I made that change.


rodw commented Mar 13, 2014

@Calrion Thanks, I added comments describing your changes.

One could probably parameterize the script a bit to support both without "manual" intervention.

I needed to add the -i flag for my password to be accepted; I don't know why.

Secondly, what should be done with the downloaded files? Are they the contents of a .git directory?
How does GitHub store the repo, issues, and wiki? Are they separate branches?
How can I restore working files instead of a bare repo?

At least I've found why my password wasn't working: it contains a special character that needed to be escaped with \.

Finally, I've adopted another solution, as it was convenient for me to move my issues to Bitbucket.
So if you want to do that, instead of just backing up GitHub, here are 3 links:

Hi, I am getting an error at line 43. Could you please help me with this?

ERROR: Encountered error (1) while running the following:
grep "name"
(at line 43 of file

I too am getting the same error as @railsfactory-suriya:


Using backup directory github-backups
Fetching list of repositories for BluTrumpetOrg...ERROR: Encountered error (1) while running the following:
grep "name"
(at line 43 of file ./
found 0 repositories.
Please advise...

The "issues" backup is only the list of issues, not the content. I think you need something more sophisticated to traverse all the *_url entries for each comment, event, etc.

magikid commented Sep 18, 2014

Thanks for writing this script!

I just wish that it worked with 2-factor auth.

mtolly commented Nov 16, 2014

The script breaks if you are a user who has access to another user's repository. For example if you are user A but you are a contributor to another user B's somerepo, the script will mistakenly try to download A/somerepo. This could be fixed by using the full_name instead of the name.

It only pulls open issues. Can be easily fixed. See here:

But it doesn't seem to pull all issues. I cannot figure out why. Any ideas?

@magikid The script does work with 2FA; you will need to generate an application-specific password for the script.

mandric commented Mar 3, 2015

If your org has more than 30 repos you will probably want to add a ?per_page=100 arg to get the entire list, otherwise it seems github API defaults to 30 repos per page.

@blutrumpet @railsfactory-suriya You will get the grep "name" error if your GHBU_API string is incorrect, or has a trailing slash.

Along with ?per_page=100, if you have more than 100 repos, you need to add &page=N in order to grab them all.

However, GitHub only returns up to 100 repos per call, so you need a loop that fetches successive pages if you have more than 100 repos.
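The pagination loop those forks add can be sketched as follows. Here `fetch_page` is a hypothetical stub standing in for the real `curl ... ?page=N&per_page=100` pipeline, so the loop shape can be seen (and run) without hitting the API:

```shell
#!/bin/bash
# Keep requesting successive pages until GitHub returns an empty page.

fetch_page() {
  # Stub standing in for:
  #   curl --silent -u $GHBU_UNAME:$GHBU_PASSWD "${GHBU_API}/orgs/${GHBU_ORG}/repos?page=$1&per_page=100" | grep '"name"' | ...
  case $1 in
    1) echo "repo-a repo-b" ;;
    2) echo "repo-c" ;;
    *) echo "" ;;   # an empty page means we've seen everything
  esac
}

PAGE=1
REPOLIST=""
while true; do
  PAGE_LIST=$(fetch_page $PAGE)
  [ -z "$PAGE_LIST" ] && break
  REPOLIST="$REPOLIST $PAGE_LIST"
  PAGE=$((PAGE+1))
done
echo "found$REPOLIST"
```

Swapping the stub for the real curl pipeline gives the behavior the forks above describe.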

I forked this gist and added an until loop replacing line 43-68 of this script, which you can see here. Useful if you have more than 100 repos:

I have more than 100 repositories in my organization, but the script fetches only 30.
Could you please help us resolve this issue?

Thank you

Has anyone looked at importing issues/wiki back into github after they've been exported?


thekeith commented Sep 6, 2015


You need to change the variables in the script on lines 5, 7 and 8 that are noted as <CHANGE-ME> :)

Create a personal access token (in settings) to use as a password if you have 2-step auth enabled

This worked great! Thanks!

Hello GitHub User Community, we have a large software organization and keep 85% of our source code in GitHub. We perform daily backups using the GitHub backup utility, which usually completes in 3-4 hours. Can anyone recommend a backup solution that achieves zero or close to zero data loss, for example one that performs continuous backup? Note, we do have a disaster recovery solution in place, but it's a backend (SAN) storage replication solution, so if someone deletes the contents, those changes are replicated to our target. We could investigate SAN storage snapshots as a solution. I'd like to hear what other GitHub admins are doing for local backup and recovery.

cchorn commented Dec 25, 2015

This was working for a very long time but now the script breaks at line 45 ...

(at line 45 of file
ERROR: Encountered error (1) while running the following:
grep "name"
(at line 45 of file

Any thoughts on how to fix this?

+1 on the grep name error:

ERROR: Encountered error (1) while running the following:
           grep "name"
       (at line 43 of file ./

rodw commented Feb 22, 2016

@CChron @dpflucas - I haven't encountered this issue myself, but there may be something wrong with one or more of your GHBU_UNAME, GHBU_PASSWD, GHBU_API or GHBU_ORG parameters (causing no input to reach the grep call, for example).

Manually running the equivalent of:

curl --silent -u $GHBU_UNAME:$GHBU_PASSWD ${GHBU_API}/orgs/${GHBU_ORG}/repos -q

may give you a more easily digestible error message or expose a more obvious problem.


rodw commented Feb 22, 2016

@CChron @dpflucas - More generally, if that curl command fails to generate output to STDOUT for any reason you might encounter an error in the grep part of that line (#43).

Zeretil commented Apr 19, 2016

The script works perfectly, thanks. But I'm not really understanding what it is I'm downloading: in the repo that is downloaded, there isn't much to be seen. I've attached an example of what is in it. What am I missing here?


The script uses --mirror, which implies --bare. I was able to restore a working tree by cloning the backup locally, i.e. git clone /path/to/backup-dir.git.
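To make the restore path concrete, here is a self-contained round trip of what the script stores and how to get a working tree back. All names (source, backup.git, restored) are made up; it only needs git and tar:

```shell
#!/bin/bash
set -e
WORK=$(mktemp -d)
cd "$WORK"

# A throwaway "upstream" repo standing in for GitHub.
git init -q source
git -C source -c user.email=backup@example.com -c user.name=backup \
  commit -q --allow-empty -m "initial commit"

# What the backup script stores: a bare mirror, then tarred and removed (the tgz function).
git clone -q --mirror source backup.git
tar zcf backup.git.tar.gz backup.git && rm -rf backup.git

# Restore: unpack the archive, then clone the bare mirror locally to get a working tree.
tar zxf backup.git.tar.gz
git clone -q backup.git restored
git -C restored log --oneline
```

The final clone is a normal repository with a checked-out working tree; the history lives in the mirror.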

A small change that makes it clone the repos of a user and all repos he's a contributor of. Doesn't matter who the owner is and whether they're private or not.

Hello, I ran the command that @rodw mentioned and I got a list of all the repos in the organization but when I run the script it gets the same error message others got.


Fetching list of repositories for ...ERROR: Encountered error (1) while running the following:
grep "name"
(at line 46 of file ./

Alright I figured it out. I had left the brackets in rather than removing them. After removing the brackets I had permission problems. So then I linked my ssh key to my github account as seen here:

michael-dev2rights commented Aug 17, 2016

Hi; I made some changes to this to make it pass shellcheck. @rodw, would you be able to merge these back into your gist?

muthiahr commented Dec 7, 2016


curl --silent -u <username>:<password> returns the following:

    {
      "message": "Bad credentials",
      "documentation_url": ""
    }

But the credentials I am trying with are valid.

Thank you for the great script.
One problem: sometimes curl fails inside the pipe (REPOLIST=`curl | grep | awk | sed`), but the script continues as if everything is OK, because PIPESTATUS is not checked.
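A minimal illustration of this failure mode: the exit status of a pipeline is the status of its last command, so a failing first stage (simulated here with `false` in place of curl) is invisible unless you inspect `PIPESTATUS` or set `pipefail`:

```shell
#!/bin/bash
false | cat > /dev/null
LAST=$?                  # 0: only reflects cat, the failure is hidden

false | cat > /dev/null
FIRST=${PIPESTATUS[0]}   # 1: the real status of the first stage

set -o pipefail          # now a pipeline fails if any stage fails
false | cat > /dev/null
PIPEFAIL=$?              # 1

echo "last=$LAST first=$FIRST pipefail=$PIPEFAIL"
```

Adding `set -o pipefail` near the top of the script would be the smallest fix, since `check` would then see the curl failure.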

tdiprima commented Mar 3, 2017

Backing up a user's repositories (line 48) the url should actually say users (plural, just like orgs). I know. Seems weird. But it's true.

eromerog commented May 25, 2017

Hi! I don't want to waste your time, but how does the code above work? Do I just run it from the command line after changing the variables to those of my GitHub account? If you know of a tutorial or something like that, it would be great :)

tripu commented Dec 8, 2017

We think GH might have changed the API v3 recently a bit: now, when retrieving the list of repos of an organisation, detailed info about the repo's licence is included (whereas before that wasn't there, or at least not with so much detail). As a result, results now may include two instances of the string name: one for the name of the repo, another one for the name of the licence. grep would wrongly take both. Because of that, our particular version of this script has been failing for a few days now (we believe around 29 Nov 2017).

This fork fixes that (by using jq, which is available in Debian & Ubuntu repositories, instead of grep for this).
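The double match is easy to reproduce. Given a response fragment shaped like the current API output (the exact JSON shape here is assumed), the script's grep/awk/sed pipeline emits the licence name as if it were a repo:

```shell
#!/bin/bash
# A fragment shaped like the v3 /orgs/:org/repos response after licence info was added.
RESPONSE='    "name": "my-repo",
    "license": {
      "name": "mit",
    }'

# The pipeline from the script: every "name" key matches, including the licence's.
NAMES=$(echo "$RESPONSE" | grep "\"name\"" | awk -F': "' '{print $2}' | sed -e 's/",//g')
echo "$NAMES"
```

This prints both my-repo and mit. Switching to jq (as the fork does) or matching "full_name" instead both avoid the collision.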

tmichaud-accesso commented Dec 14, 2017

For those who don't wish to use jq (though they should), it's fairly easy to use a small Python script to handle this.

The line:

REPOLIST_TMP=`check curl --silent -u $GHBU_UNAME:$GHBU_PASSWD ${GHBU_API}/orgs/${GHBU_ORG}/repos\?page=${PAGE}\&per_page=90 -q -k | grep "\"name\"" | awk -F': "' '{print $2}' | sed -e 's/",//g'`

Needs to be modified to:

REPOLIST=`check curl --silent -u $GHBU_UNAME:$GHBU_PASSWD ${GHBU_API}/orgs/${GHBU_ORG}/repos\?page=100 -q  | ./`

with the helper script being:

#! /usr/bin/env python2

#Takes a JSON file (a list of objects) and pulls out only the 'name' key/value pair of each object - printing out the value

import json
import sys

data = json.load(sys.stdin)
for x in data:
	print x['name']

marchenkov commented Dec 21, 2017

Alternatively, you can install jq and modify the line to:

REPOLIST=`check curl --silent -u $GHBU_UNAME:$GHBU_PASSWD ${GHBU_API}/orgs/${GHBU_ORG}/repos\?page=100 -q  | jq ".[] .name"|sed -e 's/\"//g'`

One thing that I noticed when attempting to back up our org's repos is that on line 46, check grep "\"name\"" was pulling the names of our licenses and was attempting to back up repos named Apache and MIT, which didn't exist.

When we checked the output we noticed that grepping for the name pulled up the line for labels in the output.

Modifying line 46 to:

REPOLIST_TMP=`check curl --silent ${GHBU_API}/orgs/${GHBU_ORG}/repos\?${GHBU_APIOPTS}page=${PAGE}\&per_page=90 -q -k | grep "\"full_name\"" | awk -F': "' '{print $2}' | sed -e 's/",//g' | sed -e 's/<org-name>\///g'`

resolved those issues for us.
