How to use git-annex and bup for managing distributed content #git #git-annex #bup #commandlinefu

How to use git-annex and bup for managing distributed content

git-annex

git-annex allows managing files with git, without checking the file contents into git. While that may seem paradoxical, it is useful when dealing with files larger than git can currently easily handle, whether due to limitations in memory, time, or disk space.

Using bup in conjunction with git-annex

bup: It backs things up

Very efficient backup system based on the git packfile format, providing fast incremental saves and global deduplication (among and within files, including virtual machine images).

Courtesy of Bup repositories in git-annex

Bup and Linux distros with Python 2 EOL (archive.org)

sudo dnf install fuse-devel libacl-devel perl-Time-HiRes

virtualenv -p python2 ~/.venv
source ~/.venv/bin/activate
pip2 install pyxattr pylibacl

./configure
make
make long-check

One-time repository setup

Run these commands from the git-annex repository directory.

git init
git annex init

Content of .gitignore:

/git-bup/bupindex*
/git-bup/objects/pack/bup.bloom
/git-bup/objects/pack/midx*midx
/git-bup/objects/tmp*.pack
/git-bup/index-cache/

git add .gitignore
git commit -m "Add .gitignore" .gitignore
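
The ignore rules above can also be written in one step with a here-document. A minimal sketch, assuming the bup repository lives in git-bup at the top of the git-annex repository as in these notes; the function name write_bup_gitignore is made up for illustration:

```shell
# Hypothetical helper: write the bup-related ignore rules to ./.gitignore.
# Overwrites any existing .gitignore in the current directory.
write_bup_gitignore() {
    cat > .gitignore <<'EOF'
/git-bup/bupindex*
/git-bup/objects/pack/bup.bloom
/git-bup/objects/pack/midx*midx
/git-bup/objects/tmp*.pack
/git-bup/index-cache/
EOF
}
```

Call write_bup_gitignore from the repository root, then add and commit the file as usual.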

Save typing:

alias mybup='bup -d $PWD/git-bup'
mybup init
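
Bash does not expand aliases in non-interactive scripts by default, so for scripted backups a shell function may be more convenient than the alias. A sketch of an equivalent function (same name, same git-bup location relative to the current directory):

```shell
# Shell-function equivalent of the mybup alias.
# $PWD is evaluated on every call, so run it from the git-annex repo root.
mybup() {
    bup -d "$PWD/git-bup" "$@"
}
```

Usage is unchanged, for example: mybup init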

Backup content

Prerequisite

If the data to back up contains VirtualBox disk images (VDI), compact them first. Inside the (Windows) guest VM:

  • clean up/delete unused data
  • defragment
  • zero out free disk space by running sdelete c: -z

Offline:

vboxmanage list hdds
vboxmanage modifymedium disk foobar.vdi --compact
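
With several images, the compact step can be looped. A rough sketch, assuming all VMs using the disks are powered off; the helper name compact_vdis and the directory passed to it are illustrative, not part of VirtualBox:

```shell
# Hypothetical helper: compact every .vdi file found below a directory.
# Assumes the VMs using these disks are powered off.
compact_vdis() {
    dir=$1
    find "$dir" -type f -name '*.vdi' -print |
    while IFS= read -r vdi; do
        # Report a failure but keep going with the remaining images.
        vboxmanage modifymedium disk "$vdi" --compact ||
            echo "compact failed: $vdi" >&2
    done
}
```

Example call (path is an assumption about where the images live): compact_vdis "$HOME/VirtualBox VMs"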

Update the bup filesystem index

Generally:

mybup index -uvx <directory-to-backup>

VirtualBox:

mybup index -uvvx --exclude-rx='/Logs/$' --exclude-rx='\.log(\.[0-9]+)?$' ~/.VirtualBox

Thunderbird:

# Path= entries marked IsRelative=1 are relative to ~/.thunderbird
export TB_PROFILE_DIR=$(grep 'Path=' "$HOME/.thunderbird/profiles.ini" | \
                                sed 's/^Path=//' | sed 's/\r//')

mybup index -uvvx \
    --exclude-rx='/cache2/$' \
    --exclude-rx='/minidumps/$' \
    --exclude-rx='/saved-telemetry-pings/$' \
    --exclude-rx='/Cache.Trash.*/$' \
    $TB_PROFILE_DIR 

Firefox:

# Path= entries marked IsRelative=1 are relative to ~/.mozilla/firefox
export FF_PROFILE_DIR=$(grep 'Path=' "$HOME/.mozilla/firefox/profiles.ini" | \
                                sed 's/^Path=//' | sed 's/\r//')

mybup index -uvvx \
    --exclude-rx='/cache2/$' \
    --exclude-rx='/minidumps/$' \
    --exclude-rx='/saved-telemetry-pings/$' \
    --exclude-rx='/datareporting/$' \
    --exclude-rx='/crashes/$' \
    $FF_PROFILE_DIR
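
In profiles.ini a Path entry is usually relative to the directory holding the file (IsRelative=1), so the extracted value may need that directory prepended before bup can index it. A sketch of a resolver for both Thunderbird and Firefox; the function name profile_dirs is hypothetical, and treating a missing IsRelative key as "relative" is an assumption:

```shell
# Hypothetical helper: print the absolute path of every profile listed in
# <base>/profiles.ini, one per line.
profile_dirs() {
    base=$1
    awk -F= -v base="$base" '
        function emit() {
            # Print the pending profile, prefixing relative paths with base.
            if (path == "") return
            if (rel == 1) print base "/" path
            else print path
            path = ""
        }
        { sub(/\r$/, "") }                       # tolerate CRLF line endings
        /^\[/              { emit(); rel = 1 }   # new section; assume relative
        $1 == "IsRelative" { rel = $2 + 0 }
        $1 == "Path"       { path = substr($0, 6) }
        END                { emit() }
    ' "$base/profiles.ini"
}
```

Possible use: FF_PROFILE_DIR=$(profile_dirs "$HOME/.mozilla/firefox")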

Create new backup set

bup save stores the contents of the given files or paths in a new backup set. The -n option names the backup set, and --strip removes the given path prefix from the stored file names.

mybup save -n <backup-set-name> --strip <directory-to-backup>

More examples:

mybup save -n virtualbox --strip ~/.VirtualBox

mybup save -n firefox-profile --strip $FF_PROFILE_DIR 

mybup save -n thunderbird-profile --strip $TB_PROFILE_DIR 
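
Since bup save only stores what the index already knows about, the index and save steps always go together. A sketch that chains them for one directory; the name backup_set is made up, and the exclude options from the index examples are left out for brevity:

```shell
# Hypothetical helper: index a directory, then save it as a named backup set.
# Uses the same repository location (./git-bup) as the mybup alias.
backup_set() {
    name=$1
    dir=$2
    # Only save if indexing succeeded.
    bup -d "$PWD/git-bup" index -ux "$dir" &&
    bup -d "$PWD/git-bup" save -n "$name" --strip "$dir"
}
```

Example: backup_set virtualbox ~/.VirtualBox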

Optionally generate recovery blocks using PAR2:

mybup fsck -g 

Commit the backup set:

git annex add git-bup/objects/pack

git add git-bup

git commit -m "Backup on $(date)"
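
The three commit steps can be wrapped so that an unchanged backup does not produce a failing git commit. A sketch under the same layout assumptions as above; commit_backup is an illustrative name:

```shell
# Hypothetical helper: annex the pack files, stage the rest of git-bup,
# and commit only if something is actually staged.
commit_backup() {
    git annex add git-bup/objects/pack &&
    git add git-bup || return 1
    # git diff --cached --quiet exits 0 when nothing is staged.
    if git diff --cached --quiet; then
        echo "Nothing new to commit"
    else
        git commit -m "Backup on $(date)"
    fi
}
```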

Distribute content to remotes

  • Mount the backup disk filesystem(s)
  • Enable special remotes

git annex sync --content
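
To avoid syncing when the backup disk is not attached, the mount can be checked first. A sketch for Linux, assuming mountpoint from util-linux is available; the helper name and the mount path in the example are placeholders:

```shell
# Hypothetical helper: run the sync only if the backup disk is mounted.
# mountpoint(1) is part of util-linux; the mount path is up to you.
sync_if_mounted() {
    mnt=$1
    if mountpoint -q "$mnt"; then
        git annex sync --content
    else
        echo "Not mounted: $mnt" >&2
        return 1
    fi
}
```

Example: sync_if_mounted /mnt/backup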

Special git-annex Google Drive remote for backup

git-annex special remote for GoogleDrive

git-annex-remote-googledrive adds direct and fast support for Google Drive to git-annex and comes with some awesome new features.

One-time setup

Run these commands from the git-annex repository directory.

⚠️ All content stored in the cloud must be encrypted

git-annex-remote-googledrive setup

gpg --gen-key

gpg --list-keys --list-options show-uid-validity

git annex initremote googledrive type=external externaltype=googledrive layout=nested \
     prefix=$(basename $PWD) chunk=50MiB encryption=hybrid keyid=EF8A8E9C mac=HMACSHA512
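
The keyid value above belongs to the key generated in this particular setup; yours will differ. One way to look it up without reading the human-oriented listing is gpg's colon-separated output, where field 5 of a pub record is the long key ID. A small sketch; the function name pub_key_ids is made up:

```shell
# Hypothetical helper: read `gpg --list-keys --with-colons` output on stdin
# and print the long key ID (field 5) of each public key record.
pub_key_ids() {
    awk -F: '$1 == "pub" { print $5 }'
}
```

Usage: gpg --list-keys --with-colons | pub_key_ids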

Sync content

git annex enableremote googledrive mute-api-lockdown-warning=true layout=nested

git annex sync --content

