@rmi1974
Last active December 15, 2023 14:15
Git-annex useful commands #git #git-annex #commandlinefu

git-annex

git-annex allows managing files with git, without checking the file contents into git. While that may seem paradoxical, it is useful when dealing with files larger than git can currently easily handle, whether due to limitations in memory, time, or disk space.

Getting content

Get all content from the backup remote:

git annex get * --from backup

Consistency checks

time git annex fsck --fast | grep -A 10 -v "ok$"

time git annex fsck | grep -A 10 -v "ok$"
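The grep -v "ok$" part of the pipelines above hides every line that ends in ok, so only problem reports remain. A minimal sketch of the same filter as a reusable function; the name fsck_problems is hypothetical, and it assumes fsck output is piped in on stdin:

```shell
# Keep only fsck output lines that do not end in "ok".
# grep exits non-zero when nothing is left, i.e. when every file passed.
fsck_problems() {
    grep -v 'ok$'
}

# Usage (requires a git-annex repository):
#   git annex fsck --fast | fsck_problems || echo "all files ok"
```

Because grep returns a non-zero status when it prints nothing, the function can double as a pass/fail check in scripts.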

Find non-distributed content

git annex find --not --in=<remote> .

Disk space usage

Check how much disk space the content that is not yet present locally (for example, content only on the backup remote) will use when fetched:

git annex info . --not --in here

Migrate to different backend

To migrate from the older SHA1E backend to the newer SHA256E backend (the default for new repos):

git annex migrate --backend SHA256E *

After migration, content stored under the old keys may be left behind. You might need to run git annex unused and git annex dropunused to clean it up.

git annex unused

git annex unused | grep -o -P "^    [0-9]+" | xargs git annex dropunused
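The grep -P pattern above relies on the exact indentation of the number column. A sketch of a slightly more tolerant extractor using awk; the function name unused_numbers is hypothetical, and it assumes the list is printed as a NUMBER column followed by a KEY column:

```shell
# Print the NUMBER column from `git annex unused` output read on stdin.
# Only lines whose first field is purely numeric are kept, so the
# header and the explanatory text are skipped.
unused_numbers() {
    awk '$1 ~ /^[0-9]+$/ { print $1 }'
}

# Usage (requires a git-annex repository; -r is a GNU xargs option
# that skips the command when there is no input):
#   git annex unused | unused_numbers | xargs -r git annex dropunused
```

If everything that is reported as unused should go, git annex dropunused also accepts all in place of the numbers.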

If there are still files in a specific backend:

$ git annex info
...
backend usage: 
	SHA1E: 29
	SHA256E: 973

Show which remotes contain files with backend=SHA1E:

$ git annex list --inbackend=SHA1E
here
|backup
||github
|||origin
||||web
|||||bittorrent
||||||
_X____ foobar1.txt
...
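To check from a script whether a migration is complete, the "backend usage" lines of git annex info can be parsed. A rough sketch, assuming the output looks like the sample above; the function name leftover_backends is hypothetical:

```shell
# Read `git annex info` output on stdin and print every backend other
# than the target (default SHA256E) that still has keys, with its count.
leftover_backends() {
    awk -v target="${1:-SHA256E}" '
        /^backend usage:/ { in_usage = 1; next }
        in_usage && NF == 2 && $1 ~ /:$/ && $2 ~ /^[0-9]+$/ {
            sub(/:$/, "", $1)              # "SHA1E:" -> "SHA1E"
            if ($1 != target) print $1, $2
            next
        }
        { in_usage = 0 }                   # any other line ends the section
    '
}

# Usage (requires a git-annex repository):
#   git annex info | leftover_backends SHA256E
```

Empty output means no keys remain in other backends.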

Purging dead repositories

git annex dead <UUID>

To prune all history relating to all dead remotes:

git annex forget --drop-dead

You need to be running a git-annex version that supports this on all computers you use the repos on, or the pruned history will get merged back in.

Recycle dead repo UUIDs

First clone the repo to new location:

git clone foo

Now set "annex.uuid" in the freshly created .git/config to the UUID of the dead repo you want to recycle. Do this before you run any git annex command. Now run:

git annex init
git annex fsck
git annex semitrust <uuid>

Sync with master.
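The steps above can be collected into one function. This is only a sketch: the names run and recycle_uuid, the DRY_RUN switch, and the use of git -C and git config to write annex.uuid are assumptions made for illustration, not part of the original notes.

```shell
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# recycle_uuid SRC DEST UUID
# Clone SRC to DEST and give the clone the UUID of a dead repository.
recycle_uuid() {
    local src="$1" dest="$2" uuid="$3"
    run git clone "$src" "$dest" || return 1
    # Set the UUID before any git annex command touches the clone.
    run git -C "$dest" config annex.uuid "$uuid" || return 1
    run git -C "$dest" annex init || return 1
    run git -C "$dest" annex fsck || return 1
    run git -C "$dest" annex semitrust "$uuid"
}

# Preview the commands without changing anything:
#   DRY_RUN=1 recycle_uuid foo foo-new <uuid>
```

Running it with DRY_RUN=1 first makes it easy to confirm that the UUID is set before git annex init runs.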

Automatically adding metadata

Git-annex's metadata works best when files have a lot of useful metadata attached to them. To make git-annex automatically set the year and month when adding files, run:

git config annex.genmetadata true

A git commit hook can be set up to extract lots of metadata from files like photos, mp3s, etc. Install the extract utility, from libextractor.

Download pre-commit-annex and install it in your git-annex repository as '.git/hooks/pre-commit-annex'. Remember to make the script executable! Run:

git config metadata.extract "artist album title camera_make camera_model orientation video_dimensions image_dimensions"

When setting this up for the first time, run the hook once over the already annexed content:

git annex find --format='${file}\n' | sort | \
    awk -vRS= -vFS='\n' '{for (i = 2; i <= NF; i++) print $i}' | \
    xargs -d '\n' bash -x .git/hooks/pre-commit-annex

Now any fields you list in metadata.extract will be extracted and stored when files are committed.

To get a list of all possible fields, run:

libextractor-extract -L | sed 's/ /_/g'
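A misspelled field name in metadata.extract is easy to miss. A small sketch of a check against the list of possible fields; the function name unknown_fields is hypothetical, and it assumes the list arrives on stdin with one field name per line:

```shell
# unknown_fields "field1 field2 ..."
# Read the available field names on stdin (spaces are turned into
# underscores, as in the sed command above) and print each requested
# field that is not in that list.
unknown_fields() {
    local available field
    available="$(sed 's/ /_/g')"
    for field in $1; do
        printf '%s\n' "$available" | grep -qxF -- "$field" \
            || printf '%s\n' "$field"
    done
}

# Usage (requires libextractor and a configured repository):
#   libextractor-extract -L | unknown_fields "$(git config metadata.extract)"
```

Empty output means every configured field appears in the list.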

By default, if git-annex already has a metadata field for a file, its value will not be overwritten with metadata taken from files. To allow overwriting, run:

git config metadata.overwrite true
