This is a demonstration of a git filter set up to automatically fetch data from an arbitrary remote server, in this case from htsget.
The necessary setup:
- make renku-htsget.py executable and make it available on the PATH
- create a project with a git repository
- add a .gitattributes file with the line
*.bam filter=htsget
- add the file NA12878_2.bam below and commit it to the repository
- set up the htsget filter in git with
git config --global filter.htsget.smudge "renku-htsget.py smudge %f"
git config --global filter.htsget.clean "renku-htsget.py clean %f"
- to replace the pointer file with the content from the server, run
git reset --hard
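To make the two filter directions in the setup above concrete, here is a minimal sketch of what a clean/smudge script could look like. It is not the actual renku-htsget.py. Several things are assumptions made for illustration: the pointer file is taken to hold a single URL, the function names and the `.git/htsget-pointers` cache directory are hypothetical, git is assumed to run the filter from the repository root, and `fetch_url` is a plain HTTP download standing in for a real htsget client. Git passes the file content on stdin, expects the result on stdout, and substitutes the file path for `%f`.

```python
#!/usr/bin/env python3
"""Hypothetical sketch of a git clean/smudge filter (not the real renku-htsget.py)."""
import hashlib
import sys
import urllib.request
from pathlib import Path

# Assumption: smudge caches each pointer here so that clean can restore it later.
CACHE_DIR = Path(".git") / "htsget-pointers"


def fetch_url(url):
    """Download the bytes behind a URL (placeholder for a real htsget client)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def cache_path(path, cache_dir):
    """Map a working-tree path to the file that caches its pointer."""
    return Path(cache_dir) / hashlib.sha256(path.encode()).hexdigest()


def smudge(pointer, path, cache_dir=CACHE_DIR, fetch=fetch_url):
    """Checkout direction: pointer bytes in, downloaded file content out."""
    url = pointer.decode().strip()  # assumed pointer format: a single URL
    content = fetch(url)
    target = cache_path(path, cache_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(pointer)  # remember the pointer for the clean step
    return content


def clean(content, path, cache_dir=CACHE_DIR):
    """Staging direction: working-tree content in, pointer bytes out."""
    target = cache_path(path, cache_dir)
    if target.exists():
        return target.read_bytes()
    # No cached pointer: the file was never smudged, so it is still the pointer.
    return content


def main(argv):
    mode, path = argv[1], argv[2]  # invoked as: <script> smudge|clean %f
    data = sys.stdin.buffer.read()
    result = smudge(data, path) if mode == "smudge" else clean(data, path)
    sys.stdout.buffer.write(result)


# Do nothing unless called the way the git filter configuration calls it.
if __name__ == "__main__" and len(sys.argv) == 3 and sys.argv[1] in {"smudge", "clean"}:
    main(sys.argv)
```

One limitation of this sketch is that `clean` always returns the cached pointer, so local edits to a downloaded file would be silently ignored when staging. A real filter would need a policy for that case.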
You should see a brief info screen with the content of the pointer file, and the data should be downloaded from the server. However, if you run git show you will still see only the pointer file contents, because that is all the repository itself stores.
You can now push the git repository to a remote and only the contents of the pointer file will be pushed. When you clone it, the content will be automatically downloaded.
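As background on the download step: an htsget server does not return the file directly. It answers with a JSON "ticket" listing the URLs of data blocks, which the client fetches in order and concatenates. The sketch below shows that assembly step only, under stated assumptions: `open_block` and `download_from_ticket` are hypothetical names, error handling is omitted, and how renku-htsget.py actually performs the download is not shown in this document.

```python
import json
import urllib.request


def open_block(url, headers):
    """Fetch one block; urllib also handles inline data: URLs."""
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return response.read()


def download_from_ticket(ticket_json, fetch_block=open_block):
    """Concatenate, in order, every block listed in an htsget ticket."""
    ticket = json.loads(ticket_json)["htsget"]
    parts = []
    for entry in ticket["urls"]:
        # Each entry may carry its own headers (for example a byte range).
        parts.append(fetch_block(entry["url"], entry.get("headers", {})))
    return b"".join(parts)
```

Passing a different `fetch_block` makes it possible to try the assembly logic without network access.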