git pre-commit hook for stripping output from IPython notebooks

nbstripout
Python
#!/usr/bin/env python
"""strip outputs from an IPython Notebook
 
Opens a notebook, strips its output, and writes the outputless version to the original file.
 
Useful mainly as a git pre-commit hook for users who don't want to track output in VCS.
 
This does mostly the same thing as the `Clear All Output` command in the notebook UI.
"""
 
import io
import sys
 
from IPython.nbformat import current
 
def strip_output(nb):
    """strip the outputs from a notebook object"""
    for cell in nb.worksheets[0].cells:
        if 'outputs' in cell:
            cell['outputs'] = []
        if 'prompt_number' in cell:
            cell['prompt_number'] = None
    return nb
 
if __name__ == '__main__':
    filename = sys.argv[1]
    with io.open(filename, 'r', encoding='utf8') as f:
        nb = current.read(f, 'json')
    nb = strip_output(nb)
    with io.open(filename, 'w', encoding='utf8') as f:
        current.write(nb, f, 'json')
pre-commit
Shell
#!/bin/sh
#
# strip output of IPython Notebooks
# add this as `.git/hooks/pre-commit`
# to run every time you commit a notebook
#
# requires `nbstripout` to be available on your PATH
#
 
if git rev-parse --verify HEAD >/dev/null 2>&1; then
    against=HEAD
else
    # Initial commit: diff against an empty tree object
    against=4b825dc642cb6eb9a060e54bf8d69288fbee4904
fi
# Find notebooks to be committed
(
IFS='
'
# No -z here: names must be newline-separated so grep and the IFS above split them
NBS=`git diff-index --cached $against --name-only | grep '\.ipynb$' | uniq`
 
for NB in $NBS ; do
    echo "Removing outputs from $NB"
    nbstripout "$NB"
    git add "$NB"
done
)
 
exec git diff-index --check --cached $against --

Or as a git filter instead:

(from @JanShulz)

Add this to your .git/config:

[filter "stripoutput"]
    clean = "/path/to/nbconvert/nbstripout"

and a .gitattributes file with

*.ipynb filter=stripoutput

For the git filter, I made the following changes to nbstripout so that it reads from stdin and writes to stdout: https://github.com/cfriedline/ipynb_template/blob/master/nbstripout

nb = current.read(sys.stdin, 'json')
nb = strip_output(nb)
current.write(nb, sys.stdout, 'json')
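
For anyone who wants to try the filter idea without IPython installed, here is a minimal sketch of the same stdin-to-stdout approach. It is a stand-in under stated assumptions, not the script linked above: it uses the standard `json` module instead of `IPython.nbformat.current`, it assumes the same notebook layout the original script relies on (`worksheets[0].cells`), and `strip_output_dict` and `run_filter` are hypothetical names.

```python
import json
import sys


def strip_output_dict(nb):
    """Clear outputs and prompt numbers in a notebook given as a plain dict.

    Assumption: same layout as the original script, nb['worksheets'][0]['cells'].
    """
    for cell in nb["worksheets"][0]["cells"]:
        if "outputs" in cell:
            cell["outputs"] = []
        if "prompt_number" in cell:
            cell["prompt_number"] = None
    return nb


def run_filter(stdin=sys.stdin, stdout=sys.stdout):
    """Read notebook JSON from stdin and write the stripped JSON to stdout."""
    nb = json.load(stdin)
    json.dump(strip_output_dict(nb), stdout, indent=1, sort_keys=True)
    stdout.write("\n")
```

To use it as the `clean` command, call `run_filter()` at the bottom of the script. Note that plain `json` output may differ in formatting from what the notebook writer produces, so the first commit after switching could show whitespace-only changes.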

To get it to work for me I had to add -a to grep and remove the $ in the git hook. Dunno why, but now it works.

NBS=`git diff-index -z --cached $against --name-only | grep -a '.ipynb' | uniq`
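
A likely explanation, offered as a guess rather than a diagnosis: `-z` makes git terminate each file name with a NUL byte instead of a newline. grep then sees a single "line" containing NULs, which it may treat as binary input (hence `-a`), and `.ipynb$` cannot match because a NUL follows the extension. The sketch below shows the NUL-splitting that `-z` output calls for. `parse_staged_notebooks` and `staged_notebooks` are hypothetical helper names, not part of the gist.

```python
import subprocess


def parse_staged_notebooks(raw):
    """Split NUL-terminated `git diff-index -z --name-only` output, keep notebooks."""
    names = [n.decode("utf-8") for n in raw.split(b"\0") if n]
    seen = []
    for name in names:
        # Keep each notebook once, in the order git reported it.
        if name.endswith(".ipynb") and name not in seen:
            seen.append(name)
    return seen


def staged_notebooks(against="HEAD"):
    """Ask git for the staged file names (must be run inside a repository)."""
    raw = subprocess.check_output(
        ["git", "diff-index", "-z", "--cached", against, "--name-only"]
    )
    return parse_staged_notebooks(raw)
```

Splitting on NUL also copes with file names that contain spaces or newlines, which is the usual reason to ask for `-z` in the first place.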
