git pre-commit hook for stripping output from IPython notebooks

nbstripout
#!/usr/bin/env python
"""strip outputs from an IPython Notebook

Opens a notebook, strips its output, and writes the outputless version to the original file.

Useful mainly as a git pre-commit hook for users who don't want to track output in VCS.

This does mostly the same thing as the `Clear All Output` command in the notebook UI.
"""

import io
import sys

from IPython.nbformat import current


def strip_output(nb):
    """strip the outputs from a notebook object"""
    # The signature no longer matches once the outputs are gone.
    nb.metadata.pop('signature', None)
    # Only the first worksheet is processed.
    for cell in nb.worksheets[0].cells:
        if 'outputs' in cell:
            cell['outputs'] = []
        if 'prompt_number' in cell:
            cell['prompt_number'] = None
    return nb


if __name__ == '__main__':
    # The notebook to strip is the first command-line argument;
    # it is overwritten in place.
    filename = sys.argv[1]
    with io.open(filename, 'r', encoding='utf8') as f:
        nb = current.read(f, 'json')
    nb = strip_output(nb)
    with io.open(filename, 'w', encoding='utf8') as f:
        current.write(nb, f, 'json')
 
pre-commit
#!/bin/sh
#
# strip output of IPython Notebooks
# add this as `.git/hooks/pre-commit`
# to run every time you commit a notebook
#
# requires `nbstripout` to be available on your PATH
#

if git rev-parse --verify HEAD >/dev/null 2>&1; then
    against=HEAD
else
    # Initial commit: diff against an empty tree object
    against=4b825dc642cb6eb9a060e54bf8d69288fbee4904
fi

# Find notebooks to be committed
(
    IFS='
'
    NBS=`git diff-index -z --cached $against --name-only | grep '.ipynb$' | uniq`

    for NB in $NBS ; do
        echo "Removing outputs from $NB"
        nbstripout "$NB"
        git add "$NB"
    done
)

exec git diff-index --check --cached $against --

Or as a git filter instead:

(from @JanShulz)

Add this to your .git/config:

[filter "stripoutput"]
    clean = "/path/to/nbconvert/nbstripout"

and a .gitattributes file with

*.ipynb filter=stripoutput
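
If you would rather script the `.gitattributes` step than edit the file by hand, here is a minimal sketch. It is not part of the gist: `ensure_attribute_line` is a hypothetical helper name, and the only thing it takes from above is the `*.ipynb filter=stripoutput` line.

```python
import os


def ensure_attribute_line(path, line="*.ipynb filter=stripoutput"):
    """Append `line` to the attributes file at `path` unless it is already there.

    Returns True if the file was changed, False if the line was already present.
    """
    content = ""
    if os.path.exists(path):
        with open(path, "r", encoding="utf8") as f:
            content = f.read()
    if line in (entry.strip() for entry in content.splitlines()):
        return False
    with open(path, "a", encoding="utf8") as f:
        # Start on a fresh line if the existing file lacks a trailing newline.
        if content and not content.endswith("\n"):
            f.write("\n")
        f.write(line + "\n")
    return True
```

Calling `ensure_attribute_line(".gitattributes")` from the repository root would add the line once and do nothing on later runs. The filter itself still has to be defined as shown above, for example with `git config filter.stripoutput.clean /path/to/nbconvert/nbstripout`.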

For the git filter, I made the following changes to nbstripout so that it reads the notebook from stdin and writes the stripped version to stdout: https://github.com/cfriedline/ipynb_template/blob/master/nbstripout

nb = current.read(sys.stdin, 'json')
nb = strip_output(nb)
current.write(nb, sys.stdout, 'json')
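
The same stdin/stdout idea can also be sketched without `IPython.nbformat`, using only the standard `json` module. Treat this as a sketch under assumptions, not as the code in the linked repository: it assumes the version 3 notebook layout used in the gist (cells stored under `worksheets`), it walks every worksheet rather than only the first, and `strip_output_json` and `run_filter` are hypothetical names.

```python
import json


def strip_output_json(nb):
    """Strip outputs and prompt numbers from a notebook held as plain dicts."""
    nb.get("metadata", {}).pop("signature", None)
    for worksheet in nb.get("worksheets", []):
        for cell in worksheet.get("cells", []):
            if "outputs" in cell:
                cell["outputs"] = []
            if "prompt_number" in cell:
                cell["prompt_number"] = None
    return nb


def run_filter(infile, outfile):
    """Read notebook JSON from `infile` and write the stripped JSON to `outfile`."""
    nb = strip_output_json(json.load(infile))
    # Stable key order and indentation keep diffs between commits small.
    json.dump(nb, outfile, indent=1, sort_keys=True)
    outfile.write("\n")
```

Used as a clean filter, the script would end with `run_filter(sys.stdin, sys.stdout)` after an `import sys`. The JSON formatting will not necessarily match what `current.write` produces byte for byte.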

To get it to work for me I had to add -a to grep and remove the $ in the git hook. Dunno why, but now it works.

NBS=`git diff-index -z --cached $against --name-only | grep -a '.ipynb' | uniq`

On line 23 of pre-commit (the for loop), one needs to replace
for NB in NBS
with
for NB in $NBS
and also make @sotte's change.

Any chance this will become part of nbconvert?

Is there any way to commit the stripped version but leave the output in your working directory?

On OS X 10.10, I couldn't get NBS=`git diff-index -z ... | grep ...` to work with null-character separators, so here's one workaround in bash:

#!/bin/bash

...

(
pat='\.ipynb$'
while IFS= read -r -d '' file; do
    if [[ "$file" =~ $pat ]]; then
        printf 'Removing outputs from %q\n' "$file";
        nbstripout "$file"
        git add "$file"
    fi
done < <(git diff-index -z --cached $against --name-only)
)
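
For comparison, the NUL-separated parsing in the loop above could be done in Python like this. It is only a sketch: `notebooks_from_diff_index` is a hypothetical name, and it assumes the raw bytes come from the same `git diff-index -z --cached $against --name-only` call used in the hook.

```python
import os


def notebooks_from_diff_index(raw):
    """Return the .ipynb paths in NUL-separated `git diff-index -z --name-only` output.

    `raw` is the command's output as bytes. Splitting on NUL keeps file names
    containing spaces or newlines in one piece.
    """
    names = [name for name in raw.split(b"\0") if name]
    return [os.fsdecode(name) for name in names if name.endswith(b".ipynb")]
```

The bytes could be obtained with `subprocess.check_output(["git", "diff-index", "-z", "--cached", against, "--name-only"])`, where `against` is chosen the same way as in the hook.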

In nbstripout I also made the following changes, though this is probably a matter of taste. Clearing output in the notebook GUI does not reset cell toggling, so toggle states may get versioned even when no output is present. Popping prompt_number matches the notebook GUI's behavior (in IPython 2.4.1).

if 'prompt_number' in cell:
    cell.pop('prompt_number')
if 'collapsed' in cell:
    cell['collapsed'] = False
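
Put together as a standalone function on a plain cell dict, those changes might look like the sketch below. `reset_cell` is a hypothetical name, and the cell layout (`outputs`, `prompt_number`, `collapsed` keys) is assumed to match the notebooks discussed above.

```python
def reset_cell(cell):
    """Clear outputs, drop the prompt number, and un-collapse one notebook cell."""
    if "outputs" in cell:
        cell["outputs"] = []
    # Remove the key entirely instead of setting it to None.
    cell.pop("prompt_number", None)
    if "collapsed" in cell:
        cell["collapsed"] = False
    return cell
```

Cells without these keys, such as markdown cells, pass through unchanged.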