git pre-commit hook for stripping output from IPython notebooks
#!/usr/bin/env python
"""strip outputs from an IPython Notebook

Opens a notebook, strips its output, and writes the outputless version to the original file.

Useful mainly as a git filter or pre-commit hook for users who don't want to track output in VCS.

This does mostly the same thing as the `Clear All Output` command in the notebook UI.

LICENSE: Public Domain
"""

import io
import sys

try:
    # Jupyter >= 4
    from nbformat import read, write, NO_CONVERT
except ImportError:
    try:
        # IPython 3
        from IPython.nbformat import read, write, NO_CONVERT
    except ImportError:
        # IPython < 3
        from IPython.nbformat import current

        # The old API has no NO_CONVERT; define a placeholder so the call below works
        NO_CONVERT = None

        def read(f, as_version):
            return current.read(f, 'json')

        def write(nb, f):
            return current.write(nb, f, 'json')


def _cells(nb):
    """Yield all cells in an nbformat-insensitive manner"""
    if nb.nbformat < 4:
        for ws in nb.worksheets:
            for cell in ws.cells:
                yield cell
    else:
        for cell in nb.cells:
            yield cell


def strip_output(nb):
    """strip the outputs from a notebook object"""
    nb.metadata.pop('signature', None)
    for cell in _cells(nb):
        if 'outputs' in cell:
            cell['outputs'] = []
        if 'prompt_number' in cell:
            cell['prompt_number'] = None
    return nb


if __name__ == '__main__':
    filename = sys.argv[1]
    with io.open(filename, 'r', encoding='utf8') as f:
        nb = read(f, as_version=NO_CONVERT)
    nb = strip_output(nb)
    with io.open(filename, 'w', encoding='utf8') as f:
        write(nb, f)

#!/bin/sh
# strip output of IPython Notebooks
# add this as `.git/hooks/pre-commit`
# to run every time you commit a notebook
# requires `nbstripout` to be available on your PATH
# LICENSE: Public Domain

if git rev-parse --verify HEAD >/dev/null 2>&1; then
    against=HEAD
else
    # Initial commit: diff against an empty tree object
    against=4b825dc642cb6eb9a060e54bf8d69288fbee4904
fi

# Find notebooks to be committed
NBS=`git diff-index -z --cached $against --name-only | grep '.ipynb$' | uniq`

for NB in $NBS ; do
    echo "Removing outputs from $NB"
    nbstripout "$NB"
    git add "$NB"
done

exec git diff-index --check --cached $against --

minrk commented Aug 20, 2013

Or as a git filter instead:

(from @JanShulz)

Add this to your .git/config:

[filter "stripoutput"]
    clean = "/path/to/nbconvert/nbstripout"

and a .gitattributes file with

*.ipynb filter=stripoutput

cfriedline commented Sep 17, 2013

For the git filter, I made the following changes to nbstripout:

nb = current.read(sys.stdin, 'json')
nb = strip_output(nb)
current.write(nb, sys.stdout, 'json')
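
A git clean filter receives the file on standard input and must write the cleaned version to standard output, which is why the change above swaps the file handles for sys.stdin and sys.stdout. A minimal sketch of that shape follows. It is only an illustration under stated assumptions: it uses the standard `json` module instead of IPython's reader, it assumes a v4-style notebook with a top-level `cells` list and `execution_count` fields, and `clean_stream` is a hypothetical name.

```python
import io
import json


def clean_stream(src, dst):
    """Read notebook JSON from src, blank cell outputs, write JSON to dst."""
    nb = json.load(src)
    # Assumption: v4 layout, where cells live in a top-level "cells" list.
    for cell in nb.get("cells", []):
        if "outputs" in cell:
            cell["outputs"] = []
        if "execution_count" in cell:
            cell["execution_count"] = None
    json.dump(nb, dst, indent=1, sort_keys=True)
    dst.write("\n")


# Demonstration with in-memory streams standing in for stdin and stdout.
demo_in = io.StringIO(
    '{"cells": [{"cell_type": "code", "outputs": [1], "execution_count": 2}]}'
)
demo_out = io.StringIO()
clean_stream(demo_in, demo_out)
```

Used as an actual filter, the call would be `clean_stream(sys.stdin, sys.stdout)` after importing `sys`.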

sotte commented Jan 31, 2014

To get it to work for me I had to add -a to grep and remove the $ in the git hook. Dunno why, but now it works.

NBS=`git diff-index -z --cached $against --name-only | grep -a '.ipynb' | uniq`

SylvainCorlay commented Apr 22, 2014

On line 23 of pre-commit, one needs to replace
for NB in NBS
with
for NB in $NBS
and also to make the change suggested by @sotte.

ketch commented Sep 29, 2014

Any chance this will become part of nbconvert?

justmytwospence commented Jan 30, 2015

Is there any way to commit the stripped version but leave output in your working directory?

aarontran commented Feb 23, 2015

On OS X 10.10, I couldn't get NBS=`git diff-index -z ... | grep ...` to work with null character separators, so here's one workaround in bash:

pat='\.ipynb$'  # assumed pattern for notebook filenames
while IFS= read -r -d '' file; do
    if [[ "$file" =~ $pat ]]; then
        printf 'Removing outputs from %q\n' "$file";
        nbstripout "$file"
        git add "$file"
    fi
done < <(git diff-index -z --cached $against --name-only)
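
The same null-separated parsing can also be sketched in Python, which avoids shell word splitting altogether. This is an illustration rather than part of the hook: `split_nul` and `staged_notebooks` are hypothetical helper names, and `against` is assumed to be the tree the hook diffs against, as in the pre-commit script.

```python
import os
import subprocess


def split_nul(data):
    """Split NUL-separated `git ... -z` output into a list of paths."""
    return [p for p in data.split("\0") if p]


def staged_notebooks(against="HEAD"):
    """Return staged paths ending in .ipynb (needs git on PATH and a repo)."""
    out = subprocess.check_output(
        ["git", "diff-index", "-z", "--cached", against, "--name-only"]
    )
    # os.fsdecode turns the raw bytes into str using the filesystem encoding.
    return [p for p in split_nul(os.fsdecode(out)) if p.endswith(".ipynb")]
```

Because each name arrives as one list element, names containing spaces or newlines need no special handling.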

In nbstripout I also made the following changes, though this probably depends on individual taste. Cell toggling isn't reset by clearing output in the notebook GUI, so toggle states may get versioned even if no output is present. Popping prompt_number matches notebook GUI behavior (in IPython 2.4.1).

if 'prompt_number' in cell:
    cell.pop('prompt_number')
if 'collapsed' in cell:
    cell['collapsed'] = False

petered commented Mar 9, 2015

The pre-commit hook approach didn't work for me (the grep somehow found .py files, but only if there was a .ipynb in the commit), but the filter seems cleaner anyway. Here's what I did to get it working:

I modified cfriedline's nbstripout file slightly to give an informative error when you can't import the latest IPython, and added it to my repo, let's say in ./relative/path/to/nbstripout

I also added a .gitattributes file to the root of the repo, containing:

*.ipynb filter=stripoutput

And created a script containing

git config filter.stripoutput.clean "$(git rev-parse --show-toplevel)/relative/path/to/nbstripout"
git config filter.stripoutput.smudge cat
git config filter.stripoutput.required true

And ran it with source. The fancy $(git rev-parse...) thing is to find the local path of your repo on any (Unix) machine.
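
The three `git config` calls above could also be driven from Python, for instance as part of a setup script. The sketch below only mirrors those commands. The names `filter_commands` and `install` are hypothetical, and it assumes git is on PATH and that it is run from inside the repository.

```python
import os
import subprocess


def filter_commands(script_path, name="stripoutput"):
    """Build the three `git config` invocations as argument lists."""
    key = "filter.%s." % name
    return [
        ["git", "config", key + "clean", script_path],
        ["git", "config", key + "smudge", "cat"],
        ["git", "config", key + "required", "true"],
    ]


def install(relative_script):
    """Point the filter at a script stored inside the current repository."""
    # Same idea as $(git rev-parse --show-toplevel) in the shell version.
    top = subprocess.check_output(["git", "rev-parse", "--show-toplevel"])
    script = os.path.join(os.fsdecode(top).strip(), relative_script)
    for argv in filter_commands(script):
        subprocess.check_call(argv)
```

Keeping the command list separate from the code that runs it makes it easy to print the commands first and check them before touching the repository configuration.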

waylonflinn commented Apr 13, 2015

Slightly modified method that works with the new notebook format (v4) used in IPython 3.

The essential difference is an added check for the presence of the worksheets object on the root.
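
A check like the one described might look roughly as follows on a notebook loaded as a plain dict. This is not the modified method itself, only a sketch of the idea: `iter_cells` is a hypothetical name, and the key names assume the older layout nests cells under `worksheets` while v4 keeps them in a top-level `cells` list.

```python
def iter_cells(nb):
    """Yield every cell of a notebook dict, whichever layout it uses."""
    if "worksheets" in nb:
        # Older layout: cells are grouped under worksheets.
        for ws in nb["worksheets"]:
            for cell in ws.get("cells", []):
                yield cell
    else:
        # v4 layout: cells sit directly on the notebook root.
        for cell in nb.get("cells", []):
            yield cell


old_style = {"worksheets": [{"cells": [{"input": "1"}, {"input": "2"}]}]}
new_style = {"cells": [{"source": "1"}]}
old_count = len(list(iter_cells(old_style)))
new_count = len(list(iter_cells(new_style)))
```

Testing for the key rather than the version number means the function does not need to know which nbformat release introduced the change.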

dietmarw commented Jul 13, 2015

I've created a version that removes the whole cell. Although I have to admit the way I track the index is not at all optimal and there might be better ways making proper use of the API. Feedback welcome:

kynan commented Sep 12, 2015

I have added documentation, an nbstripout install command to install the filter in the current Git repository and turned it into a module with a setuptools script entry point:

How do you feel about publishing that on PyPI @minrk?

jond3k commented Sep 14, 2015

I've adapted cfriedline's repo to make it easy to install to any repo as a filter

kynan commented Sep 26, 2015

@jond3k Have a look at my repo linked above: it works with v3 and v4 and has an install command to automate the installation in any git repo.

minrk commented Dec 21, 2015

@kynan feel free to put it on PyPI. No need to wait for me.

kynan commented Jan 21, 2016

@minrk OK, will do, thanks!

kynan commented Jan 21, 2016

@minrk Turns out @mforbes beat me to it. We need to decide on a license. Are you happy with MIT?

klieret commented Oct 31, 2018

Great snippet, thanks a lot for sharing!

Two suggestions:

  1. Small fix: I guess it should be grep '\.ipynb$' with the . escaped, else the . will match any character
  2. Also add | tr -d '\000' | before grep: NBS=`git diff-index -z --cached $against --name-only | tr -d '\000' | grep '\.ipynb$' | uniq`

The second point is because there will be cases where grep considers the input binary. This happens to me when using zsh (i.e. getting Binary file (standard input) matches from grep instead of the matching parts).
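
To see what the escaping changes, here is a small illustration using Python's `re` module, which treats an unescaped `.` the same way grep does (any single character). The file names are made up for the example. It also shows a caveat about the second suggestion that may be worth checking in your own setup: deleting the NUL bytes removes the only separators that `-z` emits, so several staged names would run together.

```python
import re

names = ["analysis.ipynb", "notes_ipynb", "script.py"]

# Unescaped: "." stands for any single character, so "_ipynb" also matches.
loose = [n for n in names if re.search(r".ipynb$", n)]

# Escaped: only a literal dot before "ipynb" matches.
strict = [n for n in names if re.search(r"\.ipynb$", n)]

# Deleting NUL bytes drops the separators between names, so two staged
# notebooks end up as one string; splitting on NUL keeps them apart.
raw = "a.ipynb\0b.ipynb\0"
joined = raw.replace("\0", "")
separate = [p for p in raw.split("\0") if p]
```

If that joining is a problem, `tr '\000' '\n'` in place of `tr -d '\000'` would give grep one name per line.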
