@bubbobne
Forked from pbugnion/ipython_notebook_in_git.md
Last active March 26, 2019 13:10
Keeping IPython notebooks under Git version control
#!/usr/bin/env python
"""
Suppress output and prompt numbers in git version control.

This script will tell git to ignore prompt numbers and cell output
when looking at ipynb files if their metadata contains:

    "git" : { "suppress_outputs" : true }

The notebooks themselves are not changed.

See also this blogpost: http://pascalbugnion.net/blog/ipython-notebooks-and-git.html.

Usage instructions
==================

1. Put this script in a directory that is on the system's path.
   For future reference, I will assume you saved it in
   `~/scripts/ipynb_drop_output`.
2. Make sure it is executable by typing the command
   `chmod +x ~/scripts/ipynb_drop_output`.
3. Register a filter for ipython notebooks by
   putting the following line in `~/.config/git/attributes`:
   `*.ipynb filter=clean_ipynb`
4. Connect this script to the filter by running the following
   git commands:

       git config --global filter.clean_ipynb.clean ipynb_drop_output
       git config --global filter.clean_ipynb.smudge cat

To tell git to ignore the output and prompts for a notebook,
open the notebook's metadata (Edit > Edit Notebook Metadata). A
panel should open containing the lines:

    {
        "name" : "",
        "signature" : "some very long hash"
    }

Add an extra line so that the metadata now looks like:

    {
        "name" : "",
        "signature" : "don't change the hash, but add a comma at the end of the line",
        "git" : { "suppress_outputs" : true }
    }

You may need to "touch" the notebooks for git to actually register a change, if
your notebooks are already under version control.

Notes
=====

This script is inspired by http://stackoverflow.com/a/20844506/827862, but
lets the user specify whether the output of a notebook should be suppressed
in the notebook's metadata, and works for IPython v3.0.
"""
import sys
import json
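
nb = sys.stdin.read()
json_in = json.loads(nb)
nb_metadata = json_in["metadata"]

# Pass the notebook through unchanged unless its metadata opts in with
# "git" : { "suppress_outputs" : true }, as described in the docstring.
if not nb_metadata.get("git", {}).get("suppress_outputs", False):
    sys.stdout.write(nb)
    sys.exit()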


def strip_output_from_cell(cell):
    """Clear a cell's outputs and remove its prompt number, in place."""
    if "outputs" in cell:
        cell["outputs"] = []
    if "execution_count" in cell:
        del cell["execution_count"]


for cell in json_in["cells"]:
    strip_output_from_cell(cell)

# Write the cleaned notebook to stdout; this is the version git stores.
json.dump(json_in, sys.stdout, sort_keys=True, indent=1, separators=(",", ": "), ensure_ascii=True)

This gist lets you keep IPython notebooks in git repositories. It tells git to ignore prompt numbers and cell outputs when checking whether a file has changed.
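
To make the effect concrete, here is a minimal sketch, separate from the script itself, that applies the same stripping logic to a tiny hand-written notebook. The helper name `strip_notebook` and the sample cell contents are illustrative assumptions; only the handling of `outputs` and `execution_count` mirrors the script.

```python
import copy
import json


def strip_notebook(notebook):
    """Return a copy of a notebook dict with outputs and prompt numbers removed.

    Hypothetical helper for illustration only. It mirrors the script's
    strip_output_from_cell but leaves its argument untouched.
    """
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned["cells"]:
        if "outputs" in cell:
            cell["outputs"] = []
        # Drop the prompt number if the cell has one.
        cell.pop("execution_count", None)
    return cleaned


# A made-up one-cell notebook that has opted in through its metadata.
sample = {
    "metadata": {"git": {"suppress_outputs": True}},
    "nbformat": 4,
    "nbformat_minor": 0,
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 7,
            "metadata": {},
            "source": ["1 + 1"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 7,
                    "data": {"text/plain": ["2"]},
                    "metadata": {},
                }
            ],
        }
    ],
}

cleaned = strip_notebook(sample)
print(json.dumps(cleaned["cells"][0], sort_keys=True))
```

Re-running the cell changes its prompt number, and may change its output, but the cleaned form stays the same, so git has nothing new to record.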

To use the script, follow the instructions given in the script's docstring.
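
The docstring describes setting the opt-in flag by hand in the notebook's metadata editor. If you have many notebooks, the same flag could be set from Python instead. The sketch below is an assumption and not part of the gist: `enable_output_suppression` is a hypothetical helper, and it assumes the key is spelled `suppress_outputs`, as in the docstring's metadata example.

```python
import json
import os
import tempfile


def enable_output_suppression(path):
    """Set metadata["git"]["suppress_outputs"] = True in the notebook at `path`.

    Hypothetical helper, not part of the gist. It rewrites the file in place.
    """
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    # Create the "metadata" and "git" entries if they are missing.
    notebook.setdefault("metadata", {}).setdefault("git", {})["suppress_outputs"] = True
    with open(path, "w", encoding="utf-8") as f:
        json.dump(notebook, f, indent=1)
        f.write("\n")


# Demonstration on a throwaway file; the notebook content is made up.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.ipynb")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump({"metadata": {"name": ""}, "cells": [], "nbformat": 4, "nbformat_minor": 0}, f)
    enable_output_suppression(demo_path)
    with open(demo_path, encoding="utf-8") as f:
        demo_metadata = json.load(f)["metadata"]

print(demo_metadata)
```

Existing metadata keys are kept as they are; only the `git` entry is added or updated.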

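For further details, read this blogpost: http://pascalbugnion.net/blog/ipython-notebooks-and-git.html.

The procedure outlined here is inspired by this answer on Stack Overflow: http://stackoverflow.com/a/20844506/827862.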
