@jbarratt
Last active April 27, 2023 15:00
'nbgrep', search the code of all your ipython notebooks
#!/bin/bash
# usage: nbgrep 'pattern'
SEARCHPATH=~/work/
# 'jq' technique lifted with gratitude
# from https://gist.github.com/mlgill/5c55253a3bc84a96addf
# Break on newlines instead of any whitespace,
# because IPython Notebook filenames often contain spaces
SAVEIFS=$IFS
IFS=$(echo -en "\n\b")
if ! type mdfind > /dev/null 2>&1; then
    # Use find from findutils
    FILES=$(find "$SEARCHPATH" -name '*.ipynb')
else
    # mdfind uses OS X's Spotlight search, so it's almost instant
    # Generate a list of all the ipynb files in any of the directories
    FILES=$(mdfind -onlyin "$SEARCHPATH" -name '.ipynb')
fi
# On the command line we get the argument to search for
PATTERN=$1
for f in $FILES
do
    # Use 'jq' to filter out only the code in input cells
    # Then remove quoting
    # Colorize it with pygments (give it the most context possible to get color right)
    # And finally, search the remainder for the given pattern
    # "$f" and "$PATTERN" are quoted so globs and spaces are passed through intact
    OUTPUT=$(jq '.worksheets[]?.cells[]? | select(.cell_type=="code") | .input[]?//.input' "$f" \
        | sed 's/^"//g;s/"$//g;s/\\n$//g;s/\\"/"/g;s/\\\\/\\/g;s/\\n/\n/g' \
        | pygmentize -l python 2>/dev/null \
        | grep -- "$PATTERN")
    # If the grep matched anything, print it
    if [ $? -eq 0 ]; then
        echo -e "$f:\n\n$OUTPUT\n\n"
    fi
done
IFS=$SAVEIFS
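
The script above changes IFS so that notebook paths containing spaces survive the `for` loop. A minimal alternative sketch, assuming bash and a `find` that supports `-print0`, is to pass NUL-delimited paths instead, which leaves IFS alone. The function name `each_notebook` is hypothetical and not part of the original gist.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original gist): run a command once
# per notebook under a directory, passing each path as a single argument.
each_notebook() {
    local searchpath=$1
    shift
    local f
    # find -print0 and read -d '' delimit paths with NUL bytes,
    # so spaces inside a path never split it into several words
    while IFS= read -r -d '' f; do
        "$@" "$f"
    done < <(find "$searchpath" -name '*.ipynb' -print0)
}
```

For example, `each_notebook ~/work/ basename` would print the file name of every notebook it finds, one per line.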
@gmorain

gmorain commented May 30, 2016

Updating the do...done code as follows includes v4 (Jupyter) notebooks in the search results:

    # Check Notebook JSON format first
    NB_VERSION=$(jq '.nbformat' "$f")

    # Quote the variables so an empty version or a path with spaces does not break the tests
    if [ "$NB_VERSION" -eq 3 ]; then
        # IPython notebook JSON format
        OUTPUT=$(jq '.worksheets[]?.cells[]? | select(.cell_type=="code") | .input[]?//.input' "$f" \
            | sed 's/^"//g;s/"$//g;s/\\n$//g;s/\\"/"/g;s/\\\\/\\/g;s/\\n/\n/g' \
            | pygmentize -l python 2>/dev/null \
            | grep -- "$PATTERN")

    elif [ "$NB_VERSION" -eq 4 ]; then
        # Jupyter notebook JSON format
        OUTPUT=$(jq '.cells[]? | select(.cell_type=="code") | .source[]?//.source' "$f" \
            | sed 's/^"//g;s/"$//g;s/\\n$//g;s/\\"/"/g;s/\\\\/\\/g;s/\\n/\n/g' \
            | pygmentize -l python 2>/dev/null \
            | grep -- "$PATTERN")
    fi
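
The two branches above differ only in where the cells live (`.worksheets[].cells` versus `.cells`) and in the field name (`.input` versus `.source`). As a rough sketch, a single jq filter can try both layouts, and `jq -r` prints raw strings so the `sed` unquoting step is not needed. The function name `nb_code` is hypothetical, and the sketch assumes a jq version that supports the `?` and `//` operators already used in this thread.

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): print the source of every code
# cell in a notebook, whichever of the two layouts above it uses.
nb_code() {
    # -r emits raw strings, so no sed pass is needed to strip JSON quoting
    jq -r '
        (.worksheets[]?.cells[]?, .cells[]?)   # v3 cells first, then v4 cells
        | select(.cell_type == "code")
        | (.input // .source)                  # v3 stores code in .input, v4 in .source
        | if type == "array" then join("") else . end
    ' "$1"
}
```

The result can then be piped on as before, for example `nb_code "$f" | pygmentize -l python 2>/dev/null | grep -- "$PATTERN"`.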

@D3f0

D3f0 commented Mar 30, 2017

Is there something like this in a pip installable form?

@nishadhka

Thanks @jbarratt and @gmorain, it just works in Ubuntu, a take

@cramjaco

cramjaco commented Mar 2, 2018

When I try to use this script (on Ubuntu Linux), I get a lot of repeats of the following. Any idea what gives?

I run ~/Programs/nbgrep.sh "phylo_join"

And get lots of repeats of this:

    3 compile errors
    error: Invalid character
    .worksheets[]?.cells[]? | select(.cell_type=="code") | .input[]?//.input
    ^
    error: Invalid character
    .worksheets[]?.cells[]? | select(.cell_type=="code") | .input[]?//.input
    ^
    error: Invalid character
    .worksheets[]?.cells[]? | select(.cell_type=="code") | .input[]?//.input
    ^
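
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the filter contains three `?` operators and jq reports three errors, which is what a jq build that does not recognize `?` might print. A quick probe, using the hypothetical function name `jq_supports_optional`, checks whether the installed jq accepts that syntax at all:

```shell
#!/bin/bash
# Hypothetical check (not from the thread): succeeds only if the installed
# jq can compile and run a filter that uses the `?` operator.
jq_supports_optional() {
    # -n runs the filter on null input, so no notebook file is needed
    jq -n '.worksheets[]?' > /dev/null 2>&1
}
```

If `jq_supports_optional` fails while `jq --version` still works, trying a newer jq would be a reasonable next step.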

@vinayak-mehta

Hey everyone, please try out nbcommands, it has an nbgrep command too! And it can simply be installed using pip.

Would love to add any jq feature that it might be missing, please open an issue on the repo for that :)
