@ngregoire
Last active December 15, 2023 01:17
Grep through PDF files
#!/bin/bash
# Usage: grep-pdf ROOT_DIR PATTERN [OPTIONS...]
# Search the PDF files below $ROOT_DIR for text matching $PATTERN
# Any OPTIONS are passed on to pdfgrep (e.g. grep-pdf . 'some words' -h -C5)
# ROOT_DIR
if [ -z "$1" ]; then
    # Report errors on stderr so they do not mix with search results
    echo "! Argument ROOT_DIR is needed!" >&2
    exit 1
else
    ROOT_DIR="$1"
fi
# PATTERN
if [ -z "$2" ]; then
    echo "! Argument PATTERN is needed!" >&2
    exit 1
else
    PATTERN="$2"
fi
# OPTIONS
# Kept in an array so that options containing spaces survive intact
OPTIONS=(-H -n --warn-empty)
if [ -n "$3" ]; then
    # Pass all remaining arguments
    shift 2
    OPTIONS+=("$@")
fi
# DO THE STUFF
echo "# Searching for '${PATTERN}' in PDF files below $ROOT_DIR (options '${OPTIONS[*]}')"
# Quote ROOT_DIR so that directory names containing spaces work
find "$ROOT_DIR" -name "*.pdf" -print0 | xargs -0 pdfgrep "${OPTIONS[@]}" "$PATTERN"
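
Passing extra options through to pdfgrep only works reliably if each argument stays a separate word. The following is a minimal sketch of the difference between flattening the remaining arguments into one string and keeping them as a list. The helpers count_args and demo are hypothetical and exist only for this illustration; the sketch does not call pdfgrep itself.

```shell
#!/bin/sh
# Illustrative sketch only: count_args and demo are hypothetical helpers,
# not part of the grep-pdf script.

# Print how many separate arguments were received
count_args() { echo "$#"; }

demo() {
    # Drop ROOT_DIR and PATTERN, leaving only the extra options
    shift 2

    # Flattened: every argument is joined into one space-separated string
    flat="-H -n $*"
    # Unquoted on purpose: the shell re-splits the string on whitespace
    n_flat=$(count_args $flat)

    # Preserved: prepend the defaults and keep each argument as its own word
    set -- -H -n "$@"
    n_list=$(count_args "$@")

    echo "flattened: $n_flat arguments, preserved: $n_list arguments"
}

demo . 'some words' --password 'two words' -C5
```

In the flattened form the quoted value 'two words' arrives as two separate arguments, so the option it belongs to would receive only 'two'. In bash, an array expanded as "${OPTIONS[@]}" preserves the boundaries in the same way as the positional parameters do here.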