@tsunghanlin
Last active December 4, 2015 01:54
Find text in PDF files under a given path
#!/bin/bash
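# Print usage to stderr and exit with a non-zero status.
usage ()
{
    echo "Usage: ./find_text_in_pdf.sh pathname pattern" >&2
    exit 1
}
# Require exactly two arguments: the search root and the pattern.
if [ "$#" -ne 2 ]; then
    usage
fi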
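PATTERN=$2
DIR=
BLUE='\033[0;34m'
BROWN='\033[0;33m'
# no color
NC='\033[0m'
# Read NUL-delimited paths so file names containing spaces survive intact.
while IFS= read -r -d '' pdffile
do
    # Print a directory header whenever the directory changes.
    if [ "$(dirname "${pdffile}")" != "${DIR}" ]; then
        DIR=$(dirname "${pdffile}")
        printf "${BLUE}IN DIR %s${NC}\n" "${DIR}"
    fi
    printf "[In File \"${BROWN}%s${NC}\"]\n" "$(basename "${pdffile}")"
    # Extract the text to stdout and search it case-insensitively.
    pdftotext "${pdffile}" - | grep -i --color -e "${PATTERN}"
done < <(find "$1" -name '*.pdf' -print0)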
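
The script above prints a header for every PDF it visits, whether or not the file contains a match. When only the names of matching files are wanted, a variant along the following lines may be handy. This is a minimal sketch, not part of the original script: the function name `pdf_files_matching` is hypothetical, and it assumes `pdftotext` is on the `PATH`.

```bash
#!/bin/bash
# Hypothetical helper (not in the original script): print only the paths
# of PDFs whose extracted text matches the pattern, case-insensitively.
pdf_files_matching() {
    local dir=$1 pattern=$2 pdffile
    # NUL-delimited paths keep file names with spaces intact.
    while IFS= read -r -d '' pdffile; do
        # grep -q prints nothing and succeeds on the first match.
        if pdftotext "$pdffile" - 2>/dev/null | grep -qi -e "$pattern"; then
            printf '%s\n' "$pdffile"
        fi
    done < <(find "$dir" -name '*.pdf' -print0)
}
```

After sourcing the function, a call such as `pdf_files_matching ~/papers 'markov chain'` (an illustrative path and pattern) would list one matching path per line, which is easy to pipe into other tools.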