@lanrat
Created April 17, 2014 02:31
script to crop PDFs with briss
#!/usr/bin/env bash
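#directory containing this script
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
#briss jar is expected in a subdirectory next to this script
BRISS="${DIR}/briss-0.9-sakin-1/briss-0.9.jar"
#command -v is a shell builtin and more portable than which
JPATH="$(command -v java)"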
#get list of all non-cropped PDFs (array keeps paths with spaces intact; mapfile needs bash 4+)
mapfile -t pdfs < <(find "${DIR}" -type f -name "*.pdf" ! -name "*_cropped.pdf")
#number of documents cropped and failed
count=0
fail=0
for pdf in "${pdfs[@]}"
do
    #define some variables
    fullname=$(basename "${pdf}")
    filedir=$(dirname "${pdf}")
    filename="${fullname%.*}"
    filecropped="${filedir}/${filename}_cropped.pdf"
    #check to see if cropped file does not exist
    if [ ! -s "${filecropped}" ]
    then
        echo "Cropping ${filename}"
        #crop the pdf and indent the output ($'...' passes a real tab, which also works with non-GNU sed)
        "${JPATH}" -jar "${BRISS}" -2 -s "${pdf}" -d "${filecropped}" | sed -e $'s/^/\t/'
        if [ -s "${filecropped}" ]
        then
            ((count++))
        else
            echo "Failed: ${filename}"
            ((fail++))
            #remove an empty output file if one was left behind
            rm -f "${filecropped}"
        fi
    fi
done
if [ "${count}" -gt 0 ]
then
    echo "${count} Documents Cropped!"
else
    echo "No Documents Cropped!"
fi
if [ "${fail}" -gt 0 ]
then
    echo "${fail} Documents Failed!"
fi
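
The output name used by the script above comes from three shell expansions: `basename`, `dirname`, and `${fullname%.*}`. The following is a minimal sketch of that derivation on its own, which may help when checking how paths with spaces or extra dots are handled. The function name `cropped_path` and the sample path are hypothetical and not part of the original script.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the original script): maps a PDF path to the
# "<name>_cropped.pdf" path beside it, using the same expansions as the loop.
cropped_path() {
    local pdf="$1"
    local fullname filedir filename
    fullname=$(basename "${pdf}")
    filedir=$(dirname "${pdf}")
    # %.* strips only the last extension, so "report.v2.pdf" becomes "report.v2"
    filename="${fullname%.*}"
    printf '%s\n' "${filedir}/${filename}_cropped.pdf"
}

# Sample path is an assumption, chosen to include a space and an extra dot
cropped_path "/tmp/my docs/report.v2.pdf"
# prints: /tmp/my docs/report.v2_cropped.pdf
```

Because only the final extension is removed, a file such as `report.v2.pdf` keeps its `.v2` part in the cropped name.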