@azizasm
Created April 6, 2018 10:52
WSJ pdf watermark remover
#!/bin/bash
# Usage: removePdfWatermark-WSJ2.sh <file.pdf>
if [[ $# -ne 1 || ! -f "$1" ]]; then
echo "Usage: $0 <file.pdf>" >&2
exit 1
fi
bfile=$(basename "$1")
#cd /home/osboxes/ZysProj/pdf
# the directory of the script
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
#echo "the directory of the script"
#echo $DIR
#cd $DIR
# the temp directory used, within $DIR
# omit the -p parameter to create the temporary directory in the default location
WORK_DIR=$(mktemp -d -p "$DIR")
# check if tmp dir was created
if [[ ! "$WORK_DIR" || ! -d "$WORK_DIR" ]]; then
echo "Could not create temp dir"
exit 1
fi
#echo "tmp dir"
#echo $WORK_DIR
cp "$1" "$WORK_DIR"
echo "input file name = $bfile"
#bfilename=$(basename $filename)
# Un-compress
pdftk "$WORK_DIR/$bfile" output "$WORK_DIR/doc.unc.pdf" uncompress
# remove watermark in uncompressed file
echo "remove watermark in uncompressed file..."
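# single quotes keep the shell from touching the backslashes; \\\( matches the literal \( that escapes a parenthesis inside a PDF string
sed -E -e 's/For personal,|non-commercial use only\.|Do not edit, alter or reproduce\. For commercial reproduction or distribution, contact Dow Jones Reprints & Licensing at \\\(800\\\) 843-0008 or|www\.djreprints\.com//g' < "$WORK_DIR/doc.unc.pdf" > "$WORK_DIR/tmp.pdf"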
#Compress pdf
echo "Compress pdf..."
pdftk "$WORK_DIR/tmp.pdf" output "$WORK_DIR/$bfile" compress
#copy file back
cp "$WORK_DIR/$bfile" /media/HostDownload/
echo "New file has been created: /media/HostDownload/$bfile"
# cleanup: delete the temp directory (and the intermediate PDFs in it) when the script exits
function cleanup {
rm -rf "$WORK_DIR"
echo "Deleted temp working directory $WORK_DIR"
}
trap cleanup EXIT
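
A quick way to check the sed expression without a real PDF is to run it on a line of text. The sketch below wraps the same alternation in a helper; the function name `strip_wsj_watermark`, the `LC_ALL=C` setting, and the sample line are assumptions made for illustration and are not part of the script above.

```shell
#!/bin/bash
# Hypothetical helper: reads text on stdin and deletes the WSJ watermark
# phrases, using the same alternation as the script's sed call.
strip_wsj_watermark() {
    # LC_ALL=C asks sed to treat the input as plain bytes. This is an
    # assumption about what is safest for binary PDF data.
    LC_ALL=C sed -E -e 's/For personal,|non-commercial use only\.|Do not edit, alter or reproduce\. For commercial reproduction or distribution, contact Dow Jones Reprints & Licensing at \\\(800\\\) 843-0008 or|www\.djreprints\.com//g'
}

# Made-up sample shaped like a text-showing operator in an uncompressed
# content stream; real streams may split the phrases differently.
sample='(For personal, non-commercial use only.) Tj'
result=$(printf '%s\n' "$sample" | strip_wsj_watermark)
printf '%s\n' "$result"
```

Only the phrases are deleted, so the surrounding operator and an empty-looking string stay in the stream. If the watermark survives in a real file, the text is probably split or encoded differently in that PDF, and the pattern would need adjusting.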