@tavinus
Last active January 27, 2023 01:55
Split each PDF page into a new PDF
#!/bin/bash
################################################################
#
# Gustavo Arnosti Neves
# https://github.com/tavinus
#
# Usage:
#   ./pdfSplit.sh file.pdf
#   ./pdfSplit.sh -c 20 file.pdf   # force the page count if detection fails
#
# Download:
#   wget 'https://gist.githubusercontent.com/tavinus/1a2ae53441b3fa725e93ec043b8b1112/raw/pdfSplit.sh' && chmod +x pdfSplit.sh
#   curl -O -J -L 'https://gist.githubusercontent.com/tavinus/1a2ae53441b3fa725e93ec043b8b1112/raw/pdfSplit.sh' && chmod +x pdfSplit.sh
#
# System install:
#   sudo cp pdfSplit.sh /usr/bin/pdfsplit && sudo chmod +x /usr/bin/pdfsplit
#
################################################################
# Optional "-c <count>" forces the page count instead of detecting it
if [[ "$1" = "-c" ]]; then
    pageCount=$2
    shift ; shift
else
    pageCount=0
fi
inFile="$1"
if [[ ! -f "$inFile" ]]; then
    printf "%s\n" "Error! Invalid input file! $inFile" >&2
    exit 1
fi
if [[ "$pageCount" = "0" ]] || [[ -z "$pageCount" ]]; then
    # Text-based detection (strings + sed) may not work on every PDF:
    # pageCount=$(strings < "$inFile" | sed -n 's|.*/Count -\{0,1\}\([0-9]\{1,\}\).*|\1|p' | sort -rn | head -n 1)
    # Ghostscript implementation.
    # Note: Ghostscript 9.50 and later enable SAFER mode by default, which can block
    # this file read. If detection fails, pass the count with -c.
    pageCount=$(gs -q -dNODISPLAY -dBATCH -sFileName="$inFile" -c "FileName (r) file runpdfbegin pdfpagecount = quit" 2>/dev/null)
fi
# Require a positive integer without leading zeros (bash arithmetic reads 020 as octal)
if [[ ! "$pageCount" =~ ^[1-9][0-9]*$ ]]; then
    printf "%s\n" "Error! Invalid page count! $pageCount" >&2
    exit 1
fi
LOGFILE="${inFile}_split.log"
# Ten spaces; padNum slices its left padding from this string
PAD="          "
ret=0
# Number of digits in the largest page number
pageChars=${#pageCount}
#gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER -dFirstPage=14 -dLastPage=17 -sOutputFile=OUTPUT.pdf
# Write page $1 of $inFile to <name>_P<page>.pdf and add gs's exit status to $ret
gsSplit() {
    gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER -dFirstPage="$1" -dLastPage="$1" -sOutputFile="${inFile%.pdf}_P$1.pdf" "$inFile" >> "$LOGFILE" 2>&1
    ((ret+=$?))
}
# Left-pad a page number with spaces to the width of the largest page number
padNum() {
    local diff="$((pageChars-${#1}))"
    printf "%s" "${PAD:0:$diff}$1"
}
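printf "\n%s\n%s\n\n" "Processing $pageCount pages in" "$inFile"
for ((i=1 ; i<=pageCount ; i++)); do
    printf "%s " "$(padNum "$i")"
    # Break the progress line after every 10th page
    if (( i % 10 == 0 )); then
        printf "\n"
    fi
    gsSplit "$i"
done
# Terminate the last progress line if the loop did not already do so
(( (i-1) % 10 != 0 )) && printf "\n"
printf "\n%s\n%s\n" "All done!" "$((i-1)) files processed!"
# Exit statuses wrap at 256, so cap the accumulated error count
(( ret > 255 )) && ret=255
exit $ret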

tavinus commented Oct 3, 2018

To force a page count:

./pdfSplit.sh -c <page_count> file.pdf

Example for 20 pages on a 71-page file:

$ ./pdfSplit.sh -c 20 rpt_list_rfer_completo_indiv1porfolha.pdf

Processing 20 pages in
rpt_list_rfer_completo_indiv1porfolha.pdf

 1  2  3  4  5  6  7  8  9 10
11 12 13 14 15 16 17 18 19 20

All done!
20 files processed!
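
Each page is written next to the input file with a `_P<page>` suffix, following the `-sOutputFile` pattern in `gsSplit`. A minimal sketch of that naming rule, where `outName` is a hypothetical helper used only for illustration and is not part of the script:

```shell
#!/bin/bash
# outName FILE PAGE
# Hypothetical helper (not in pdfSplit.sh). It mirrors the -sOutputFile
# pattern used by gsSplit: strip a trailing ".pdf", then append _P<page>.pdf.
outName() {
    local inFile="$1" page="$2"
    printf '%s\n' "${inFile%.pdf}_P${page}.pdf"
}

outName "rpt_list_rfer_completo_indiv1porfolha.pdf" 1    # rpt_list_rfer_completo_indiv1porfolha_P1.pdf
outName "rpt_list_rfer_completo_indiv1porfolha.pdf" 20   # rpt_list_rfer_completo_indiv1porfolha_P20.pdf
```

The suffix match is case-sensitive, so an input named `scan.PDF` keeps its extension and becomes `scan.PDF_P1.pdf`.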

If the forced count is larger than the real number of pages, each page that does not exist still generates a blank PDF, and no error is reported.
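
One way to avoid those blank files would be to cap a forced count at the document's real page count. A minimal sketch, assuming the real count is already known (for example from the Ghostscript query in the script); `clampCount` is a hypothetical helper and is not part of pdfSplit.sh:

```shell
#!/bin/bash
# clampCount REQUESTED ACTUAL
# Hypothetical helper: prints the smaller of the two counts, so a forced
# -c value never runs past the last real page.
clampCount() {
    local requested="$1" actual="$2"
    if (( requested > actual )); then
        printf '%s\n' "$actual"
    else
        printf '%s\n' "$requested"
    fi
}

clampCount 100 71   # prints 71
clampCount 20 71    # prints 20
```

This only helps when page detection works. When detection fails, which is the case `-c` exists for, the forced count has to be trusted as given.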
