Last active January 27, 2023 01:55
Split each PDF page into a new PDF
#!/bin/bash
################################################################
#
# Gustavo Arnosti Neves
# https://github.com/tavinus
#
# Usage ./pdfSplit.sh file.pdf
#       ./pdfSplit.sh -c 20 file.pdf # if page count fails
#
# Download
# wget 'https://gist.githubusercontent.com/tavinus/1a2ae53441b3fa725e93ec043b8b1112/raw/pdfSplit.sh' && chmod +x pdfSplit.sh
# curl -O -J -L 'https://gist.githubusercontent.com/tavinus/1a2ae53441b3fa725e93ec043b8b1112/raw/pdfSplit.sh' && chmod +x pdfSplit.sh
#
# Sys Install
# sudo cp pdfSplit.sh /usr/bin/pdfsplit && sudo chmod +x /usr/bin/pdfsplit
#
################################################################
# Optional "-c <count>" forces the page count and skips auto-detection
if [[ "$1" = "-c" ]]; then
    pageCount=$2
    shift ; shift
else
    pageCount=0
fi

inFile="$1"
if [[ ! -f "$inFile" ]]; then
    printf "%s\n" "Error! Invalid input file! $inFile" >&2
    exit 1
fi

if [[ "$pageCount" = "0" ]] || [[ -z "$pageCount" ]]; then
    # grep may not work sometimes
    # pageCount=$(strings < "$inFile" | sed -n 's|.*/Count -\{0,1\}\([0-9]\{1,\}\).*|\1|p' | sort -rn | head -n 1)
    # Ghostscript implementation
    pageCount=$(gs -q -dNODISPLAY -dBATCH -sFileName="$inFile" -c "FileName (r) file runpdfbegin pdfpagecount = quit" 2>/dev/null)
fi

# Reject an empty, non-numeric, or zero page count
if [[ ! "$pageCount" =~ ^[0-9]+$ ]] || [[ $pageCount -eq 0 ]]; then
    printf "%s\n" "Error! Invalid page count! $pageCount" >&2
    exit 1
fi
LOGFILE="$inFile""_split.log"
ret=0
# Number of digits in the page count, used to align the progress output
pageChars=${#pageCount}

#gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER -dFirstPage=14 -dLastPage=17 -sOutputFile=OUTPUT.pdf

# Extract page $1 into its own PDF; Ghostscript output is appended to the log
gsSplit() {
    gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER -dFirstPage="$1" -dLastPage="$1" -sOutputFile="${inFile%.pdf}_P$1.pdf" "$inFile" >> "$LOGFILE" 2>&1
    # Flag the failure instead of summing exit codes, which could wrap past 255
    [[ $? -eq 0 ]] || ret=1
}

# Print $1 right-aligned to the width of the page count
padNum() {
    printf "%*s" "$pageChars" "$1"
}
printf "\n%s\n%s\n\n" "Processing $pageCount pages in" "$inFile"

for ((i=1 ; i<=pageCount ; i++)); do
    printf "%s " "$(padNum "$i")"
    # Break the progress line after every 10 pages
    if [[ $((i%10)) -eq 0 ]]; then
        printf "\n"
    fi
    gsSplit "$i"
done

# Close the last progress line if it did not end on a multiple of 10
[[ $(((i-1)%10)) -ne 0 ]] && printf "\n"
printf "\n%s\n%s\n" "All done!" "$((i-1)) files processed!"
exit $ret
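
The script only checks whether the page count is empty or zero, so a forced value such as `2x` can slip through to the loop. A stricter check could look like the sketch below. The function name `is_valid_count` is hypothetical and not part of pdfSplit.sh, and the function is written in plain POSIX shell so it can be pasted into the script or tried on its own.

```shell
# Hypothetical helper, not part of pdfSplit.sh.
# Succeeds only when $1 consists solely of digits and is greater than zero.
is_valid_count() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *[1-9]*)     return 0 ;;   # all digits, at least one of them non-zero
        *)           return 1 ;;   # only zeros
    esac
}
```

One possible use is `is_valid_count "$pageCount" || exit 1`, placed just before the split loop.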
To force a page count (useful when automatic detection fails, or to split only the first pages):
./pdfSplit.sh -c <page_count> file.pdf
Example forcing 20 pages on a 71-page file:
$ ./pdfSplit.sh -c 20 rpt_list_rfer_completo_indiv1porfolha.pdf
Processing 20 pages in
rpt_list_rfer_completo_indiv1porfolha.pdf
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
All done!
20 files processed!
If the forced count is larger than the real page count, each page that does not exist still produces a blank PDF, and no error is reported.
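
One way to avoid those blank files would be to cap the forced count at the detected count whenever detection works. The sketch below assumes both values are plain non-negative integers. `clamp_count` is a hypothetical helper, not something pdfSplit.sh provides.

```shell
# Hypothetical helper, not part of pdfSplit.sh.
# $1 = page count requested with -c
# $2 = page count detected by Ghostscript (empty or 0 if detection failed)
# Prints the smaller of the two, or the requested count when detection
# failed, since that is the situation -c exists for.
clamp_count() {
    requested=$1
    detected=${2:-0}
    if [ "$detected" -gt 0 ] && [ "$requested" -gt "$detected" ]; then
        printf '%s\n' "$detected"
    else
        printf '%s\n' "$requested"
    fi
}
```

With this, `clamp_count 90 71` prints 71, while `clamp_count 20 71` leaves the request at 20.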
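Files will have the same name as the input PDF with a page number suffix, as in rpt_list_rfer_completo_indiv1porfolha_P1.pdf, rpt_list_rfer_completo_indiv1porfolha_P2.pdf, and so on.
The Ghostscript log will be next to the input file, with _split.log appended to the full input name, as in rpt_list_rfer_completo_indiv1porfolha.pdf_split.log.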
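
The naming rules can be sketched as two small shell functions that mirror the parameter expansions used in the script (`${inFile%.pdf}_P$1.pdf` and `"$inFile""_split.log"`). The function names are hypothetical and exist only for illustration.

```shell
# Illustrative helpers mirroring the expansions in pdfSplit.sh;
# the function names are hypothetical.

# $1 = input file, $2 = page number
# Strips a trailing ".pdf" (case-sensitive), then appends _P<page>.pdf
split_output_name() {
    printf '%s\n' "${1%.pdf}_P$2.pdf"
}

# $1 = input file
# The log keeps the full input name, extension included
split_log_name() {
    printf '%s\n' "${1}_split.log"
}
```

Because the suffix match is case-sensitive, an input named scan.PDF keeps its extension and page 3 is written to scan.PDF_P3.pdf.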