@gloriousDan
Last active March 20, 2023 22:50
PDF concat with section rewrites

Concat PDFs

This script concatenates PDFs and preserves their sections (bookmarks), as pdftk normally does.

Additionally, it adds each PDF's title (or its file name, if it has no title) to the merged PDF as a top-level section that points at the first page of that PDF.
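The title lookup can be tried without pdftk on a hand-written fragment in the style of `pdftk data_dump` output. This is a minimal sketch, assuming the dump stores the title as an `InfoKey: Title` line followed by an `InfoValue:` line, which is what the script greps for. The helper name `extract_title` and the sample values are made up for illustration, and the sketch leaves out the script's extra step of dropping everything up to the first `-` in the title.

```shell
#!/bin/bash
# extract_title DUMP_FILE FALLBACK  (hypothetical helper, not part of the script)
# Prints the title found in a data_dump-style file, or FALLBACK if there is none.
extract_title() {
    local title
    title=$(grep -A 1 "InfoKey: Title" "$1" | grep InfoValue | cut -d " " -f 2-)
    echo "${title:-$2}"
}

dump=$(mktemp)

# Sample dump with a title (hand-written, illustrative values).
printf '%s\n' "InfoBegin" "InfoKey: Title" "InfoValue: Lecture 3" "NumberOfPages: 12" > "$dump"
with_title=$(extract_title "$dump" "lecture3.pdf")

# Sample dump without a title: the file name is used instead.
printf '%s\n' "InfoBegin" "InfoKey: Producer" "InfoValue: pdfTeX" "NumberOfPages: 12" > "$dump"
without_title=$(extract_title "$dump" "lecture3.pdf")

rm -f "$dump"
echo "$with_title"
echo "$without_title"
```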

All other sections are then moved down one level, so the section hierarchy within each document is preserved.
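The effect on the bookmark records can be seen on a small hand-written sample, again without running pdftk. This is a sketch, assuming bookmarks appear in the dump as `BookmarkBegin` / `BookmarkTitle` / `BookmarkLevel` / `BookmarkPageNumber` records, which are the fields the script reads and writes. The titles and page numbers are invented for the example.

```shell
#!/bin/bash
# Hand-written sample of the bookmark part of a dump (illustrative values).
sample='BookmarkBegin
BookmarkTitle: Introduction
BookmarkLevel: 1
BookmarkPageNumber: 1
BookmarkBegin
BookmarkTitle: Motivation
BookmarkLevel: 2
BookmarkPageNumber: 2'

# Emit the new top-level bookmark first, then the existing bookmarks
# with every BookmarkLevel increased by one.
shifted=$(
    printf '%s\n' "BookmarkBegin" "BookmarkTitle: Lecture 3" \
                  "BookmarkLevel: 1" "BookmarkPageNumber: 1"
    printf '%s\n' "$sample" |
        awk '$1 == "BookmarkLevel:" { print $1, $2 + 1; next } { print }'
)
echo "$shifted"
```

In the result, "Lecture 3" is the only level 1 entry, "Introduction" moves to level 2, and "Motivation" moves to level 3.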

#!/bin/bash
# Arguments: $1 = input prefix (every "$1"*.pdf is merged, in version-sorted order)
#            $2 = output PDF
PREFIX=$1
OUTPUT=$2
TMP="/tmp/pdfcat"
rm -rf "$TMP"
mkdir -p "$TMP"
# Reading ls output line by line copes with spaces in file names (but not newlines).
mapfile -t INPUTS < <(ls -v -- "$PREFIX"*.pdf)
for i in "${INPUTS[@]}"; do
    echo "$i"
    # Use the bare file name for temp files so a directory in the prefix does not break the paths.
    NAME=$(basename "$i")
    DUMP="$TMP/$NAME.txt"
    INFO="$TMP/${NAME}2.txt"
    pdftk "$i" data_dump output "$DUMP"
    # Take the PDF's title and drop everything up to the first "-", if there is one.
    TITLE=$(grep -A 1 "InfoKey: Title" "$DUMP" | grep InfoValue | cut -d " " -f 2-)
    TITLE=$(echo "$TITLE" | cut -d "-" -f 2-)
    # Fall back to the file name when the PDF has no title.
    TITLE=${TITLE:-$NAME}
    FIRST_BOOKMARK=$(grep -n BookmarkBegin "$DUMP" | head -n 1 | cut -d: -f1)
    # No bookmarks at all: place the new bookmark after the last line of the dump.
    : "${FIRST_BOOKMARK:=$(( $(wc -l < "$DUMP") + 1 ))}"
    {
        # Everything before the first bookmark is copied unchanged.
        head -n "$((FIRST_BOOKMARK - 1))" "$DUMP"
        # New top-level bookmark pointing at the first page of this PDF.
        echo "BookmarkBegin"
        echo "BookmarkTitle: $TITLE"
        echo "BookmarkLevel: 1"
        echo "BookmarkPageNumber: 1"
        # Move every existing bookmark down one level.
        tail -n "+$FIRST_BOOKMARK" "$DUMP" | awk '{ if ($1 == "BookmarkLevel:") print $1 " " $2 + 1; else print $0 }'
    } > "$INFO"
    pdftk "$i" update_info "$INFO" output "$TMP/$NAME"
done
mapfile -t PARTS < <(ls -v -- "$TMP"/*.pdf)
pdftk "${PARTS[@]}" cat output "$OUTPUT"