stefanschmidt/remove-annotations.sh

Last active February 1, 2024 17:01


Remove all annotations from a PDF document

remove-annotations.sh

	pdftk original.pdf output uncompressed.pdf uncompress
	LANG=C sed -n '/^\/Annots/!p' uncompressed.pdf > stripped.pdf
	pdftk stripped.pdf output final.pdf compress
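
To see what the middle step does in isolation, here is a minimal sketch that runs the same sed filter over a hand-written fragment. The fragment is hypothetical, not real pdftk output; it assumes the uncompressed file puts each dictionary key on its own line, which is what the `^` anchor relies on.

```shell
# Hypothetical fragment shaped like one page object in an uncompressed PDF.
sample='5 0 obj
<<
/Type /Page
/Annots [12 0 R 13 0 R]
/Contents 6 0 R
>>
endobj'

# Same filter as the script: print every line except those starting with /Annots.
filtered=$(printf '%s\n' "$sample" | LANG=C sed -n '/^\/Annots/!p')
printf '%s\n' "$filtered"
```

Only the `/Annots [...]` line is dropped; every other line passes through unchanged.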

swenson commented Oct 21, 2013

Sweet! This worked great for me.

Gabriel-p commented Sep 16, 2016

Thanks for sharing this script, it worked perfectly!

JohnRobson commented Mar 24, 2017

Thank you

dvska commented Apr 24, 2017

sed: RE error: illegal byte sequence
hmmm

naveendennis commented May 9, 2017

Works perfectly! Thank you

wenlibin02 commented Jun 20, 2017 (edited)

It works, thank you very much.
I would suggest using LANG=C sed -i '/^\/Annots/d' uncompressed.pdf to modify the uncompressed.pdf in place.

Dabolus commented Mar 28, 2018

For those getting the Illegal byte sequence error, try adding LC_CTYPE=C at the beginning of the second command as well

e.g. LANG=C LC_CTYPE=C sed -n '/^\/Annots/!p' uncompressed.pdf > stripped.pdf
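
A related sketch, not taken from the comment above: `LC_ALL=C` overrides both `LANG` and `LC_CTYPE`, so it can stand in for setting the two separately. The input is a made-up three-line sample containing a byte (octal 377) that is not valid UTF-8, which is the kind of data that can trigger the error.

```shell
# Made-up input: a line with a non-UTF-8 byte, an /Annots line, an ordinary line.
tmp=$(mktemp -d)
printf '\377binary\n/Annots [1 0 R]\nkeep me\n' > "$tmp/sample.bin"

# In the C locale sed treats the input as raw bytes instead of decoding it.
LC_ALL=C sed '/^\/Annots/d' "$tmp/sample.bin" > "$tmp/sample.out"

# Count the lines that survived the filter.
remaining=$(LC_ALL=C grep -c '' "$tmp/sample.out")
echo "lines left: $remaining"
rm -rf "$tmp"
```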

rafaelbeirigo commented Apr 1, 2018

Thanks!

hruzee commented Aug 8, 2018

Thank you! Worked great 👍

mstrauss commented Sep 21, 2018

Just be aware that the annotation text (if there is any) remains in the file. It's just not visible any more.
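
A minimal sketch of why this happens, using a hypothetical hand-written fragment rather than a real PDF: the filter removes only the page's `/Annots` reference line, so the annotation object itself, including its `/Contents` text, is still present after the sed step. The object numbers and the note text are made up for illustration, and the sketch says nothing about what the final pdftk compress pass does with such unreferenced objects.

```shell
# Hypothetical page object followed by the annotation object it points to.
sample='5 0 obj
<<
/Type /Page
/Annots [12 0 R]
>>
endobj
12 0 obj
<<
/Type /Annot
/Subtype /Text
/Contents (secret reviewer note)
>>
endobj'

stripped=$(printf '%s\n' "$sample" | LANG=C sed -n '/^\/Annots/!p')

# Count the remaining reference lines and look for the annotation text.
refs=$(printf '%s\n' "$stripped" | grep -c '^/Annots' || true)
leftover=$(printf '%s\n' "$stripped" | grep 'reviewer note' || true)
echo "reference lines: $refs"
echo "leftover text:   $leftover"
```

On a real file, something like `grep -a '/Contents' stripped.pdf` is one way to look for leftover text, assuming the annotation strings are stored unencoded.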

KlaraSaary commented Jan 3, 2019

Thanks! Here is a for loop for cleaning several PDFs:
for file in *.pdf; do pdftk "$file" output un.pdf uncompress; LANG=C sed -n '/^\/Annots/!p' un.pdf > str.pdf; pdftk str.pdf output "$file" compress; echo "Done $file"; done

davidcesarino commented Mar 26, 2020

This did not work for me. I had to remove all "/Type /Annot" commands as well in sed for the yellow annotations to disappear.

faridcher commented Sep 27, 2020

A faster (in-memory) way is to use a shell pipeline:

pdftk in.pdf output - uncompress | sed '/^\/Annots/d' | pdftk - output out.pdf compress

nisarkhanatwork commented Nov 25, 2020

Thank you! It is cool!

hdatma commented Jan 27, 2021

Be aware that pdftk requires gcj, which was deprecated in 2017. This is old software that needs to be updated.

chrisgrieser commented Jul 2, 2021

For the convenience of anyone finding this via Google like me: this is the code to remove all annotations from all PDFs in a directory.

# these are needed on Mac
export LC_CTYPE=C
export LANG=C

# cd /directory/with/pdfs/

for file in *.pdf
do
    # insert an underscore before the extension: foo.pdf -> foo_.pdf
    outname="${file%.pdf}_.pdf"
    # quote the names so files with spaces survive
    pdftk "$file" output - uncompress | sed '/^\/Annots/d' | pdftk - output "$outname" compress
    echo "$file: done"
done
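
One thing to watch with the loop above: because outputs land in the same directory with a `_.pdf` suffix, a second run would pick them up as inputs too. Below is a sketch of a dry run that only prints the input/output pairs and skips files that already end in `_.pdf`. The skip rule and the `plan_outputs` function name are my own additions, not part of the original comment.

```shell
# Print "input -> output" for each PDF in a directory, without calling pdftk.
plan_outputs() {
    dir=$1
    for file in "$dir"/*.pdf; do
        [ -e "$file" ] || continue      # no PDFs: the glob stayed literal
        case $file in
            *_.pdf) continue ;;         # assumed to be output of an earlier run
        esac
        printf '%s -> %s\n' "$file" "${file%.pdf}_.pdf"
    done
}

# Usage with throwaway files (names are illustrative):
tmp=$(mktemp -d)
touch "$tmp/a.pdf" "$tmp/a_.pdf" "$tmp/my notes.pdf"
plan=$(plan_outputs "$tmp")
printf '%s\n' "$plan"
rm -rf "$tmp"
```

Once the printed plan looks right, the `printf` line could be swapped for the pdftk pipeline.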

toehold commented Dec 2, 2021

Is it possible to reduce the opacity?

muelli commented Jun 3, 2022

This leaves me with a PDF with a broken xref table :(

pdfcpu annotations remove my.pdf

works reasonably well :)
https://pdfcpu.io/annot/annot

jfines commented Aug 23, 2023

Still works!
