@DominikPeters
Created February 24, 2022 23:45
Remove JSTOR watermark from PDF
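# First install pdftk. For recent macOS versions, get it here: https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/pdftk_server-2.02-mac_osx-10.11-setup.pkg
import subprocess

# Uncompress the PDF so the watermark text shows up as plain bytes in the content streams.
subprocess.run(["pdftk", "input.pdf", "output", "uncompressed.pdf", "uncompress"], check=True)

patterns = [
    b"This content downloaded from",
    b"0.0.0.0",  # enter displayed IP address here
    b"All use subject to",
]

with open("uncompressed.pdf", "rb") as infile, open("fixed.pdf", "wb") as outfile:
    for line in infile:
        # Replace any watermark line with an empty text-show operator.
        # Keep the trailing newline so the next PDF operator stays on its own line.
        if any(pattern in line for pattern in patterns):
            line = b"()Tj\n"
        outfile.write(line)

# Recompress the cleaned file.
subprocess.run(["pdftk", "fixed.pdf", "output", "book.pdf", "compress"], check=True)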
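
The line-replacement step can be tried on its own, without pdftk, by running it on an in-memory byte string. The following is a minimal sketch: the helper name `blank_watermark_lines` and the sample bytes are made up for illustration and only imitate what an uncompressed content stream might look like.

```python
import io


def blank_watermark_lines(data, patterns):
    """Return data with every line containing a pattern replaced by an empty Tj."""
    out = []
    # Iterating a BytesIO splits on b"\n", the same way iterating a binary file does.
    for line in io.BytesIO(data):
        if any(pattern in line for pattern in patterns):
            line = b"()Tj\n"
        out.append(line)
    return b"".join(out)


# Hypothetical sample, not taken from a real JSTOR file.
sample = (
    b"BT\n"
    b"(This content downloaded from 0.0.0.0 on Thu)Tj\n"
    b"ET\n"
    b"(Real text)Tj\n"
)
patterns = [b"This content downloaded from", b"0.0.0.0", b"All use subject to"]
cleaned = blank_watermark_lines(sample, patterns)
print(cleaned)  # b'BT\n()Tj\nET\n(Real text)Tj\n'
```

Lines that match none of the patterns pass through byte for byte, so only the watermark text operators are touched.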