
Last active February 4, 2025 11:47
Removes the annoying it-ebook watermarks from downloaded eBooks
#!/usr/bin/env python3
import sys
import re
import shutil
import argparse
import binascii
# Author: Daxda
# Date: 02.04.2014
# WTF: This is a quick tool I've hacked together to easily remove the meta
# information as well as the annoying link on each page of eBooks
# downloaded from the it-ebook site. The modified file will hold the
# original file name, and the original file will be renamed to
# 'original.pdf.old'. 'pattern' is the regex pattern which is used to
# remove the annotation elements, the rough structure of it looks
# like this:
# obj
# <<
# /Type /Annot
# /Subtype /Link
# /Rect [ 264 91 348 79 ] # The digits on this line will differ
# /Border [ 0 0 0 ] # The same goes for the digits on this line
# /A <<
# /Type /Action
# /S /URI
# /URI (
# >>
# >>
# endobj
pattern = b'''0a2f54797065202f416e6e6f740a2f53756274797065202f4c696e6b0a2f52656
f2f290a3e3e'''.replace(b'\n', b'').strip()
def remove_evil_links(pdf_data):
    'Removes all it-ebook links and metadata from the passed PDF data.'
    pdf_data = binascii.hexlify(pdf_data)
    # Remove each annotation element inside the PDF file
    # (This removes the "clickable" links)
    new_data = re.sub(pattern, b'', pdf_data)
    # Remove the actual links
    # (link elements which are assigned to the annotations)
    new_data = new_data.replace(binascii.hexlify(b''), b'')
    return binascii.unhexlify(new_data)
def main(args):
    try:
        # Drop duplicate paths so no file is processed twice
        args.files = list(set(args.files))
        for file_path in args.files:
            if not file_path:
                continue
            if args.verbose:
                print('Processing: {0}'.format(file_path))
            try:
                with open(file_path, 'rb') as input_file:
                    pdf_data = input_file.read()
            except IOError as e:
                sys.stderr.write('{0}: {1}\n'.format(file_path, e.strerror))
                continue
            # Backup the file with a different name
            if not args.no_backup:
                if args.verbose:
                    print('Creating backup: {0}.old'.format(file_path))
                shutil.move(file_path, '{0}.old'.format(file_path))
            # Modify the PDF file
            new_pdf_data = remove_evil_links(pdf_data)
            # Save the new file
            with open(file_path, 'wb') as out_file:
                if args.verbose:
                    print('Saving modified file: {0}'.format(file_path))
                out_file.write(new_pdf_data)
    except KeyboardInterrupt:
        pass
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '-f', '--files',
        help='One or more PDF files to remove it-ebook watermarks.',
        nargs='*', required=True
    )
    parser.add_argument(
        '-n', '--no-backup',
        help='Disables the creating of backups for the files ' +
             'which are being processed.',
        action='store_true'
    )
    parser.add_argument(
        '-v', '--verbose',
        action='store_true'
    )
    args = parser.parse_args()
    main(args)
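
The technique used by remove_evil_links (hex-encode the bytes, run a regex over the hex text, decode again) can be tried on a toy input. This is only a minimal sketch: the sample annotation, the placeholder URL http://example.com/ and the name `toy_pattern` are made up for illustration and are not the author's actual pattern.

```python
import binascii
import re

# Hypothetical sample: one link annotation wrapped in PDF-like bytes.
annotation = (b'\n/Type /Annot\n/Subtype /Link\n/Rect [ 264 91 348 79 ]\n'
              b'/A <<\n/S /URI\n/URI (http://example.com/)\n>>\n>>')
sample = b'1 0 obj\n<<' + annotation + b'\nendobj\n'

# Hex-level regex: fixed prefix, a lazy run of whole bytes, fixed suffix.
prefix = binascii.hexlify(b'\n/Type /Annot\n/Subtype /Link\n')
suffix = binascii.hexlify(b'/URI (http://example.com/)\n>>\n>>')
toy_pattern = prefix + b'(?:[0-9a-f]{2})*?' + suffix

# Encode, strip the annotation, decode back to raw bytes.
hex_data = binascii.hexlify(sample)
cleaned = binascii.unhexlify(re.sub(toy_pattern, b'', hex_data))
print(cleaned)
```

Matching two hex digits at a time in the variable part keeps the match aligned to whole bytes, which is why the sketch uses `[0-9a-f]{2}` rather than a bare `.`.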

reavon commented Mar 5, 2017

Can you create a script that removes watermarks from here

denskiz commented Jul 22, 2017

Any instructions for how to run this script?

bdk907 commented Jan 7, 2018

Absolutely amazing... I just did this via a hex-editor using search & replace, what a pain... This works much easier with less fuss and headless... It should be pretty easy to adapt for other water-marks as well.

@denskiz - from a terminal, in the same directory you downloaded this script, run:

$ python -h
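
For reference, the three flags the script defines can also be exercised from Python. A rough sketch, assuming `-n` and `-v` are plain boolean switches (`action='store_true'`) and using made-up file names:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-f', '--files', nargs='*', required=True)
# action='store_true' is assumed for the two switches.
parser.add_argument('-n', '--no-backup', action='store_true')
parser.add_argument('-v', '--verbose', action='store_true')

# Equivalent to passing: -v -f first.pdf second.pdf
args = parser.parse_args(['-v', '-f', 'first.pdf', 'second.pdf'])
print(args.files, args.no_backup, args.verbose)
```

Without `-n`, the script keeps a backup of each input as `<name>.old` before overwriting it.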

jhorgint commented Apr 3, 2018

I've run the script and it creates a "*.pdf.old" file, but it does not remove any watermark from the original PDF.
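
One way to narrow this down is to check whether the file contains the byte sequences the pattern is built around at all. A minimal sketch; `count_markers` and the marker list are hypothetical helpers, and a PDF that keeps its objects in compressed streams, or spaces the keys differently, may show zero hits even though it has links:

```python
def count_markers(pdf_data, markers=(b'/Subtype /Link', b'/S /URI')):
    """Return how often each marker occurs in the raw PDF bytes."""
    return {marker: pdf_data.count(marker) for marker in markers}


# Toy input standing in for the contents of a real file read in 'rb' mode.
sample = b'<<\n/Type /Annot\n/Subtype /Link\n/A <<\n/S /URI\n>>\n>>'
print(count_markers(sample))
```

If every count is zero for a real file, the regex has nothing to match and the output will be identical to the backup.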

phplaw commented Dec 24, 2018

This also removes the table of contents and the bookmarks of the ebook.

@DonaldTsang great list! Which one do you think works best?
