@dandelin
Created September 18, 2020 00:19
split pdfs
"""
pip install git+https://github.com/dandelin/PyPDF2.git
"""
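from glob import glob

from PyPDF2 import PdfFileReader, PdfFileWriter


def split_pdf(path, p):
    """Split the PDF at `path` into `<name>_0.pdf` (pages 0..p-1) and `<name>_1.pdf` (pages p..end)."""
    # Skip files that are themselves outputs of an earlier run.
    if path.endswith('_0.pdf') or path.endswith('_1.pdf'):
        return

    input_pdf = PdfFileReader(path, strict=False)
    page_number = input_pdf.numPages
    assert p < page_number

    # First part: pages 0 .. p-1.
    pdf0 = PdfFileWriter()
    for page in range(0, p):
        pdf0.addPage(input_pdf.getPage(page))

    # Second part: pages p .. end (starting at p so that no page is dropped).
    pdf1 = PdfFileWriter()
    for page in range(p, page_number):
        pdf1.addPage(input_pdf.getPage(page))

    # path[:-4] strips the trailing '.pdf' extension.
    with open(f'{path[:-4]}_0.pdf', 'wb') as fp:
        pdf0.write(fp)
    with open(f'{path[:-4]}_1.pdf', 'wb') as fp:
        pdf1.write(fp)


if __name__ == '__main__':
    paths = glob('./*.pdf')
    p = 5  # zero-based index of the first page of the second part
    for path in paths:
        split_pdf(path, p)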
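
A minimal sketch of the page-index arithmetic behind a two-way split, assuming the intent is that every page ends up in exactly one of the two output files. `split_indices` is a hypothetical helper introduced here for illustration; it is not part of PyPDF2 and does not touch any PDF.

```python
def split_indices(page_count, p):
    """Return (first_part, second_part) as lists of zero-based page indices.

    Hypothetical helper: the first part holds pages 0..p-1,
    the second part holds pages p..page_count-1.
    """
    if not 0 < p < page_count:
        raise ValueError(
            f"split point {p} must be between 1 and {page_count - 1}"
        )
    return list(range(0, p)), list(range(p, page_count))


first, second = split_indices(8, 5)
print(first)   # [0, 1, 2, 3, 4]
print(second)  # [5, 6, 7]
```

Checking that the two lists together cover `range(page_count)` with no gaps or repeats is a cheap way to catch off-by-one mistakes before writing any files.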