@kkt-ee
Created January 1, 2020 17:06
Read a PDF file with PyPDF2
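# Open a PDF and extract the text of every page.
# Note: PdfFileReader, numPages, getPage and extractText belong to the
# pre-3.0 PyPDF2 API and are not available in PyPDF2 3.0 and later.
import PyPDF2 as pdf


def pdfRead(file):
    """Return a list holding the extracted text of each page."""
    # The context manager closes the file once all pages have been read.
    with open(file, 'rb') as pdfobj:
        pdfread = pdf.PdfFileReader(pdfobj)
        totpages = pdfread.numPages
        pagelist = []
        for page in range(totpages):
            pageobj = pdfread.getPage(page)
            pagelist.append(pageobj.extractText())
    return pagelist


x = pdfRead('/root/Downloads/_toSD/Deep_Learning_with_Python.pdf')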
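
For readers on PyPDF2 3.0 or later, the same page-by-page extraction can be sketched with the newer names (`PdfReader`, `reader.pages`, `extract_text()`). This is a minimal sketch, not the gist author's code: the helper names `extract_pages` and `pdf_read` are hypothetical, and splitting extraction from file handling is an assumption made so the extraction step can be tried on any reader-like object.

```python
from typing import List


def extract_pages(reader) -> List[str]:
    """Return the text of every page of a reader-like object.

    Assumes `reader.pages` is an iterable of objects that expose an
    `extract_text()` method, as PyPDF2 >= 3.0 readers do.
    """
    # Defensive: treat a falsy result (e.g. an image-only page) as "".
    return [page.extract_text() or "" for page in reader.pages]


def pdf_read(path: str) -> List[str]:
    """Open `path` and return one string per page (hypothetical helper)."""
    # Imported here so extract_pages() stays usable without PyPDF2 installed.
    from PyPDF2 import PdfReader

    # Text is extracted while the file is still open, then the file is closed.
    with open(path, "rb") as fh:
        return extract_pages(PdfReader(fh))
```

Calling `pdf_read(...)` with the path used above would take the place of `pdfRead(...)`. The quality of the extracted text still depends on how the PDF was produced.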