@lingpri
Last active February 13, 2021 18:33
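import PyPDF2

def convert_pdf_to_text(filename):
    """Return the text of every page in the PDF at `filename`, concatenated."""
    text = ""
    # Open in binary mode; the with-block closes the file even if parsing fails.
    with open(filename, 'rb') as pdfFileObj:
        # Legacy PyPDF2 names (PdfFileReader, numPages, getPage, extractText),
        # kept as in the original; PyPDF2 3.0 removed them.
        pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
        for count in range(pdfReader.numPages):
            text += pdfReader.getPage(count).extractText()
    return text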
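
The page loop can also be written against the interface that newer PyPDF2 releases (and the successor package pypdf) expose, where a reader has a `pages` sequence and each page has an `extract_text()` method. The sketch below is only an illustration under that assumption: `join_page_text` and its `separator` parameter are hypothetical names, not part of the gist or of any library. It accepts any iterable of page-like objects, so it can be tried without a PDF at hand.

```python
def join_page_text(pages, separator="\n"):
    """Concatenate the extracted text of each page-like object in `pages`.

    Hypothetical helper: each item only needs an extract_text() method.
    """
    parts = []
    for page in pages:
        # Fall back to "" if a page yields nothing (e.g. an image-only page),
        # so the join below never receives None.
        parts.append(page.extract_text() or "")
    return separator.join(parts)


# Usage sketch (assumes a newer PyPDF2, or pypdf, is installed):
#     from PyPDF2 import PdfReader
#     with open("example.pdf", "rb") as pdf_file:
#         text = join_page_text(PdfReader(pdf_file).pages)
```

Unlike the function above, this version puts a newline between pages by default; pass `separator=""` to get plain concatenation.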