@salvadorgascon
Created June 11, 2023 12:38
Extract Text from PDF file
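from PyPDF2 import PdfReader

# Open the PDF and print the text of every page
pdf_reader = PdfReader('archive.pdf')
num_pages_pdf = len(pdf_reader.pages)
# range(num_pages_pdf) covers indices 0..num_pages_pdf-1, so the last page is included
for x in range(num_pages_pdf):
    page_object = pdf_reader.pages[x]
    # extract_text() returns the page's text as a string
    text = page_object.extract_text()
    print(text)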
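
If the text is needed as a value rather than printed, the same loop can be wrapped in a small helper. The following is only a sketch: the function names `extract_pages` and `extract_text_from_pdf` are hypothetical and not part of the original gist, and the `or ""` guard is an assumption that a page might yield no text.

```python
def extract_pages(reader):
    """Return a list with the extracted text of every page in `reader`.

    `reader` is any object exposing a `.pages` sequence whose items have an
    `extract_text()` method, such as a PyPDF2 PdfReader.
    """
    # Iterating over reader.pages directly avoids off-by-one index errors.
    # `or ""` is a defensive assumption in case a page yields no text.
    return [page.extract_text() or "" for page in reader.pages]


def extract_text_from_pdf(path):
    """Hypothetical convenience wrapper: whole document as one string."""
    # Imported here so extract_pages() stays usable without PyPDF2 installed.
    from PyPDF2 import PdfReader

    return "\n".join(extract_pages(PdfReader(path)))
```

With PyPDF2 installed, `print(extract_text_from_pdf('archive.pdf'))` would print the text of all pages, in the same way as the script above.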