@SauloSilva
Created March 3, 2018 06:36
import pdftotext

# Load your PDF
with open("teste.pdf", "rb") as f:
    pdf = pdftotext.PDF(f)

# How many pages?
print(len(pdf))

# Iterate over all the pages
for page in pdf:
    print(page)

# Read all the text into one string
text = "\n\n".join(pdf)
print(text)