require "pdf/inspector" # gem install pdf-inspector
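# Read the PDF as raw bytes; binread avoids any encoding conversion.
pdf_data = File.binread("2016WayneCountyTaxLiens.pdf")
text_analysis = PDF::Inspector::Text.analyze(pdf_data)
# Concatenate every extracted string into a single plain-text dump.
File.write("dump.txt", text_analysis.strings.join)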
# For more, see https://github.com/prawnpdf/pdf-inspector
# and also https://github.com/yob/pdf-reader
#
# PDF::Reader can be used to build a streaming parser that tracks state as it walks the document, which may give a better dump
# (e.g. you could look for the natural breaks in the document by analyzing what is being drawn).
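
The streaming idea in the comment above could be sketched roughly as follows. This is only an illustration, not a tested recipe for this particular PDF. `LineBreakReceiver` and `dump_with_breaks` are hypothetical names introduced here. The sketch assumes the document positions its text with the `Td` / `T*` operators, which PDF::Reader reports to a receiver as `move_text_position` and `move_to_start_of_next_line` during `page.walk`. It also assumes the raw strings are readable without font decoding, which is not true of every PDF.

```ruby
# Hypothetical receiver for PDF::Reader's page.walk. It only needs to respond
# to the content-stream callbacks it cares about; all others are ignored.
class LineBreakReceiver
  attr_reader :lines

  def initialize
    @lines = [String.new]
  end

  # Tj operator: a single string is drawn.
  def show_text(string)
    @lines.last << string
  end

  # TJ operator: an array mixing strings and numeric kerning adjustments.
  def show_text_with_positioning(params)
    params.each { |param| @lines.last << param if param.is_a?(String) }
  end

  # Td operator: the text position moves. Assumption: treat it as a line break.
  def move_text_position(_x, _y)
    start_new_line
  end

  # T* operator: explicit move to the next line.
  def move_to_start_of_next_line
    start_new_line
  end

  # ET operator: the end of a text object is also treated as a break.
  def end_text_object
    start_new_line
  end

  private

  def start_new_line
    @lines << String.new unless @lines.last.empty?
  end
end

# Walk every page and write one output line per detected break.
def dump_with_breaks(pdf_path, out_path)
  require "pdf/reader" # gem install pdf-reader
  reader = PDF::Reader.new(pdf_path)
  File.open(out_path, "w") do |out|
    reader.pages.each do |page|
      receiver = LineBreakReceiver.new
      page.walk(receiver)
      out.puts(receiver.lines.reject(&:empty?))
    end
  end
end
```

Calling `dump_with_breaks("2016WayneCountyTaxLiens.pdf", "dump.txt")` would then write one line per text run instead of a single joined string. If the breaks land in the wrong places, the callbacks that trigger `start_new_line` are the part to adjust.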