This document describes several shell pipelines for converting PDF files to any format.
I'm not sure if it's true for everyone, but my e-reader sucks at displaying PDF, which is, in all reality, a giant executable file (we'll discuss this soon). Also, there are dozens of other reasons one may wish to convert a PDF to a better 'text format'. Let's say you wanna put it up on your website, feed it to a mathematical optimization model, feed it to a script, etc.
Before you read this document: yes, I know there is a utility, nay, dozens that convert PDFs directly to text (like pdftotext).
I ALSO know that there are millions, if not BILLIONS, of crappy web services that serve you malware on a platter alongside converting the files. So let's not talk about them! It's about "owning" your software, read this!
This is not meant to be a description or history of PDF files, you can consult Sahih Al-Bukhari f