- hummus - c++ pdf manipulator
- mimeograph - api on top of a conglomeration of tools (poppler, tesseract, imagemagick, etc.)
- pdftotextjs - wrapper around pdftotext
- pdf-text-extract - another wrapper around pdftotext
- pdf-extract - wrapper around pdftotext, pdftk, tesseract, ghostscript
- pdfutils - poppler wrapper
- scissors - pdftk, ghostscript wrapper w/ high level api
- textract - pdftotext wrapper
- pdfiijs - pdf to inverted index using textiijs and poppler
- pdf2json - pure js pdf to json

This is super useful, Max. Do you have any experience with what any of these are like, especially the pure JS one?
I'm really interested in finding a good Node lib for turning PDF into "machine-readable something", and would prefer pure JS (if possible).