@rufuspollock
Created July 7, 2013 17:37
Trying out pdf2json
var PDFParser = require("pdf2json");

var pdfParser = new PDFParser();

// Fired once the PDF has been parsed.
pdfParser.on("pdfParser_dataReady", function(data) {
  console.log('here');
  console.log(data);
  // First page of the parsed output.
  console.log(data.data.Pages[0]);
});

// Fired if the file cannot be read or parsed.
pdfParser.on("pdfParser_dataError", function(error) {
  console.error(error);
});

var pdfFilePath = "cache/apbn-2013.pdf";
pdfParser.loadPDF(pdfFilePath);
@max-mapper

Inside the pdfParser_dataReady handler, try:

  var text = data.data.Pages[0].Texts.map(function(t) {
    return t.R[0].T
  })
  console.log(text.join(' '))
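
A slightly fuller sketch of the same idea, in case it helps. It assumes each entry in `Texts` carries `x` and `y` coordinates and a list of runs in `R`, and that the run text in `T` is URI-encoded (hence `decodeURIComponent`). The helper name `pageText` and the sample object are made up for illustration; the sample is not taken from a real PDF, so the snippet runs without pdf2json installed.

```javascript
// Hypothetical helper: collect the text of one page from pdf2json-style output.
// Assumes each Texts entry has x, y and runs in R, with run text T URI-encoded.
function pageText(page) {
  return page.Texts
    .slice() // copy so the original array order is left untouched
    // Sort top to bottom, then left to right, for a rough reading order.
    .sort(function (a, b) { return a.y - b.y || a.x - b.x; })
    .map(function (t) {
      // Join every run of the text block, not only R[0].
      return t.R.map(function (run) {
        return decodeURIComponent(run.T);
      }).join('');
    })
    .join(' ');
}

// Made-up sample shaped like data.data.Pages[0]; values are illustrative only.
var samplePage = {
  Texts: [
    { x: 10, y: 2, R: [{ T: 'world' }] },
    { x: 1, y: 2, R: [{ T: 'Hello%20big' }] },
    { x: 1, y: 5, R: [{ T: 'Total%3A%2042' }] }
  ]
};

console.log(pageText(samplePage));
```

Inside the handler this would be called as `pageText(data.data.Pages[0])`. Sorting by `y` then `x` is only a rough approximation of reading order and will not handle multi-column layouts.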

@rufuspollock
Author

@maxogden - much appreciated!

@rajsekhar8099

Hello @maxogden @rufuspollock,
I want to convert a PDF to JSON. I tried online tools, but the results were poor: the converted JSON contains bulky PDF data that is not in any particular order. How can I convert a PDF to JSON so that the output keeps the same layout as the original document?
