@Riebart
Created September 19, 2019 17:19
Node.js script to convert Textract output into a PDF using PDFKit
const fs = require('fs');
let rawdata = fs.readFileSync('/working/out4.json');
let data = JSON.parse(rawdata);
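// Tune the dpi, page size, and font size until the text lays out properly for your input.
// Assumes a uniform font size in the input.
// Note: PDFKit reads page sizes and coordinates as points (72 per inch), so dpi acts as a scale factor here.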
let dpi = 200;
let width = 17 * dpi;
let height = 11 * dpi;
const PDFDocument = require('pdfkit');
const doc = new PDFDocument({
  size: [width, height]
});
doc.fontSize(14);
doc.pipe(fs.createWriteStream('/working/out4.pdf'));
console.log(data.Blocks.length);
data.Blocks.forEach((el) => {
  if (el.Text !== undefined) {
    doc.text(el.Text, width * el.Geometry.BoundingBox.Left, height * el.Geometry.BoundingBox.Top);
  }
});
doc.end();
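
The script above places every block that has a Text field. In Textract output, both LINE and WORD blocks carry Text, so each word can end up drawn twice. The sketch below is one possible way to keep only LINE blocks and turn their normalized bounding boxes into PDF points. `layoutLines`, `POINTS_PER_INCH`, and the sample data are hypothetical names that are not part of the original gist, and the fallback to page 1 when `Page` is missing is an assumption.

```javascript
// Hypothetical helper, not part of the original script.
// PDFKit measures page sizes and coordinates in points, 72 per inch.
const POINTS_PER_INCH = 72;

// Map Textract LINE blocks to absolute positions on a page of the given size.
function layoutLines(blocks, widthInches, heightInches) {
  const pageWidth = widthInches * POINTS_PER_INCH;
  const pageHeight = heightInches * POINTS_PER_INCH;
  return blocks
    // WORD blocks repeat the text of their parent LINE, so keep LINE blocks only.
    .filter((block) => block.BlockType === 'LINE' && block.Text !== undefined)
    .map((block) => ({
      page: block.Page || 1, // assumption: treat a missing Page as page 1
      text: block.Text,
      // BoundingBox values are ratios of the page width and height (0 to 1).
      x: pageWidth * block.Geometry.BoundingBox.Left,
      y: pageHeight * block.Geometry.BoundingBox.Top,
    }));
}

// Made-up sample blocks in the shape of Textract output.
const sample = [
  { BlockType: 'PAGE', Geometry: { BoundingBox: { Left: 0, Top: 0, Width: 1, Height: 1 } } },
  { BlockType: 'LINE', Text: 'Hello world', Geometry: { BoundingBox: { Left: 0.5, Top: 0.25, Width: 0.2, Height: 0.02 } } },
  { BlockType: 'WORD', Text: 'Hello', Geometry: { BoundingBox: { Left: 0.5, Top: 0.25, Width: 0.1, Height: 0.02 } } },
];

const placed = layoutLines(sample, 17, 11);
console.log(placed);
```

Each returned entry could then be passed to `doc.text(entry.text, entry.x, entry.y)`, calling `doc.addPage()` whenever `entry.page` increases. The font size would still need tuning by hand, as in the original script.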