@bmorrisondev
Created September 4, 2021 23:00
Ocr stuffz
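const { PDFImage } = require('pdf-image')
const { createWorker } = require('tesseract.js')

// Convert a PDF to a single combined image, then run English OCR on it.
// Written against the tesseract.js v2 worker API (synchronous createWorker,
// explicit load/loadLanguage/initialize); newer major versions differ.
async function ocrStuffz(fileName) {
  // combinedImage: true merges all pages into one image;
  // convertFile() resolves to the path of that image.
  const pdfImage = new PDFImage(fileName, { combinedImage: true })
  const convertedImage = await pdfImage.convertFile()

  const worker = createWorker({
    logger: m => console.log(m) // log OCR progress messages
  })
  try {
    await worker.load()
    await worker.loadLanguage('eng')
    await worker.initialize('eng')
    const { data: { text } } = await worker.recognize(convertedImage)
    return text
  } finally {
    // Always release the worker, even if recognition fails.
    await worker.terminate()
  }
}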