In OCR, everything seems to use Tesseract, which is a huge, complex library with lots of dependencies.
Here's an alternative toolchain that shows potential and is much quicker to use:
brew install poppler  # provides the pdfimages command
brew install gocr
brew install ocrad
pdfimages path_to_pdf.pdf /tmp/out
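
Once pdfimages has written its images (named /tmp/out-000.ppm, /tmp/out-001.pbm, and so on), each one can be handed to either OCR engine. The following is only a rough sketch: `ocr_pages` is a hypothetical helper name, not part of any of these tools. It assumes the engine prints recognized text to stdout when given an image path, which is how both `gocr FILE` and `ocrad FILE` behave by default, and that the extracted images are PPM or PBM files.

```shell
#!/bin/sh
# ocr_pages ENGINE PREFIX
#   ENGINE  an OCR command that takes an image path and prints text to stdout
#           (for example gocr or ocrad)
#   PREFIX  the image root that was passed to pdfimages (for example /tmp/out)
# Hypothetical helper for illustration; not shipped with gocr, ocrad or poppler.
ocr_pages() {
    engine=$1
    prefix=$2
    # Used only to label the output files, so a full path to the engine works too.
    name=$(basename "$engine")
    # pdfimages names its output PREFIX-NNN.ppm (color) or PREFIX-NNN.pbm (monochrome).
    for img in "$prefix"-*.ppm "$prefix"-*.pbm; do
        # Skip a glob pattern that matched no files.
        [ -e "$img" ] || continue
        # Save the recognized text next to the image, e.g. out-000.ppm.gocr.txt
        "$engine" "$img" > "$img.$name.txt"
    done
}
```

Running `ocr_pages gocr /tmp/out` followed by `ocr_pages ocrad /tmp/out` leaves one text file per image and engine, so the two results can be compared side by side with `diff` or by eye.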