@codesankalp
Created May 31, 2023 04:29
Dockerfile for building Tesseract on Amazon Linux 2
FROM docker.io/library/amazonlinux:latest
# Bring the base packages up to date (with yum, "upgrade" repeats what "update" already does).
RUN yum -y update
# Install the compiler and image-format headers, then build Leptonica 1.75.1 from source.
RUN yum install -y clang libpng-devel libtiff-devel zlib-devel libwebp-devel \
        libjpeg-turbo-devel wget tar gzip && \
    wget https://github.com/DanBloomberg/leptonica/releases/download/1.75.1/leptonica-1.75.1.tar.gz && \
    tar -zxvf leptonica-1.75.1.tar.gz && \
    cd leptonica-1.75.1 && \
    ./configure && \
    make && \
    make install
# Build Tesseract from the latest upstream source.
# PKG_CONFIG_PATH lets ./configure find the Leptonica installed under /usr/local.
RUN cd ~ && \
    yum install -y git-core libtool pkgconfig && \
    git clone --depth 1 https://github.com/tesseract-ocr/tesseract.git tesseract-ocr && \
    cd tesseract-ocr && \
    export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig && \
    ./autogen.sh && \
    ./configure && \
    make && \
    make install
# Download trained data: orientation/script detection (osd), English (eng), and Hindi (hin).
RUN cd /usr/local/share/tessdata && \
    wget https://github.com/tesseract-ocr/tessdata/raw/main/osd.traineddata && \
    wget https://github.com/tesseract-ocr/tessdata/raw/main/eng.traineddata && \
    wget https://github.com/tesseract-ocr/tessdata/raw/main/hin.traineddata
ENV TESSDATA_PREFIX=/usr/local/share/tessdata
@codesankalp

docker run -it codesankalp/amazon-linux-tesseract:latest
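
Once inside the container, it can help to confirm that the trained-data files the Dockerfile downloads are actually in place. The following is only a minimal sketch: `check_tessdata` is a hypothetical helper name, not part of Tesseract or of this gist, and it assumes the files live directly in the directory passed as the first argument.

```shell
# Hypothetical helper (not part of Tesseract): reports which
# <lang>.traineddata files are missing from a tessdata directory.
# Usage: check_tessdata DIR LANG [LANG...]
check_tessdata() {
    dir="$1"
    shift
    missing=0
    for lang in "$@"; do
        if [ ! -f "$dir/$lang.traineddata" ]; then
            echo "missing: $lang.traineddata"
            missing=1
        fi
    done
    # Exit status 0 when every file is present, 1 otherwise.
    return "$missing"
}
```

For the image built above, a call such as `check_tessdata "$TESSDATA_PREFIX" osd eng hin` should print nothing and return 0 if all three downloads succeeded.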
