@AvnerCohen
Created July 26, 2018 11:57
Install poppler and pdfparser on CentOS 6 / Amazon AMI
#!/bin/bash -e
set -x
# Poppler is used by the resume parser for faster PDF data extraction.
# Deployment is a bit manual and complicated because of the C libs involved,
# the fact that pip is not used (Anaconda rather than virtualenv), and that the
# CMake version needs to be updated as part of it.
CMAKE_VERSION='3.6.2'
POPPLER_VERSION="poppler-0.67.0"
TARGET_DIR='/opt/poppler'
if [ "$(id -u)" -ne 0 ]; then
echo 'This script must be run by root' >&2
exit 1
fi
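# -p keeps a re-run from aborting (the script runs with -e) if the directory already exists.
mkdir -p "$TARGET_DIR"
cd "$TARGET_DIR"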
echo "#### START CMAKE UPDATE ####"
yum remove cmake -y
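mkdir -p cmake
cd cmake/
# Derive the major.minor download directory (e.g. v3.6) from CMAKE_VERSION so the two stay in sync.
wget "https://cmake.org/files/v${CMAKE_VERSION%.*}/cmake-$CMAKE_VERSION.tar.gz"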
tar -zxvf cmake-$CMAKE_VERSION.tar.gz
cd cmake-$CMAKE_VERSION
./bootstrap --prefix=/usr/local
make
make install
echo "#### DONE CMAKE UPDATE ####"
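# Optional sanity check (an addition, not part of the original script):
# confirm that the cmake now on PATH is at least the version built above.
# version_ge is a hypothetical helper name; it assumes GNU sort with -V support.
version_ge() {
    # Succeeds when version $1 is greater than or equal to version $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
if command -v cmake >/dev/null 2>&1; then
    # The first line of `cmake --version` reads "cmake version X.Y.Z".
    INSTALLED_CMAKE="$(cmake --version | head -n 1 | awk '{print $3}')"
    if ! version_ge "$INSTALLED_CMAKE" "${CMAKE_VERSION:-3.6.2}"; then
        echo "cmake $INSTALLED_CMAKE is older than ${CMAKE_VERSION:-3.6.2}" >&2
        exit 1
    fi
else
    echo "Warning: cmake not found on PATH; is /usr/local/bin included?" >&2
fi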
cd $TARGET_DIR
echo "#### START DEPS INSTALL ####"
yum install -y openjpeg2-devel libjpeg-turbo-devel freetype-devel fontconfig-devel
echo "#### DONE DEPS INSTALL ####"
echo "#### START PDFPARSER CLONE ####"
git clone https://github.com/izderadicka/pdfparser.git
cd pdfparser
echo "#### END PDFPARSER CLONE ####"
echo "#### START POPPLER BUILD ####"
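# Check out the release named in POPPLER_VERSION (assumed to match an upstream tag) instead of the moving master branch.
git clone --branch "$POPPLER_VERSION" --depth 1 https://anongit.freedesktop.org/git/poppler/poppler.git poppler_src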
cd poppler_src
cmake -DENABLE_SPLASH=OFF -DENABLE_UTILS=OFF -DENABLE_LIBOPENJPEG=none .
make
echo "#### END POPPLER BUILD ####"
echo "#### START PDFPARSER BUILD ####"
cp libpoppler.so.?? ../pdfparser/
cp cpp/libpoppler-cpp.so.? ../pdfparser
cd $TARGET_DIR/pdfparser
pip install Cython --install-option="--no-cython-compile"
POPPLER_ROOT=poppler_src python setup.py install
echo "#### END PDFPARSER BUILD ####"
echo "#### END ####"
@look4regev
👍