Skip to content

Instantly share code, notes, and snippets.

@bskaggs
Last active May 7, 2024 07:24
Show Gist options
  • Save bskaggs/fc3c8d0d553be54e2645616236fdc8c6 to your computer and use it in GitHub Desktop.
Save bskaggs/fc3c8d0d553be54e2645616236fdc8c6 to your computer and use it in GitHub Desktop.
Install pyarrow on alpine in docker
FROM python:3.7-alpine3.8
RUN apk add --no-cache \
build-base \
cmake \
bash \
jemalloc-dev \
boost-dev \
autoconf \
zlib-dev \
flex \
bison
RUN pip install --no-cache-dir six pytest numpy cython
RUN pip install --no-cache-dir pandas
ARG ARROW_VERSION=0.12.0
ARG ARROW_SHA1=2ede75769e12df972f0acdfddd53ab15d11e0ac2
ARG ARROW_BUILD_TYPE=release
ENV ARROW_HOME=/usr/local \
PARQUET_HOME=/usr/local
#Download and build apache-arrow
RUN mkdir /arrow \
&& apk add --no-cache curl \
&& curl -o /tmp/apache-arrow.tar.gz -SL https://github.com/apache/arrow/archive/apache-arrow-${ARROW_VERSION}.tar.gz \
&& echo "$ARROW_SHA1 *apache-arrow.tar.gz" | sha1sum /tmp/apache-arrow.tar.gz \
&& tar -xvf /tmp/apache-arrow.tar.gz -C /arrow --strip-components 1 \
&& mkdir -p /arrow/cpp/build \
&& cd /arrow/cpp/build \
&& cmake -DCMAKE_BUILD_TYPE=$ARROW_BUILD_TYPE \
-DCMAKE_INSTALL_LIBDIR=lib \
-DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
-DARROW_PARQUET=on \
-DARROW_PYTHON=on \
-DARROW_PLASMA=on \
-DARROW_BUILD_TESTS=OFF \
.. \
&& make -j$(nproc) \
&& make install \
&& cd /arrow/python \
&& python setup.py build_ext --build-type=$ARROW_BUILD_TYPE --with-parquet \
&& python setup.py install \
&& rm -rf /arrow /tmp/apache-arrow.tar.gz
@notariuss
Copy link

notariuss commented Oct 19, 2022

FROM python:3.7.15-alpine3.16

RUN apk add --no-cache bash \
    postgresql-dev \
    gettext \
    gcc \
    musl-dev \
    make \
    cmake \
    g++ \
    git \
    boost-dev \
    flex \
    bison \
    zlib-dev \
    autoconf \
    build-base

WORKDIR /code

COPY . /code

RUN pip install --no-cache --upgrade pip wheel
RUN pip install --no-cache -r requirements.txt

RUN apk del git make cmake g++
      -- Could NOT find Arrow (missing: Arrow_DIR)
      -- Checking for module 'arrow'
      --   Package 'arrow', required by 'virtual:world', not found
      CMake Error at /usr/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
        Could NOT find Arrow (missing: ARROW_INCLUDE_DIR ARROW_LIB_DIR
        ARROW_FULL_SO_VERSION ARROW_SO_VERSION)
      Call Stack (most recent call first):
        /usr/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
        cmake_modules/FindArrow.cmake:419 (find_package_handle_standard_args)
        cmake_modules/FindArrowPython.cmake:46 (find_package)
        CMakeLists.txt:218 (find_package)

@arthurdevries
Copy link

I can't make this work either. If someone has knowledge about what the underlying problem is I will gladly put in some time and effort and try to make this work. Unfortunately, my knowledge about this thus far is quite limited. I would really like to work with the Alpine base image as it is a safe and small starting point. I am using python:3.11-alpine as a base.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment