@mdellabitta
Last active December 31, 2021 02:16
How to build hadoop native dylibs for macOS for Spark

This worked for me with the versions of Hadoop and Spark that I needed to use (Hadoop 3.2.2 and Spark 3.1.2). It will probably work for other versions as well, until macOS takes the next step further from the light...

  1. Install dependencies with homebrew:

    • snappy
    • zlib
    • zstd
    • bzip2
  2. Install SDKMAN.

  3. sdk install java 8.0.292.hs-adpt and make default

  4. sdk install spark 3.1.2 and make default

  5. sdk install hadoop 3.2.2 and make default

  6. sdk install maven

  7. Clone the Hadoop project (git@github.com:apache/hadoop.git) and check out branch-3.2.2

  8. cd into hadoop/hadoop-common-project/hadoop-common

  9. Add <ZLIB_LIBRARY>${zlib.lib}</ZLIB_LIBRARY> under the cmake-compile profile vars in pom.xml. This fixes a CMake problem; the zlib.lib property itself is set on the command line in the next step.

  10. Build with:

    mvn clean package -Pdist,native -DskipTests -Dmaven.javadoc.skip \
      -Dzlib.lib=/usr/local/opt/zlib/lib/libz.dylib \
      -Drequire.bzip2=false \
      -Drequire.snappy=true \
      -Drequire.zstd=true \
      -Dsnappy.prefix=/usr/local/opt/snappy/ \
      -Dsnappy.lib=/usr/local/opt/snappy/lib/libsnappy.1.dylib \
      -Dsnappy.include=/usr/local/opt/snappy/include/ \
      -Dzstd.prefix=/usr/local/opt/zstd/ \
      -Dzstd.lib=/usr/local/opt/zstd/lib/libzstd.1.dylib \
      -Dzstd.include=/usr/local/opt/zstd/include/
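The build command hardcodes /usr/local/opt, which is where Homebrew keeps its per-formula links on Intel Macs. On machines where Homebrew lives elsewhere (for example /opt/homebrew on Apple Silicon), the same flags could be derived from `brew --prefix`. The following is only a sketch: `native_flags` and `OPT_DIR` are names made up for this example, and I have not confirmed that this Hadoop version builds natively on Apple Silicon.

```shell
#!/bin/sh
# Sketch: derive the native-library flags from a Homebrew "opt" directory
# instead of hardcoding /usr/local/opt.
# native_flags and OPT_DIR are hypothetical names used only in this example.

# Print one -D flag per line for the given opt directory.
native_flags() {
  opt="${1%/}"   # strip a trailing slash, if any
  printf '%s\n' \
    "-Dzlib.lib=$opt/zlib/lib/libz.dylib" \
    "-Drequire.bzip2=false" \
    "-Drequire.snappy=true" \
    "-Drequire.zstd=true" \
    "-Dsnappy.prefix=$opt/snappy/" \
    "-Dsnappy.lib=$opt/snappy/lib/libsnappy.1.dylib" \
    "-Dsnappy.include=$opt/snappy/include/" \
    "-Dzstd.prefix=$opt/zstd/" \
    "-Dzstd.lib=$opt/zstd/lib/libzstd.1.dylib" \
    "-Dzstd.include=$opt/zstd/include/"
}

# Ask Homebrew for its prefix when brew is available; otherwise fall back
# to the location used in the command above.
if command -v brew >/dev/null 2>&1; then
  OPT_DIR="$(brew --prefix)/opt"
else
  OPT_DIR="/usr/local/opt"
fi

# Only print the command for review. The substitution is left unquoted on
# purpose so each flag becomes its own word (assumes no spaces in paths).
echo mvn clean package -Pdist,native -DskipTests -Dmaven.javadoc.skip \
  $(native_flags "$OPT_DIR")
```

Printing the command rather than running it makes it easy to compare against the hand-written version before committing to a long build.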

NOTE: With these flags, bzip2 isn't native and OpenSSL isn't included. If those are problems, you're probably stuck figuring out how to do this without System Integrity Protection because you're going to need to mess with library paths.
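For reference, steps 1 through 8 above could be scripted roughly as follows. This is a sketch, not something I ran end to end: `run`, `setup_steps`, and `DRY_RUN` are hypothetical helpers for this example, SDKMAN is assumed to be installed already with `sdk` loaded into the shell, and `sdk default` is used as one way to do the "make default" part. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of the preparation steps (1-8).
# DRY_RUN, run, and setup_steps are hypothetical names for this example.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

setup_steps() {
  # Step 1: native dependencies from Homebrew.
  run brew install snappy zlib zstd bzip2

  # Steps 3-6: SDKMAN candidates, each followed by "make default".
  # Assumes step 2 (installing SDKMAN) is done and `sdk` is available.
  run sdk install java 8.0.292.hs-adpt
  run sdk default java 8.0.292.hs-adpt
  run sdk install spark 3.1.2
  run sdk default spark 3.1.2
  run sdk install hadoop 3.2.2
  run sdk default hadoop 3.2.2
  run sdk install maven

  # Steps 7-8: fetch the Hadoop source and enter the hadoop-common module.
  run git clone git@github.com:apache/hadoop.git
  run git -C hadoop checkout branch-3.2.2
  run cd hadoop/hadoop-common-project/hadoop-common
}

setup_steps
```

Setting `DRY_RUN=0` before running would execute the commands instead of printing them; the pom.xml edit in step 9 still has to be done by hand.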

  11. Copy target/native/target/usr/local/lib/libhadoop.1.0.0.dylib to ~/.sdkman/candidates/hadoop/3.2.2/lib/native and symlink libhadoop.dylib to it in the same folder.

  12. Go to ~/.sdkman/candidates/java/current/bin/ and link libsnappy.1.dylib and libzstd.1.dylib from /usr/local/lib/ into it. You'll have to do this for every Java version you use.

  13. Run hadoop checknative -a and make sure there are no errors and that it says true for the codecs.

  14. I've set these env vars. Definitely only some of them do anything!

    export HADOOP_HOME=$HOME/.sdkman/candidates/hadoop/current
    export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
    export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
    export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native:$JAVA_LIBRARY_PATH
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native
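The copy-and-symlink step for libhadoop can be wrapped in a small helper so it is easy to repeat after a rebuild. This is a minimal sketch: `install_libhadoop` is a hypothetical name, and the paths in the example call are the ones from the steps above.

```shell
#!/bin/sh
# Sketch of the copy-and-symlink step for the freshly built library.
# install_libhadoop is a hypothetical helper name for this example.

# Usage: install_libhadoop <built-lib> <native-dir>
install_libhadoop() {
  src="$1"
  dest_dir="$2"
  # Fail early if the build did not produce the library.
  [ -f "$src" ] || { echo "missing build output: $src" >&2; return 1; }
  mkdir -p "$dest_dir"
  cp "$src" "$dest_dir/"
  # Relative link in the same folder, so it survives moving the directory.
  ( cd "$dest_dir" && ln -sf "$(basename "$src")" libhadoop.dylib )
}

# Example call, run from hadoop/hadoop-common-project/hadoop-common:
#   install_libhadoop \
#     target/native/target/usr/local/lib/libhadoop.1.0.0.dylib \
#     "$HOME/.sdkman/candidates/hadoop/3.2.2/lib/native"
#   hadoop checknative -a
```

The helper only handles libhadoop; linking libsnappy.1.dylib and libzstd.1.dylib into the Java bin directory still has to be done separately for each Java version.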