@dineshdharme
Last active November 17, 2023 05:03
Getting (BLAS, LAPACK, ARPACK) libraries to work in Amazon EMR's Amazon Linux 2

Our aim is to get the native (BLAS, LAPACK, ARPACK) libraries working on Amazon EMR's Amazon Linux 2.

TESTED On x86_64 architecture:

Amazon Linux release 2.0.20231020.1

STEPS WE WILL FOLLOW:

  1. First we compile the libraries (blas-3.0.3.jar, lapack-3.0.3.jar, arpack-3.0.3.jar) unmodified, to confirm that they build and work on our machine without any issue.
  2. Then we modify the libraries, placing marker lines at certain locations so that we can tell our compiled libraries are the ones in use when we include them in our project. After making those changes we compile the libraries again.
  3. Finally we structure our project so that our compiled libraries are the ones that get used.

STEP 0:

Run the bootstrap-actions-script.sh attached to this gist on the master node of the EMR cluster. The script installs all the development tools needed for the following steps. Always launch your EMR cluster with this bootstrap-actions-script.sh; otherwise, even after compiling, the job will complain with errors such as "Failed to load Native implementation".
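
Before moving on, it can help to confirm that the tools the bootstrap script is meant to install are actually on the PATH. The following is a minimal sketch; `missing_tools` is a hypothetical helper name, not part of the gist, and the tool list in the usage comment is an assumption based on the packages the script installs.

```shell
# Hypothetical helper (not part of the gist): print every command from the
# argument list that cannot be found on PATH, one name per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Usage on the master node (assumed tool list):
#   missing_tools gcc g++ gfortran git java javac
# No output means every listed tool was found.
```

If anything is printed, rerunning the bootstrap script is probably the simplest fix.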

STEP 1:

All the following steps are run from the hadoop home directory (/home/hadoop).

  1. Netlib Download Step

Download and untar the latest netlib release.

Download the source code (.zip or .tar.gz) from the releases page onto the EMR master machine:

https://github.com/luhenry/netlib/releases

Pick the latest release, as it will have bug fixes. This guide uses v3.0.3:

wget https://github.com/luhenry/netlib/archive/refs/tags/v3.0.3.tar.gz

Untar the library

tar -xvzf v3.0.3.tar.gz

  2. Maven Download Step

Download a compatible Maven release, untar it, and put its bin folder on the PATH. We use it to build the jars for our particular Amazon Linux 2 AMI.

wget https://dlcdn.apache.org/maven/maven-3/3.9.5/binaries/apache-maven-3.9.5-bin.tar.gz

tar -xvzf apache-maven-3.9.5-bin.tar.gz

Export these variables so that the newly downloaded Maven comes first on the PATH:

export M2_HOME=/home/hadoop/apache-maven-3.9.5
export PATH=${M2_HOME}/bin:${PATH}

Check that the correct Maven is picked up:

mvn -version

  3. Set Java 17 in the alternatives

Set the alternatives so that Java 17 is picked up. Run the two commands one at a time and make your choice at each prompt:

sudo alternatives --config java
sudo alternatives --config javac

For both commands, select the entry for java-17-amazon-corretto.x86_64:

i.e. /usr/lib/jvm/java-17-amazon-corretto.x86_64/bin/java

i.e. /usr/lib/jvm/java-17-amazon-corretto.x86_64/bin/javac
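
To double-check the selection afterwards, the major version can be read from the first line of `java -version`. This is only a sketch: `java_major_version` is a hypothetical helper, and it assumes the usual `version "..."` format that OpenJDK-based builds print.

```shell
# Hypothetical helper: extract the Java major version from the first line of
# `java -version` output, e.g.  openjdk version "17.0.9" 2023-10-17 LTS
java_major_version() {
  local line=$1 version
  # Pull out the quoted version string.
  version=$(printf '%s\n' "$line" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$version" in
    1.*) version=${version#1.} ;;   # legacy scheme: 1.8.0_392 -> 8.0_392
  esac
  # Keep everything before the first '.' or '_'.
  printf '%s\n' "${version%%[._]*}"
}

# Usage (java -version writes to stderr):
#   java_major_version "$(java -version 2>&1 | head -n 1)"
```

The same check can be applied to `javac -version` if its output uses the same quoted format on your machine; that is an assumption worth verifying.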

  4. Symlinks required by the Maven Ant build

Create the following symlinks. They point the compiler names used by the Maven Ant build at the system gcc:

sudo ln -s /usr/bin/gcc /usr/bin/x86_64-linux-gnu-gcc
sudo ln -s /usr/bin/gcc /usr/bin/aarch64-linux-gnu-gcc
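
Plain `ln -s` fails if the link already exists, which matters if these commands are ever rerun from a script. A minimal idempotent sketch follows; `link_if_missing` is a hypothetical helper and must run with enough privileges to write to the link's directory.

```shell
# Hypothetical helper: create the symlink only when nothing exists at the
# link path yet, and report what happened.
link_if_missing() {
  local target=$1 link=$2
  if [ -e "$link" ] || [ -L "$link" ]; then
    echo "exists: $link"
  else
    ln -s "$target" "$link" && echo "created: $link"
  fi
}

# Usage (as root, since /usr/bin is not writable by the hadoop user):
#   link_if_missing /usr/bin/gcc /usr/bin/x86_64-linux-gnu-gcc
#   link_if_missing /usr/bin/gcc /usr/bin/aarch64-linux-gnu-gcc
```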

  5. Modify Pom File

Modify pom.xml in the netlib package to bump the jacoco version to 0.8.8. The file is located at /home/hadoop/netlib-3.0.3/pom.xml

vim /home/hadoop/netlib-3.0.3/pom.xml

Make sure the jacoco section of the pom file looks like this:

    <groupId>org.jacoco</groupId>
    <artifactId>jacoco-maven-plugin</artifactId>
    <version>0.8.8</version>
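
If you prefer not to edit the file by hand, the same change could be scripted. The sketch below is an assumption-laden alternative, not the author's method: `bump_jacoco` is a hypothetical helper, it relies on GNU sed, and it assumes the `<version>` line sits directly below the jacoco `<artifactId>` line, as in the snippet above.

```shell
# Hypothetical helper: rewrite the <version> on the line immediately after the
# jacoco-maven-plugin <artifactId> line. Other <version> lines are untouched.
bump_jacoco() {
  local pom=$1 version=$2
  sed -i '/<artifactId>jacoco-maven-plugin<\/artifactId>/{n;s#<version>[^<]*</version>#<version>'"$version"'</version>#;}' "$pom"
}

# Usage:
#   bump_jacoco /home/hadoop/netlib-3.0.3/pom.xml 0.8.8
```

Open the file afterwards to confirm the result; if the layout differs from the assumption, the command changes nothing.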

  6. Install the Scala REPL and sbt

wget https://www.scala-lang.org/files/archive/scala-2.12.10.rpm
sudo rpm -i scala-2.12.10.rpm

Installing sbt

curl -L https://www.scala-sbt.org/sbt-rpm.repo > sbt-rpm.repo
sudo mv sbt-rpm.repo /etc/yum.repos.d/
sudo yum install sbt -y

  7. Go to the netlib-3.0.3 root folder and build

cd /home/hadoop/netlib-3.0.3
mvn -X clean package

  8. Get the sbt test project, which lets us verify on the master node itself whether we have compiled the libraries successfully.

Clone this sample project into the hadoop home directory:

https://github.com/dineshdharme/compile-netlib-emr

  9. Copy relevant files.

Copy the jni.so files and the jars to the appropriate locations in our test project for initial testing:

cp /home/hadoop/netlib-3.0.3/blas/target/native/Linux-amd64/libnetlibblasjni.so /home/hadoop/compile-netlib-emr/src/main/resources/native/Linux-amd64/
cp /home/hadoop/netlib-3.0.3/lapack/target/native/Linux-amd64/libnetliblapackjni.so /home/hadoop/compile-netlib-emr/src/main/resources/native/Linux-amd64/
cp /home/hadoop/netlib-3.0.3/arpack/target/native/Linux-amd64/libnetlibarpackjni.so /home/hadoop/compile-netlib-emr/src/main/resources/native/Linux-amd64/

cp /home/hadoop/netlib-3.0.3/blas/target/blas-3.0.3.jar /home/hadoop/compile-netlib-emr/staginglib/
cp /home/hadoop/netlib-3.0.3/lapack/target/lapack-3.0.3.jar /home/hadoop/compile-netlib-emr/staginglib/
cp /home/hadoop/netlib-3.0.3/arpack/target/arpack-3.0.3.jar /home/hadoop/compile-netlib-emr/staginglib/
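
The six copy commands follow one pattern per library, so they can be folded into a loop. This is a sketch only; `stage_artifacts` is a hypothetical helper whose paths mirror the `cp` commands above, with the jar version passed in as a parameter.

```shell
# Hypothetical helper: copy the three JNI libraries and the three jars from a
# netlib build tree into the test project. Stops at the first failed copy.
stage_artifacts() {
  local netlib_dir=$1 project_dir=$2 version=$3 lib
  local native_dir="$project_dir/src/main/resources/native/Linux-amd64"
  mkdir -p "$native_dir" "$project_dir/staginglib" || return 1
  for lib in blas lapack arpack; do
    cp "$netlib_dir/$lib/target/native/Linux-amd64/libnetlib${lib}jni.so" "$native_dir/" || return 1
    cp "$netlib_dir/$lib/target/$lib-$version.jar" "$project_dir/staginglib/" || return 1
  done
}

# Usage:
#   stage_artifacts /home/hadoop/netlib-3.0.3 /home/hadoop/compile-netlib-emr 3.0.3
```

Taking the version as an argument means the same helper also covers a later rebuild under a different version number.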

  10. Change build.sbt in the test project so that the version of the three jars is 3.0.3.

  11. Now run this command from the compile-netlib-emr project root:

sbt assembly && /usr/bin/spark-submit \
--class org.test.netlib.EntryPoint \
target/scala-2.12/compile-netlib-emr-assembly-0.1.0.jar

If you get output like the lines below, the native libraries are being loaded:

BLAS dev.ludovic.netlib.blas.NativeBLAS: dev.ludovic.netlib.blas.JNIBLAS@7ac2e39b
LAPACK dev.ludovic.netlib.blas.NativeBLAS: dev.ludovic.netlib.lapack.JNILAPACK@63fdab07
ARPACK dev.ludovic.netlib.blas.NativeBLAS: dev.ludovic.netlib.arpack.JNIARPACK@5c530d1e
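
When the job output is long, scanning for these three lines by eye is error-prone. A small filter can do the check instead. This is a sketch: `all_native_loaded` is a hypothetical helper, and it assumes the output format shown above, where each line ends in a `JNI...@hash` class instance.

```shell
# Hypothetical helper: read program output on stdin and succeed only if
# JNI-backed BLAS, LAPACK and ARPACK instances all appear in it.
all_native_loaded() {
  local log cls
  log=$(cat)
  for cls in JNIBLAS JNILAPACK JNIARPACK; do
    printf '%s\n' "$log" | grep -q "dev\.ludovic\.netlib\..*\.$cls@" || return 1
  done
}

# Usage (stderr is merged in, in case the lines end up there):
#   /usr/bin/spark-submit --class org.test.netlib.EntryPoint \
#     target/scala-2.12/compile-netlib-emr-assembly-0.1.0.jar 2>&1 \
#     | all_native_loaded && echo "native libraries loaded"
```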

STEP 2:

MODIFYING INTERNAL FILES OF netlib-3.0.3 PROJECT SO THAT WE ARE ABLE TO INCLUDE THE FINAL GENERATED ARTIFACTS IN OUR UBER JAR.

Make sure the files look like this in the relevant sections. Use Vim (or any other editor) to change the files.

vim /home/hadoop/netlib-3.0.3/pom.xml

  <groupId>dev.ludovic.netlib</groupId>
  <artifactId>parent</artifactId>
  <version>6.6.6</version>
  <packaging>pom</packaging>

vim /home/hadoop/netlib-3.0.3/blas/pom.xml

  <parent>
    <groupId>dev.ludovic.netlib</groupId>
    <artifactId>parent</artifactId>
    <version>6.6.6</version>
    <relativePath>../pom.xml</relativePath>
  </parent>

  <groupId>dev.ludovic.netlib</groupId>
  <artifactId>blas</artifactId>
  <version>6.6.6</version>
  <packaging>jar</packaging>

vim /home/hadoop/netlib-3.0.3/lapack/pom.xml

  <parent>
    <groupId>dev.ludovic.netlib</groupId>
    <artifactId>parent</artifactId>
    <version>6.6.6</version>
    <relativePath>../pom.xml</relativePath>
  </parent>

  <groupId>dev.ludovic.netlib</groupId>
  <artifactId>lapack</artifactId>
  <version>6.6.6</version>
  <packaging>jar</packaging>

vim /home/hadoop/netlib-3.0.3/arpack/pom.xml

  <parent>
    <groupId>dev.ludovic.netlib</groupId>
    <artifactId>parent</artifactId>
    <version>6.6.6</version>
    <relativePath>../pom.xml</relativePath>
  </parent>

  <groupId>dev.ludovic.netlib</groupId>
  <artifactId>arpack</artifactId>
  <version>6.6.6</version>
  <packaging>jar</packaging>
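
With four pom files edited by hand, a quick grep can confirm that each module pom names its own artifact and carries the new version. The sketch below uses a hypothetical helper, `pom_declares`; it only checks that both tags occur somewhere in the file, not that they sit in the right element.

```shell
# Hypothetical helper: succeed if the pom contains both the given
# <artifactId> and the given <version> as literal strings.
pom_declares() {
  local pom=$1 artifact=$2 version=$3
  grep -qF "<artifactId>$artifact</artifactId>" "$pom" &&
    grep -qF "<version>$version</version>" "$pom"
}

# Usage:
#   for m in blas lapack arpack; do
#     pom_declares "/home/hadoop/netlib-3.0.3/$m/pom.xml" "$m" 6.6.6 \
#       || echo "check $m/pom.xml"
#   done
```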

IN THE FOLLOWING JAVA FILE CHANGES, KEEP THREE THINGS IN MIND.

BELOW IS THE MARKER LINE THAT TELLS US IT IS OUR COMPILED JAR THAT IS BEING USED.

System.out.println("@@@@@@@@@@@@@>>>>WE HAVE MADE IT ALL THE WAY<<<<<@@@@@@@@@@@@@");

THE .getClassLoader() CALL HAS BEEN REMOVED.

try (InputStream resource = this.getClass().getResourceAsStream(

THE RESOURCE PATH IS CHANGED TO THIS NEW PATH (SHOWN HERE FOR BLAS).

String.format("/native/%s-%s/libnetlibblasjni.so", osName, osArch))) {

vim /home/hadoop/netlib-3.0.3/blas/src/main/java/dev/ludovic/netlib/blas/JNIBLAS.java

System.out.println("@@@@@@@@@@@@@>>>>WE HAVE MADE IT ALL THE WAY<<<<<@@@@@@@@@@@@@");
Path temp;
try (InputStream resource = this.getClass().getResourceAsStream(
        String.format("/native/%s-%s/libnetlibblasjni.so", osName, osArch))) {
  assert resource != null;
  Files.copy(resource, temp = Files.createTempFile("libnetlibblasjni.so", "",
                                PosixFilePermissions.asFileAttribute(PosixFilePermissions.fromString("rwxr-x---"))),
              StandardCopyOption.REPLACE_EXISTING);
  temp.toFile().deleteOnExit();
} catch (IOException e) {
  throw new RuntimeException("Unable to load native implementation", e);
}

vim /home/hadoop/netlib-3.0.3/lapack/src/main/java/dev/ludovic/netlib/lapack/JNILAPACK.java

System.out.println("@@@@@@@@@@@@@>>>>WE HAVE MADE IT ALL THE WAY<<<<<@@@@@@@@@@@@@");
Path temp;
try (InputStream resource = this.getClass().getResourceAsStream(
        String.format("/native/%s-%s/libnetliblapackjni.so", osName, osArch))) {
  assert resource != null;
  Files.copy(resource, temp = Files.createTempFile("libnetliblapackjni.so", "",
                                PosixFilePermissions.asFileAttribute(PosixFilePermissions.fromString("rwxr-x---"))),
              StandardCopyOption.REPLACE_EXISTING);
  temp.toFile().deleteOnExit();
} catch (IOException e) {
  throw new RuntimeException("Unable to load native implementation", e);
}

vim /home/hadoop/netlib-3.0.3/arpack/src/main/java/dev/ludovic/netlib/arpack/JNIARPACK.java

System.out.println("@@@@@@@@@@@@@>>>>WE HAVE MADE IT ALL THE WAY<<<<<@@@@@@@@@@@@@");
Path temp;
try (InputStream resource = this.getClass().getResourceAsStream(
        String.format("/native/%s-%s/libnetlibarpackjni.so", osName, osArch))) {
  assert resource != null;
  Files.copy(resource, temp = Files.createTempFile("libnetlibarpackjni.so", "",
                                PosixFilePermissions.asFileAttribute(PosixFilePermissions.fromString("rwxr-x---"))),
              StandardCopyOption.REPLACE_EXISTING);
  temp.toFile().deleteOnExit();
} catch (IOException e) {
  throw new RuntimeException("Unable to load native implementation", e);
}

Skip the tests, since they are the ones that fail now that we load the jni.so files in a different way.

https://stackoverflow.com/questions/6612344/prevent-unit-tests-but-allow-integration-tests-in-maven

To skip the tests, declare properties and use them.

Make changes to the following pom file:

vim /home/hadoop/netlib-3.0.3/pom.xml

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <javac.target>8</javac.target>
    <jvm.modules></jvm.modules>
    <argLine></argLine>
    <automatic.module.name>dev.ludovic.netlib</automatic.module.name>
    <skipTests>true</skipTests>
    <skipITs>${skipTests}</skipITs>
    <skipUTs>${skipTests}</skipUTs>
  </properties>


      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-surefire-plugin</artifactId>
        <configuration>
          <skipTests>${skipUTs}</skipTests>
        </configuration>
      </plugin>

Then run:

mvn -X clean package -DskipTests=true

The first mvn package build was to test that the jar and jni.so files are built properly. This second mvn package build, after the changes, produces artifacts that we can test locally and include in our uber jar. Note that the version of these new jars is 6.6.6.

Copy the jni.so files and the jars to the appropriate locations in our test project for final testing:

cp /home/hadoop/netlib-3.0.3/blas/target/native/Linux-amd64/libnetlibblasjni.so /home/hadoop/compile-netlib-emr/src/main/resources/native/Linux-amd64/
cp /home/hadoop/netlib-3.0.3/lapack/target/native/Linux-amd64/libnetliblapackjni.so /home/hadoop/compile-netlib-emr/src/main/resources/native/Linux-amd64/
cp /home/hadoop/netlib-3.0.3/arpack/target/native/Linux-amd64/libnetlibarpackjni.so /home/hadoop/compile-netlib-emr/src/main/resources/native/Linux-amd64/

cp /home/hadoop/netlib-3.0.3/blas/target/blas-6.6.6.jar /home/hadoop/compile-netlib-emr/staginglib/
cp /home/hadoop/netlib-3.0.3/lapack/target/lapack-6.6.6.jar /home/hadoop/compile-netlib-emr/staginglib/
cp /home/hadoop/netlib-3.0.3/arpack/target/arpack-6.6.6.jar /home/hadoop/compile-netlib-emr/staginglib/

Test these newly copied jars. If you changed the jar versions in build.sbt to 3.0.3 earlier, set them to 6.6.6 first.

sbt assembly && /usr/bin/spark-submit \
--class org.test.netlib.EntryPoint \
target/scala-2.12/compile-netlib-emr-assembly-0.1.0.jar

It will print out lines like.

Behind the flag = true
@@@@@@@@@@@@@>>>>WE HAVE MADE IT ALL THE WAY<<<<<@@@@@@@@@@@@@
BLAS dev.ludovic.netlib.blas.NativeBLAS: dev.ludovic.netlib.blas.JNIBLAS@7ac2e39b
@@@@@@@@@@@@@>>>>WE HAVE MADE IT ALL THE WAY<<<<<@@@@@@@@@@@@@
LAPACK dev.ludovic.netlib.blas.NativeBLAS: dev.ludovic.netlib.lapack.JNILAPACK@63fdab07
@@@@@@@@@@@@@>>>>WE HAVE MADE IT ALL THE WAY<<<<<@@@@@@@@@@@@@
ARPACK dev.ludovic.netlib.blas.NativeBLAS: dev.ludovic.netlib.arpack.JNIARPACK@5c530d1e

If you get the output above, our compiled jars with the marker lines are being used.

STEP 3 :

How should you structure your project so that our compiled jars work properly?

If you don't want to repeat these steps, you can download the jars and jni.so files I have compiled.

Download from here : https://github.com/dineshdharme/compile-netlib-emr/blob/master/netlib-jni-dependency/emr-netlib-compiled-artifacts-v5.0.zip

After unzipping the file, the structure will look like this:


emr-netlib-compiled-artifacts/
|-- jars
|   |-- arpack-6.6.6.jar
|   |-- blas-6.6.6.jar
|   |-- lapack-6.6.6.jar
|
|-- libs
    |-- libnetlibarpackjni.so
    |-- libnetlibblasjni.so
    |-- libnetliblapackjni.so
    

Copy the jni.so files to the resources directory of your uber jar project. It should look like this:

resources/
└── native
    └── Linux-amd64
        ├── can_access_this.txt
        ├── libnetlibarpackjni.so
        ├── libnetlibblasjni.so
        └── libnetliblapackjni.so
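
Files under the resources directory end up at the root of the assembled jar, which is where the modified loader code looks for `/native/<os>-<arch>/...`. Once the uber jar is built, its listing can be checked for the three libraries. This is a sketch: `missing_native_entries` is a hypothetical helper that reads a listing such as the one `jar tf` prints, one entry per line.

```shell
# Hypothetical helper: read a jar listing on stdin and print each expected
# native library entry that is absent. No output means all three are present.
missing_native_entries() {
  local listing lib entry
  listing=$(cat)
  for lib in blas lapack arpack; do
    entry="native/Linux-amd64/libnetlib${lib}jni.so"
    printf '%s\n' "$listing" | grep -qxF "$entry" || echo "$entry"
  done
}

# Usage (jar name taken from the test project; yours will differ):
#   jar tf target/scala-2.12/compile-netlib-emr-assembly-0.1.0.jar | missing_native_entries
```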

Copy the jars to a staginglib directory in your uber jar project:

staginglib/
├── arpack-6.6.6.jar
├── blas-6.6.6.jar
└── lapack-6.6.6.jar

Add the following lines to the build.sbt of your project:

lazy val compiledJarsDirectory = settingKey[File]("The directory for compiled jars")
lazy val blasJar = settingKey[File]("The compiled blas jar")
lazy val arpackJar = settingKey[File]("The compiled arpack jar")
lazy val lapackJar = settingKey[File]("The compiled lapack jar")

ThisBuild / compiledJarsDirectory := baseDirectory.value / "staginglib"
ThisBuild / blasJar  := (ThisBuild / compiledJarsDirectory ).value / "blas-6.6.6.jar"
ThisBuild / arpackJar  := (ThisBuild / compiledJarsDirectory ).value / "arpack-6.6.6.jar"
ThisBuild / lapackJar  := (ThisBuild / compiledJarsDirectory ).value / "lapack-6.6.6.jar"


libraryDependencies ++= Seq(
  "dev.ludovic.netlib" % "blas" % "6.6.6" from blasJar.value.toURI.toString,
  "dev.ludovic.netlib" % "arpack" % "6.6.6" from arpackJar.value.toURI.toString,
  "dev.ludovic.netlib" % "lapack" % "6.6.6" from lapackJar.value.toURI.toString,
  "org.scalanlp" %% "breeze" % "2.1.0",
  "org.scalanlp" %% "breeze-natives" % "2.1.0"
)

Compile your jar and run it on EMR. Check the logs to confirm that our marker lines are being printed:

@@@@@@@@@@@@@>>>>WE HAVE MADE IT ALL THE WAY<<<<<@@@@@@@@@@@@@
BLAS dev.ludovic.netlib.blas.NativeBLAS: dev.ludovic.netlib.blas.JNIBLAS@7ac2e39b

IMPORTANT NOTE : Make sure the bootstrap-actions-script.sh has been run before running the project. The script moves away the offending jars that interfere with the loading of our jars: Spark's internal classpath lists Spark's own jars ahead of the jars on the user's classpath.

IMPORTANT NOTE 2:

Make sure to submit your uber jar with these spark --conf settings, as detailed in the readme here:

https://github.com/luhenry/netlib/

--conf "spark.executor.extraJavaOptions=-Ddev.ludovic.netlib.blas.nativeLibPath=/opt/openblas/lib/libopenblas_threaded.so.0 -Ddev.ludovic.netlib.lapack.nativeLibPath=/opt/openblas/lib/libopenblas_threaded.so.0" \
--conf "spark.driver.extraJavaOptions=-Ddev.ludovic.netlib.blas.nativeLibPath=/opt/openblas/lib/libopenblas_threaded.so.0 -Ddev.ludovic.netlib.lapack.nativeLibPath=/opt/openblas/lib/libopenblas_threaded.so.0" \
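
Both settings repeat the same pair of `-D` options, so building the string once avoids typos. This is a sketch under assumptions: `netlib_java_opts` is a hypothetical helper, and the class and jar names in the usage comment are the ones from the test project, not necessarily yours.

```shell
# Hypothetical helper: print the two netlib -D options for one native
# library path, separated by a single space.
netlib_java_opts() {
  local lib=$1
  printf '%s %s\n' \
    "-Ddev.ludovic.netlib.blas.nativeLibPath=$lib" \
    "-Ddev.ludovic.netlib.lapack.nativeLibPath=$lib"
}

# Usage sketch:
#   OPTS=$(netlib_java_opts /opt/openblas/lib/libopenblas_threaded.so.0)
#   /usr/bin/spark-submit \
#     --conf "spark.executor.extraJavaOptions=$OPTS" \
#     --conf "spark.driver.extraJavaOptions=$OPTS" \
#     --class org.test.netlib.EntryPoint \
#     target/scala-2.12/compile-netlib-emr-assembly-0.1.0.jar
```
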

bootstrap-actions-script.sh

#!/bin/bash
set -e
# Any subsequent command that fails will cause the shell script to exit immediately
# OS : Amazon Linux 2 version
# Amazon Linux release
# 2.0.20231020.1
################### IMPORTANT LINKS ####################
## Instructions copied from this location.
#### https://github.com/bgeneto/build-install-compile-openblas#32-build-and-install-openblas-nonthreaded-version
sudo amazon-linux-extras install epel -y
sudo yum update -y
sudo yum groupinstall 'Development Tools' -y
sudo yum install gcc-c++ gcc-gfortran git -y
sudo yum install arpack openblas openblas-devel -y
sudo yum install java-17-amazon-corretto.x86_64 java-17-amazon-corretto-devel.x86_64 -y
## To REMOVE culprit jars so that our JARS are used instead.
#### VERY IMPORTANT POINT : WE have to move BOTH (BLAS,LAPACK,ARPACK) and (breeze, breeze-macros) group of jars.
move_jars() {
  if [ "$1" = true ]; then
    echo "Moving jars..."
    mkdir -p ~/spark-culprit-jars
    sudo mv /usr/lib/spark/jars/breeze-macros_2.12-2.1.0.jar ~/spark-culprit-jars
    sudo mv /usr/lib/spark/jars/breeze_2.12-2.1.0.jar ~/spark-culprit-jars
    sudo mv /usr/lib/spark/jars/blas-3.0.3.jar ~/spark-culprit-jars
    sudo mv /usr/lib/spark/jars/lapack-3.0.3.jar ~/spark-culprit-jars
    sudo mv /usr/lib/spark/jars/arpack-3.0.3.jar ~/spark-culprit-jars
  else
    echo "Skipping jar movement."
  fi
}
## To test the speedup, set this argument to false: the jars then stay in place and Spark's internal implementation is used
move_jars true
compile_openblas_threaded_version() {
  if [ "$1" = true ]; then
    echo "Compiling OpenBlas Threaded version ..."
    ## Instructions copied from this location.
    #### https://github.com/bgeneto/build-install-compile-openblas#32-build-and-install-openblas-nonthreaded-version
    OPENBLAS_DIR=/opt/openblas
    # -p so that an existing directory does not abort the script under `set -e`
    sudo mkdir -p $OPENBLAS_DIR
    cd $HOME
    git clone https://github.com/xianyi/OpenBLAS
    cd $HOME/OpenBLAS
    export USE_THREAD=1
    export NUM_THREADS=64
    export DYNAMIC_ARCH=0
    export NO_WARMUP=1
    export BUILD_RELAPACK=0
    export COMMON_OPT="-O2 -march=native"
    export CFLAGS="-O2 -march=native"
    export FCOMMON_OPT="-O2 -march=native"
    export FCFLAGS="-O2 -march=native"
    make -j DYNAMIC_ARCH=0 CC=gcc FC=gfortran HOSTCC=gcc BINARY=64 INTERFACE=64 LIBNAMESUFFIX=threaded
    sudo make PREFIX=$OPENBLAS_DIR LIBNAMESUFFIX=threaded install
    make -j lapack-test
    cd ./lapack-netlib; python3 ./lapack_testing.py -r -b TESTING
    cd $HOME/OpenBLAS
    # No PREFIX here, so this second install goes to OpenBLAS's default prefix, /opt/OpenBLAS,
    # which is the directory the exports below refer to.
    sudo make install LIBNAMESUFFIX=threaded
    export C_INCLUDE_PATH=$C_INCLUDE_PATH:/opt/OpenBLAS/include
    export CPATH=$CPATH:/opt/OpenBLAS/include
    export LIBRARY_PATH=$LIBRARY_PATH:/opt/OpenBLAS/lib
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/OpenBLAS/lib
  else
    echo "Skipping compiling of openblas threaded version"
  fi
}
## This function compiles threaded version of OpenBlas. IT IS FASTER.
compile_openblas_threaded_version true
# Install mkl which can be used later for BLAS and SO ON.
sudo yum-config-manager --add-repo https://yum.repos.intel.com/mkl/setup/intel-mkl.repo -y
sudo rpm --import https://yum.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB
sudo yum update -y
## Install 2019.5 version if we are using SystemDS
#### https://systemds.apache.org/
#### https://apache.github.io/systemds/site/builtins-reference.html#lm-function
###### https://apache.github.io/systemds/site/run
######## To use the MKL acceleration download and install the latest supported MKL library (<=2019.5) from
# sudo yum install intel-mkl-2019.5-075.x86_64 -y
## If we are just using MKL's BLAS, LAPACK and ARPACK implementation, then using the latest version is always the best option.
### sudo yum install intel-mkl -y