@nithu0115
Last active February 14, 2019 23:37
Spark with EKS

As of February 14, 2019, Spark does not integrate with EKS out of the box: the Spark binary needs to call aws-iam-authenticator to fetch credentials for authenticating to the EKS cluster, and that support is only due in the kubernetes-client 4.1.2 release. Ref: fabric8io/kubernetes-client#1358

To make the Spark binary work with EKS, we have to do a custom build against the fabric8 kubernetes-client 4.1-SNAPSHOT version.

Prerequisites:

Java 8

Apache Maven 3.x or later
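
Before starting, the prerequisites can be checked from a terminal. The following is a minimal sketch, not part of the original instructions: the helper name `version_ge` is hypothetical, and it assumes a `sort` that supports `-V` (version sort, as in GNU coreutils). It only prints what it finds and does not change anything.

```shell
# Hypothetical helper: succeed when version $1 is >= version $2.
# Assumes `sort -V` (version sort) is available, e.g. GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Print the first line of `java -version` if Java is on PATH.
if command -v java >/dev/null 2>&1; then
  java -version 2>&1 | head -n 1
else
  echo "java not found on PATH"
fi

# Extract the Maven version and compare it against 3.0.0.
if command -v mvn >/dev/null 2>&1; then
  mvn_version="$(mvn -version 2>/dev/null | awk '/^Apache Maven/ {print $3; exit}')"
  if [ -n "$mvn_version" ] && version_ge "$mvn_version" "3.0.0"; then
    echo "Maven $mvn_version is recent enough"
  else
    echo "could not confirm Maven >= 3.0.0 (found: '$mvn_version')"
  fi
else
  echo "mvn not found on PATH"
fi
```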

Installation:

1) git clone https://github.com/fabric8io/kubernetes-client.git
2) cd kubernetes-client/
3) mvn clean install -DskipTests (this installs the snapshot build into the local Maven repository; Maven Central only has fabric8io/kubernetes-client version 4.1.1, which has an issue with the EKS aws-iam-authenticator)
4) git clone https://github.com/apache/spark.git
5) cd spark
6) Open resource-managers/kubernetes/core/pom.xml (for example with vi), change kubernetes.client.version from 4.1.0 to 4.1-SNAPSHOT, and save
7) Now build a new binary from the above source:
./dev/make-distribution.sh --name custom-spark --tgz -Phadoop-2.7 -Pkubernetes
8) This creates a new tgz file: spark-3.0.0-SNAPSHOT-bin-custom-spark.tgz
9) Untar: tar -xzvf spark-3.0.0-SNAPSHOT-bin-custom-spark.tgz
10) export MASTER=<EKS server endpoint>
11) Test by running this command:
./spark-3.0.0-SNAPSHOT-bin-custom-spark/bin/spark-submit --master k8s://${MASTER} --deploy-mode cluster --class SimpleApp --name SimpleApp --conf spark.executor.instances=16 --conf spark.kubernetes.container.image=jamesrcounts/hello-kubernetes:latest local:///opt/spark/jars/hello-kubernetes_2.11-0.1.jar
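
Steps 6 and 10 can also be done without an editor or copy-pasting from the console. The following is a rough sketch under a few assumptions: the function names `set_k8s_client_version` and `eks_endpoint` are hypothetical, the version is assumed to be stored as a `<kubernetes.client.version>` property element in the pom, and the endpoint lookup assumes the AWS CLI is installed and configured for the account that owns the cluster.

```shell
# Hypothetical helper for step 6: set the kubernetes.client.version property
# in the given pom.xml. Assumes the property is written as
# <kubernetes.client.version>...</kubernetes.client.version>.
# A backup of the original file is kept next to it with a .bak suffix.
set_k8s_client_version() {
  pom="$1"
  new_version="$2"
  sed -i.bak -E \
    "s#(<kubernetes\.client\.version>)[^<]*(</kubernetes\.client\.version>)#\1${new_version}\2#" \
    "$pom"
}

# Hypothetical helper for step 10: print the API server endpoint of the
# EKS cluster named $1. Assumes a configured AWS CLI.
eks_endpoint() {
  aws eks describe-cluster --name "$1" --query 'cluster.endpoint' --output text
}

# Example usage (my-cluster is a placeholder cluster name):
#   set_k8s_client_version resource-managers/kubernetes/core/pom.xml 4.1-SNAPSHOT
#   export MASTER="$(eks_endpoint my-cluster)"
```

Running `git diff resource-managers/kubernetes/core/pom.xml` afterwards is a quick way to confirm that only the version line changed before starting the build in step 7.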