############### 13 Build Spark images and push them to Google Container Registry ######################
# deploy spark images
# create the base spark image
# this will be our golden image
cd $HOME/Downloads/spark_dir/spark-3.0.1-bin-hadoop2.7/
./bin/docker-image-tool.sh -r gcr.io/$PROJECT_ID -t v3.0.1 build
./bin/docker-image-tool.sh -r gcr.io/$PROJECT_ID -t v3.0.1 push
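# Optional sanity check (not in the original steps; assumes the gcloud CLI is
# installed and authenticated for $PROJECT_ID): confirm the base image landed
# in Container Registry before building on top of it.
gcloud container images list-tags gcr.io/$PROJECT_ID/spark --limit=5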
# create a Spark image with Google Cloud Storage Connector
mkdir -p $HOME/Downloads/spark_img/gcs_jars
cd $HOME/Downloads/spark_img/gcs_jars
curl -fLO https://storage.googleapis.com/hadoop-lib/gcs/gcs-connector-hadoop2-latest.jar
curl -fLO https://repo1.maven.org/maven2/com/google/guava/guava/29.0-jre/guava-29.0-jre.jar
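# Optional check (an assumption, not part of the original steps): verify both
# jars downloaded completely before baking them into the image.
ls -lh gcs-connector-hadoop2-latest.jar guava-29.0-jre.jar
unzip -tq gcs-connector-hadoop2-latest.jar && unzip -tq guava-29.0-jre.jar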
cd $HOME/Downloads/spark_img
cat > Dockerfile << EOF
FROM gcr.io/$PROJECT_ID/spark:v3.0.1
# copy the connector and its Guava dependency into the image's jar directory;
# /opt/spark/jars is written out because the unquoted heredoc would expand
# $SPARK_HOME in the local shell, not inside the image
ADD gcs_jars/guava-29.0-jre.jar /opt/spark/jars/
ADD gcs_jars/gcs-connector-hadoop2-latest.jar /opt/spark/jars/
# list the work dir during the build as a quick sanity check
RUN ls -ltrah /opt/spark/work-dir
ENTRYPOINT [ "/opt/entrypoint.sh" ]
EOF
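# The unquoted heredoc above expands $PROJECT_ID while writing the Dockerfile;
# a quick look at the FROM line catches an empty or wrong project id before the build.
grep '^FROM' Dockerfile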
# build the Spark + Google Cloud Storage connector image and push it to the registry
docker build -t gcr.io/$PROJECT_ID/spark_gcs:v3.0.1 .
docker push gcr.io/$PROJECT_ID/spark_gcs:v3.0.1
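# Optional smoke test (a sketch, assuming the standard Spark image layout with
# jars under /opt/spark/jars): confirm the connector jars are on the image's
# classpath directory before pointing spark-submit at it.
docker run --rm --entrypoint ls gcr.io/$PROJECT_ID/spark_gcs:v3.0.1 /opt/spark/jars \
  | grep -E 'gcs-connector|guava-29'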
# end of deploy spark images