@ankurdave
Created September 19, 2014 22:43
Spark v2 configuration
#!/usr/bin/env bash
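# Spread Spark's scratch space (shuffle files, spilled data) across two local directories
SPARK_JAVA_OPTS+=" -Dspark.local.dir=/mnt/spark,/mnt2/spark"
export SPARK_JAVA_OPTS
# Memory for Spark's JVMs
export SPARK_MEM=58g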
# Standalone cluster options
export SPARK_MASTER_OPTS=""
export SPARK_WORKER_INSTANCES=1
export SPARK_WORKER_CORES=8
export HADOOP_HOME="/root/ephemeral-hdfs"
export SPARK_MASTER_IP="$(wget -q -O - http://169.254.169.254/latest/meta-data/public-hostname)"
export MASTER="$(cat /root/spark-ec2/cluster-url)"
export SPARK_LIBRARY_PATH="/root/ephemeral-hdfs/lib/native/"
export SPARK_CLASSPATH="/root/ephemeral-hdfs/conf"
# Bind Spark's web UIs to this machine's public EC2 hostname:
export SPARK_PUBLIC_DNS="$(wget -q -O - http://169.254.169.254/latest/meta-data/public-hostname)"
# Raise the open-file limit: large shuffles keep many files open at once
ulimit -n 1000000
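
The `ulimit -n 1000000` line above can fail when the script runs as a user whose hard limit is below that value. A minimal sketch of a more defensive variant follows, assuming it is acceptable to settle for the hard limit in that case. The helper name `raise_nofile_limit` is hypothetical and not part of the original configuration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the original config): raise the soft
# open-file limit to a target value, clamped to the current hard limit.
raise_nofile_limit() {
  local target="$1"
  local hard
  hard="$(ulimit -H -n)"
  # A numeric hard limit below the target caps what we can request.
  if [ "$hard" != "unlimited" ] && [ "$target" -gt "$hard" ]; then
    target="$hard"
  fi
  # Change only the soft limit so the hard limit is left untouched.
  ulimit -S -n "$target"
}

raise_nofile_limit 1000000
echo "open-file soft limit: $(ulimit -S -n)"
```

Unlike the plain `ulimit -n` call, this changes only the soft limit, so it never tries to raise the hard limit and a later call can still move the soft limit within it.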