Running Apache Spark on DigitalOcean

  • To run Apache Spark on DigitalOcean, first create an Ubuntu LTS droplet.

  • Install Python and Java:

    sudo apt update && sudo apt install -y openjdk-8-jdk-headless python3
    
  • Download the Spark package:

     wget https://downloads.apache.org/spark/spark-3.1.2/spark-3.1.2-bin-hadoop3.2.tgz
    
  • Unpack it and move it to /opt:

    tar xvf spark-3.1.2-bin-hadoop3.2.tgz && mv spark-3.1.2-bin-hadoop3.2 /opt/spark
    
  • Add the environment variables to your profile and reload it:

    echo "export SPARK_HOME=/opt/spark" >> ~/.profile
    # single quotes below keep $PATH and $SPARK_HOME from expanding before the profile is read
    echo 'export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin' >> ~/.profile
    echo "export PYSPARK_PYTHON=/usr/bin/python3" >> ~/.profile
    source ~/.profile
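
  • To confirm the installation works, save the sketch below as spark_smoke_test.py (the file name, app name, and numbers are only illustrative) and run it with spark-submit spark_smoke_test.py; it should print [Row(total=499500)]:

    # spark_smoke_test.py - minimal sanity check for the Spark installation
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("smoke-test").getOrCreate()
    # 0 + 1 + ... + 999 = 499500, so Spark ran the job if that value comes back
    print(spark.range(1000).selectExpr("sum(id) AS total").collect())
    spark.stop()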
    

EVOMAN

To then use PySpark with HyperOpt and EVOMAN, install pip and the required libraries:

sudo apt install -y python3-pip
pip3 install numpy pyspark pygame

The exception is hyperopt, which we have to install from source (a needed bugfix has not been released yet):

git clone https://github.com/hyperopt/hyperopt.git
cd hyperopt/
pip3 install .
cd ..
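
To check that HyperOpt can actually farm trials out to Spark, the sketch below minimises a toy objective with SparkTrials. It is only an illustration; the objective, search space, and parallelism are placeholders, not the EVOMAN setup:

    # hyperopt_spark_check.py - illustrative only; objective and space are placeholders
    from hyperopt import fmin, tpe, hp, SparkTrials

    def objective(x):
        # toy objective with its minimum at x = 3
        return (x - 3) ** 2

    trials = SparkTrials(parallelism=4)  # each trial is executed on the Spark cluster
    best = fmin(fn=objective,
                space=hp.uniform("x", -10, 10),
                algo=tpe.suggest,
                max_evals=50,
                trials=trials)
    print(best)  # should be close to {'x': 3.0}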

Finally, suppress the ALSA warnings that pygame will otherwise print on a headless droplet:

  • Create a new file /etc/asound.conf and insert the following:

    pcm.!default {
        type plug
        slave.pcm "null"
    }
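
  • Optionally, verify the fix with a quick check (a minimal sketch, assuming pygame is installed as above); opening the mixer should no longer spill ALSA errors:

    # alsa_check.py - opening the audio device should now be silent
    import pygame

    pygame.mixer.init()   # opens the default device, now routed to the null PCM
    print("mixer initialised without ALSA errors")
    pygame.mixer.quit()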
    

You're ready to go!
