@cerisier
Last active July 2, 2021 21:56
Dataproc initialization action script for installing python3
#!/bin/bash
# from https://gist.githubusercontent.com/nehalecky/9258c01fb2077f51545a/raw/789f08141dc681cf1ad5da05455c2cd01d1649e8/install-py3-dataproc.sh
# Refresh the package lists first so the install does not fail on stale indexes.
apt-get update
apt-get -y install python3
# Make PySpark use python3 in login shells and in Spark's own environment file.
echo "export PYSPARK_PYTHON=python3" | tee -a /etc/profile.d/spark_config.sh /etc/*bashrc /usr/lib/spark/conf/spark-env.sh
# Pin PYTHONHASHSEED so Python 3 string hashing is consistent across the driver and executors.
echo "Adding PYTHONHASHSEED=0 to profiles and spark-defaults.conf..."
echo "export PYTHONHASHSEED=0" | tee -a /etc/profile.d/spark_config.sh /etc/*bashrc /usr/lib/spark/conf/spark-env.sh
echo "spark.executorEnv.PYTHONHASHSEED=0" >> /etc/spark/conf/spark-defaults.conf
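The script above appends with `tee -a` and `>>`, so running it more than once on the same node would leave duplicate lines in each file. A minimal sketch of one way to guard against that is shown here. The helper name `append_once` is hypothetical and not part of the original gist, and the demonstration writes to a temporary file rather than to the real Spark configuration.

```shell
#!/bin/bash
# Hypothetical helper (not from the original gist): append a line to a file
# only when that exact line is not already present.
append_once() {
  local line="$1" file="$2"
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string.
  # Errors are silenced so a missing file simply counts as "line not present".
  if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
    printf '%s\n' "$line" >> "$file"
  fi
}

# Demonstration against a temporary file instead of the real Spark config.
demo_file="$(mktemp)"
append_once "spark.executorEnv.PYTHONHASHSEED=0" "$demo_file"
append_once "spark.executorEnv.PYTHONHASHSEED=0" "$demo_file"
# Print the number of lines; the second call should not have added one.
wc -l < "$demo_file"
rm -f "$demo_file"
```

In the init script, a call such as `append_once "export PYTHONHASHSEED=0" /etc/profile.d/spark_config.sh` could stand in for the matching `echo ... | tee -a` line, one call per target file.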

avloss commented Apr 2, 2020

If someone stumbles upon this in 2020: you need to specify the latest image version with --image-version 1.5
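For illustration, here is a rough sketch of how that flag might be combined with the init script when creating a cluster. It only prints the `gcloud` command and does not run it. The function name `build_create_cmd`, the cluster name, the region, and the bucket path are all hypothetical placeholders; it also assumes the script has already been uploaded to a Cloud Storage bucket.

```shell
#!/bin/bash
# Hypothetical helper: assemble (but do not execute) a cluster-creation command.
# All four arguments are supplied by the caller; nothing here comes from the gist
# except the --image-version value used in the example call.
build_create_cmd() {
  local cluster="$1" region="$2" image_version="$3" init_script="$4"
  printf 'gcloud dataproc clusters create %s --region %s --image-version %s --initialization-actions %s\n' \
    "$cluster" "$region" "$image_version" "$init_script"
}

# Example call with placeholder names; substitute your own cluster, region, and bucket.
build_create_cmd my-cluster us-central1 1.5 gs://my-bucket/install-py3-dataproc.sh
```

Printing the command first makes it easy to review the image version and script path before actually creating anything.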

@CalenDario13

> if someone stumbles upon this in 2020 -- you need to specify latest image version --image-version 1.5

So if I am using image-version 1.3 and try to use this script, it won't work?
