@shubhi1407
Created January 4, 2018 10:00
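#!/bin/bash
# Credentials file picked up by the Google Cloud client libraries
export GOOGLE_APPLICATION_CREDENTIALS=./google_bigquery_credentials.json
# Date two months ago as YYYY-MM-DD (this syntax requires GNU date)
d=$(date --date='-2 month' +%F)
# Load questions and answers data
node ./stack-express/load_questions.js "$d"
node ./stack-express/load_answers.js "$d"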
# Create cluster: one master and two workers, each custom-2-4096 (2 vCPUs, 4096 MB RAM) with a 10 GB boot disk
gcloud dataproc clusters create apache-spark \
  --region us-central1 \
  --subnet default \
  --zone us-central1-f \
  --master-machine-type custom-2-4096 \
  --master-boot-disk-size 10 \
  --num-workers 2 \
  --worker-machine-type custom-2-4096 \
  --worker-boot-disk-size 10 \
  --project utility-ratio-190419
# Submit Spark jobs: one run of the Demographic class for questions, one for answers
gcloud dataproc jobs submit spark \
  --cluster apache-spark \
  --region us-central1 \
  --class Demographic \
  --jars /home/shubhangi140793/stack-sparkBackend/target/spark-stacknetwork-0.0.1-SNAPSHOT-jar-with-dependencies.jar \
  -- posts_questions demographic_questions
gcloud dataproc jobs submit spark \
  --cluster apache-spark \
  --region us-central1 \
  --class Demographic \
  --jars /home/shubhangi140793/stack-sparkBackend/target/spark-stacknetwork-0.0.1-SNAPSHOT-jar-with-dependencies.jar \
  -- posts_answers demographic_answers
# Delete the cluster (--quiet skips the confirmation prompt)
gcloud dataproc clusters delete apache-spark --region us-central1 --quiet
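
The `date --date='-2 month'` call in the script is GNU-specific, so it will likely fail on macOS or other BSD systems. A minimal sketch of a fallback, assuming BSD `date` with its `-v` adjustment flag is the only alternative; the function name `two_months_ago` is hypothetical and not part of the original script:

```shell
#!/bin/bash
# Hypothetical helper: print the date two months ago as YYYY-MM-DD.
two_months_ago() {
  if date --date='-2 month' +%F >/dev/null 2>&1; then
    # GNU date (most Linux distributions)
    date --date='-2 month' +%F
  else
    # BSD/macOS date: -v-2m shifts the month back by two
    date -v-2m +%F
  fi
}

d=$(two_months_ago)
echo "Loading posts since $d"
```

The result can be passed to the loaders as before, for example `node ./stack-express/load_questions.js "$d"`.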
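
The two job submissions differ only in their trailing arguments, so they could be folded into one function. This is a sketch, not the author's method: `submit_demographic` and the variable names `CLUSTER`, `REGION`, and `JAR` are hypothetical, while their values are copied from the script above. It assumes the `gcloud` defaults (such as the active project) are already configured, as the original submit commands do.

```shell
#!/bin/bash
# Values copied from the script; the variable names are assumptions.
CLUSTER=apache-spark
REGION=us-central1
JAR=/home/shubhangi140793/stack-sparkBackend/target/spark-stacknetwork-0.0.1-SNAPSHOT-jar-with-dependencies.jar

# Hypothetical wrapper: submit_demographic FIRST_ARG SECOND_ARG
# Both arguments are forwarded unchanged to the Demographic class.
submit_demographic() {
  gcloud dataproc jobs submit spark \
    --cluster "$CLUSTER" \
    --region "$REGION" \
    --class Demographic \
    --jars "$JAR" \
    -- "$1" "$2"
}
```

With this in place, the two submissions become `submit_demographic posts_questions demographic_questions` and `submit_demographic posts_answers demographic_answers`.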