@erikdubbelboer
Created July 1, 2017 04:05
Copy Druid jackson libs to Dataproc
#!/bin/bash
DRUID="0.10.0-rc3-SNAPSHOT"
HADOOP="2.7.3"
# Create a Dataproc initialization action that replaces Hadoop's Jackson jars with Druid's
mkdir -p initialization/jars/
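# The heredoc delimiter is unquoted, so DRUID and HADOOP are expanded now and baked into the generated script.
cat > initialization/initialization.sh << stop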
#!/bin/bash
mkdir -p /tmp/initialization
gsutil -m rsync -r -d gs://hadoop-eu-atomx/${DRUID}-${HADOOP}/ /tmp/initialization/
rm -f /usr/lib/hadoop-mapreduce/jackson-*-2.*.jar
cp /tmp/initialization/jars/* /usr/lib/hadoop-mapreduce/
exit 0
stop
# Clear previously staged jars, then stage the Jackson 2.x jars bundled with Druid.
rm -f initialization/jars/*.jar
cp druid-${DRUID}/lib/jackson-*-2.*.jar initialization/jars/
# Upload the script and jars; -d deletes remote files that no longer exist locally.
gsutil -m rsync -d -r initialization/ gs://hadoop-eu-atomx/${DRUID}-${HADOOP}/
echo "Don't forget to add the script as an initialization action when creating the Dataproc cluster."
echo "gs://hadoop-eu-atomx/${DRUID}-${HADOOP}/initialization.sh"
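
The reminder above can be turned into a concrete command. The following is a minimal sketch, not part of the original script: the helper name `create_cluster_command` and the cluster name `my-druid-cluster` are hypothetical, and it only prints a `gcloud` invocation using the `--initialization-actions` flag rather than running it. Other flags your project may need (region, machine types, and so on) are omitted.

```shell
#!/bin/bash
# Sketch only: prints the gcloud command instead of executing it.
# create_cluster_command and the cluster name are hypothetical placeholders.
create_cluster_command() {
  local cluster_name="$1" druid="$2" hadoop="$3"
  # Same bucket layout as the upload step: gs://hadoop-eu-atomx/<druid>-<hadoop>/
  local init_action="gs://hadoop-eu-atomx/${druid}-${hadoop}/initialization.sh"
  echo "gcloud dataproc clusters create ${cluster_name} --initialization-actions ${init_action}"
}

# Falls back to the versions used in the script if DRUID/HADOOP are not set.
create_cluster_command "my-druid-cluster" "${DRUID:-0.10.0-rc3-SNAPSHOT}" "${HADOOP:-2.7.3}"
```

Printing the command first makes it easy to review the bucket path before creating a cluster; pipe the output to `bash` or copy it once it looks right.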