#!/bin/bash
# ctrl+c handler: remove partially written spot-price output on interrupt
function ctrl_c_cleanup() {
  echo "** Interrupt handler caught"
  rm -f spot_prices_*.json
}
# catch ctrl+c
trap ctrl_c_cleanup INT

#!/bin/bash
# ctrl+c handler: remove the partially written job file on interrupt
function ctrl_c_cleanup() {
  echo "** Interrupt handler caught"
  rm -f "$job_file"
}
# catch ctrl+c
trap ctrl_c_cleanup INT

#!/bin/bash
IFS=$'\n' # make newlines the only separator
# usage: pass -o to deploy an on-demand cluster
while getopts ":o" opt; do
  case $opt in
    o)
      ondemand=true
      echo -e "Deploying on-demand cluster for mwc\n"
      ;;
    \?)
      echo "Invalid option: -$OPTARG" >&2
      exit 1
      ;;
  esac
done

#!/bin/bash
# Set Params (placeholders -- fill in your own credentials and region)
k=YOUR_AWS_KEY
s=YOUR_AWS_SECRET
r=YOUR_REGION
# Assign EIP ID
eip_id=eipalloc-XXXXXXX
# Install awscli
pip install awscli

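For context, the association step this script builds toward (attaching the pre-allocated Elastic IP to an instance) can be sketched in Python with boto3 in place of the awscli call the script installs; the credentials, region, and instance id below are placeholders:

import boto3

# Placeholders: the same key/secret/region the shell script parameterizes.
ec2 = boto3.client(
    "ec2",
    aws_access_key_id="YOUR_AWS_KEY",
    aws_secret_access_key="YOUR_AWS_SECRET",
    region_name="YOUR_REGION",
)

# Attach the pre-allocated Elastic IP to a running instance.
# The instance id below is a hypothetical example.
ec2.associate_address(
    AllocationId="eipalloc-XXXXXXX",
    InstanceId="i-0123456789abcdef0",
)
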
#!/bin/bash
IFS=$'\n' # make newlines the only separator
while getopts ":p" opt; do
  case $opt in
    p)
      print_versions=true
      echo -e "Printing the spark versions and node types supported\n"
      ;;
    \?)
      echo "Invalid option: -$OPTARG" >&2
      exit 1
      ;;
  esac
done

#!/bin/bash
usage="Add jars to the input arguments to specify the spark job. -h lists the supported spark versions"
RUNTIME_VERSION="3.2.x-scala2.11"
NODE_TYPE="r3.xlarge"
while getopts ':hs:' option; do
  case "$option" in
    h) echo "$usage"
       exit 0
       ;;
    s) RUNTIME_VERSION="$OPTARG"  # assumption: -s overrides the default runtime version
       ;;
  esac
done

package com.databricks.example.pivot

/**
 This code allows a user to add vectors together for common keys.
 The comments show you how to register the Scala UDAF to be called from pyspark.
 The UDAF can only be called from a SQL expression (i.e. spark.sql() or df.selectExpr()).
**/

/**
 # Python code to register a scala UDAF: see the sketch below.
**/
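A minimal sketch of that Python registration, assuming the UDAF above is compiled into a class named com.databricks.example.pivot.VectorSum (hypothetical) and its jar is attached to the cluster; registerJavaUDAF requires Spark 2.3+:

# Register the Scala UDAF under a SQL-callable name.
# "VectorSum" is a hypothetical class name for the UDAF defined above.
spark.udf.registerJavaUDAF("vector_sum", "com.databricks.example.pivot.VectorSum")

# As noted above, the UDAF is only reachable through SQL expressions:
df.createOrReplaceTempView("vectors")
result = spark.sql("SELECT key, vector_sum(vec) AS summed FROM vectors GROUP BY key")
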
import json, pprint, requests, datetime

################################################################
## Replace the token variable and environment url below
################################################################

# Helper to pretty print json
def pprint_j(i):
    print(json.dumps(i, indent=4, sort_keys=True))
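A short usage sketch, with placeholder values for the token and environment URL the comment above says to replace; clusters/list is an endpoint of the Databricks REST API 2.0:

DOMAIN = "https://example.cloud.databricks.com"  # placeholder workspace url
TOKEN = "dapiXXXXXXXXXXXXXXXX"                   # placeholder personal access token

# List the clusters in the workspace and pretty-print the response.
resp = requests.get(
    DOMAIN + "/api/2.0/clusters/list",
    headers={"Authorization": "Bearer %s" % TOKEN},
)
pprint_j(resp.json())
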
##### READ SPARK DATAFRAME
df = spark.read.option("header", "true").option("inferSchema", "true").csv(fname)
# store the schema from the CSV w/ the header in the first file, and infer the types for the columns
df_schema = df.schema

##### SAVE JSON SCHEMA INTO S3 / BLOB STORAGE
# save the schema as JSON so the follow-up streaming job can load it without re-inferring
dbutils.fs.rm("/home/mwc/airline_schema.json", True)
with open("/dbfs/home/mwc/airline_schema.json", "w") as f:
    f.write(df_schema.json())
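A sketch of the loading side in that follow-up streaming job, rebuilding the schema from the saved JSON so the stream avoids inference; input_path is a placeholder for the streaming source directory:

import json
from pyspark.sql.types import StructType

# Rebuild the schema saved by the batch job above.
with open("/dbfs/home/mwc/airline_schema.json") as f:
    saved_schema = StructType.fromJson(json.load(f))

# Streaming reads require an explicit schema; input_path is hypothetical.
stream_df = (spark.readStream
    .schema(saved_schema)
    .option("header", "true")
    .csv(input_path))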