Claudinei Daitx (claudinei-daitx), Toronto, Ontario
claudinei-daitx / SparkSessionKryo.scala
Last active March 22, 2023 15:51
Example of creating a Spark session with Kryo serialization
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Creates a Spark session that uses Kryo serialization.
object SparkSessionKryo {
  def getSparkSession: SparkSession = {
    val spark = SparkSession
      .builder
      .appName("my spark application name")
      .config(getConfig)
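The preview cuts off before getConfig is defined. A minimal, self-contained sketch of the same idea, assuming getConfig simply builds a SparkConf with Kryo enabled; MyRecord is a hypothetical class standing in for whatever types you register:

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Hypothetical record type, used only to illustrate Kryo class registration.
case class MyRecord(id: Long, name: String)

object SparkSessionKryoSketch {
  // Builds a SparkConf that switches the serializer to Kryo and registers
  // the classes that will be shuffled or cached.
  private def getConfig: SparkConf =
    new SparkConf()
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .registerKryoClasses(Array(classOf[MyRecord]))

  def getSparkSession: SparkSession =
    SparkSession
      .builder
      .appName("my spark application name")
      .config(getConfig)
      .getOrCreate()
}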
claudinei-daitx / SparkSessionS3.scala
Created December 15, 2017 13:02
Create a Spark session optimized to work with Amazon S3.
import org.apache.spark.sql.SparkSession

object SparkSessionS3 {
  // Creates a Spark session with optimizations for working with Amazon S3.
  def getSparkSession: SparkSession = {
    val spark = SparkSession
      .builder
      .appName("my spark application name")
      .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .config("spark.hadoop.fs.s3a.access.key", "my access key")
claudinei-daitx / get_penultimate_wednesday.sh
Created January 19, 2018 18:37
Get the penultimate Wednesday of the month
#!/bin/bash
# Finds the penultimate Wednesday in the date range given by the two arguments.
start_date=$1
end_date=$2
current_date=
counter=0
index=0
until [ "$current_date" = "$end_date" ]
do
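The shell preview is truncated before the loop body. As an alternative illustration in Scala (the language of the other gists on this page, not part of the original script), the same result can be computed with java.time: take the last Wednesday of the month and step back one week. The object and method names here are illustrative only:

import java.time.{DayOfWeek, LocalDate, YearMonth}
import java.time.temporal.TemporalAdjusters

object PenultimateWednesday {
  // Second-to-last Wednesday of the given month.
  def of(month: YearMonth): LocalDate =
    month.atEndOfMonth()
      .`with`(TemporalAdjusters.lastInMonth(DayOfWeek.WEDNESDAY))
      .minusWeeks(1)

  def main(args: Array[String]): Unit = {
    // January 2018 has Wednesdays on the 3rd, 10th, 17th, 24th and 31st,
    // so the penultimate one is the 24th.
    println(of(YearMonth.of(2018, 1))) // prints 2018-01-24
  }
}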
claudinei-daitx / schema_spy_redshift_configuration.conf
Last active May 31, 2018 01:41
SchemaSpy Redshift configuration file
schemaspy.t=redshift
# Optional path to an alternative JDBC driver.
# Download the latest Redshift JDBC driver from https://docs.aws.amazon.com/redshift/latest/mgmt/configure-jdbc-connection.html#download-jdbc-driver
# In this example, I use the RedshiftJDBC42-1.2.10.1009.jar file.
schemaspy.dp=RedshiftJDBC42-1.2.10.1009.jar
# Database connection properties: host, port, database name, user, password
schemaspy.host=<<server_hostname>>
schemaspy.port=<<server_port>>
schemaspy.db=<<database_name>>
schemaspy.u=<<user_name>>
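The preview ends at the user name; in a full SchemaSpy properties file the password and output directory typically follow. An assumed continuation, with key names taken from SchemaSpy's documented property naming and placeholder values in the same <<...>> style as above:

# Password for the database user and the directory where the generated documentation is written (assumed keys).
schemaspy.p=<<password>>
schemaspy.o=<<output_directory>>

Once saved, SchemaSpy 6.x can be pointed at this file with its -configFile option, e.g. java -jar schemaspy.jar -configFile schema_spy_redshift_configuration.conf (the jar name depends on the release you download).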