Matthew Powers MrPowers
@MrPowers
MrPowers / person_data.csv
Last active Oct 19, 2019
Data for 100 fake people
View person_data.csv
person_name,person_country
a,China
b,China
c,China
d,China
e,China
f,China
g,China
h,China
i,China
MrPowers / programming_websites.csv
Last active Oct 19, 2019
List of programming websites
View programming_websites.csv
website_url,website_type,main_language
news.ycombinator.com,aggregator,
mungingdata.com,blog,spark
m.signalvnoise.com,blog,rails
pgexercises.com,train,postgres
codequizzes.com,train,ruby
View spark_write_to_aws.sc
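// Collapse to one partition so Spark writes a single CSV part file, then save to S3.
// Uses the spark-csv package (Spark 1.x) and the s3n:// connector.
df.coalesce(1).write
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .save("s3n://some_bucket/data/states/all_states/")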
View spark_aws_credentials.sc
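// Read the AWS credentials from the environment (System.getenv returns null if a variable is unset)
// and hand them to the s3n filesystem through the Hadoop configuration.
val accessKeyId = System.getenv("AWS_ACCESS_KEY_ID")
val secretAccessKey = System.getenv("AWS_SECRET_ACCESS_KEY")
sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", accessKeyId)
sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", secretAccessKey)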
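Because System.getenv returns null when a variable is unset, a missing credential in the snippet above would only surface later, when the configuration is set or the S3 path is first accessed. The following is a minimal sketch of a fail-fast lookup. The helper name requiredEnv and its env parameter are hypothetical and not part of the original gist.

```scala
// Hypothetical helper (an assumption, not from the original gist):
// look up a required environment variable and fail with a clear message
// when it is unset or empty. The env parameter defaults to the real
// process environment and can be replaced with a plain Map for testing.
def requiredEnv(name: String, env: Map[String, String] = sys.env): String =
  env.get(name).filter(_.nonEmpty).getOrElse(
    throw new IllegalStateException(s"Environment variable $name is not set")
  )

// Possible usage inside a Spark shell session, where sc is already defined:
//   sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", requiredEnv("AWS_ACCESS_KEY_ID"))
//   sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", requiredEnv("AWS_SECRET_ACCESS_KEY"))
```

Passing the environment as a parameter keeps the lookup testable without touching real credentials.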
View spark_s3_files.sc
// Load every CSV file under the S3 prefix into one DataFrame, inferring column types.
val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .option("inferSchema", "true")
  .load("s3n://some_bucket/data/states/*.csv")
View spark_dataframe_multiple_gzipped_files.sc
// Load all gzipped CSV files in the directory; .gz files are decompressed while reading.
val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .option("inferSchema", "true")
  .load(System.getProperty("user.home") + "/Desktop/people/*.gz")
View spark_dataframe_multiple_files.sc
// Load all CSV files in the directory into a single DataFrame.
val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .option("inferSchema", "true")
  .load(System.getProperty("user.home") + "/Desktop/people/*.csv")
MrPowers / spark_dataframe_to_csv.sc
Last active Apr 4, 2016
Writing a Spark DataFrame to a CSV file
View spark_dataframe_to_csv.sc
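// Write the DataFrame as CSV with a header row.
// coalesce(1) yields a single part file inside the texas_cities output directory.
tx_cities.coalesce(1).write
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .save(System.getProperty("user.home") + "/Desktop/texas_cities")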
MrPowers / spark_dataframe.sc
Last active Apr 4, 2016
Creating a Spark DataFrame
View spark_dataframe.sc
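// Read one CSV file into a DataFrame: the first row supplies column names, and column types are inferred.
val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .option("inferSchema", "true")
  .load(System.getProperty("user.home") + "/Desktop/cities.csv")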