Skip to content

Instantly share code, notes, and snippets.

@dwerbam
dwerbam / sparkr-demo
Last active September 8, 2015 14:10 — forked from shivaram/sparkr-demo
SparkR 1.4.1 Demo
# If you are using Spark 1.4, then launch SparkR with the command
#
# ./bin/sparkR --packages com.databricks:spark-csv_2.10:1.0.3
# as the `sparkPackages=` flag was only added in Spark 1.4.1.
# # This will work in Spark 1.4.1.
sc <- sparkR.init(spark_link, sparkPackages = "com.databricks:spark-csv_2.10:1.0.3")
sqlContext <- sparkRSQL.init(sc)
flights <- read.df(sqlContext, "s3n://sparkr-data/nycflights13.csv","com.databricks.spark.csv", header="true")
@dwerbam
dwerbam / dataframe_example.R
Last active September 8, 2015 14:09 — forked from shivaram/dataframe_example.R
DataFrame example in SparkR
# Download Spark 1.4 from http://spark.apache.org/downloads.html
#
# Download the nyc flights dataset as a CSV from https://s3-us-west-2.amazonaws.com/sparkr-data/nycflights13.csv
# Launch SparkR using
# ./bin/sparkR --packages com.databricks:spark-csv_2.10:1.0.3
# The SparkSQL context should already be created for you as sqlContext
sqlContext
# Java ref type org.apache.spark.sql.SQLContext id 1
@dwerbam
dwerbam / rstudo-sparkr.R
Last active September 8, 2015 14:09 — forked from shivaram/rstudo-sparkr.R
Rstudio local setup
Sys.setenv(SPARK_HOME="/Users/shivaram/spark-1.4.1")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)
sc <- sparkR.init(master="local")
sqlContext <- sparkRSQL.init(sc)
df <- createDataFrame(sqlContext, faithful)
# Select one column
head(select(df, df$eruptions))
@dwerbam
dwerbam / .bash_profile
Last active August 27, 2015 10:03 — forked from natelandau/.bash_profile
Mac OSX Bash Profile
# ---------------------------------------------------------------------------
#
# Description: This file holds all my BASH configurations and aliases
#
# Sections:
# 1. Environment Configuration
# 2. Make Terminal Better (remapping defaults and adding functionality)
# 3. File and Folder Management
# 4. Searching
# 5. Process Management