DragonCircle / Spark Dataframe Cheat Sheet.py
Created August 2, 2016 04:30 — forked from evenv/Spark Dataframe Cheat Sheet.py
Cheat sheet for Spark Dataframes (using Python)
# A simple cheat sheet of Spark Dataframe syntax
# Current for Spark 1.6.1
# import statements
from pyspark.sql import SQLContext
from pyspark.sql.types import *
from pyspark.sql.functions import *  # note: shadows Python builtins such as sum, min, max
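# creating dataframes
# sqlContext is predefined in the pyspark shell; in a standalone script, create it first, e.g.:
# from pyspark import SparkContext; sqlContext = SQLContext(SparkContext())
df = sqlContext.createDataFrame([(1, 4), (2, 5), (3, 6)], ["A", "B"])  # from manual data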