rxin / df.py
Last active January 26, 2017 00:44
DataFrame simple aggregation performance benchmark
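from pyspark.sql.functions import avg, col  # column helpers used in agg() below
# sqlContext.load is the Spark 1.3 API (Parquet by default); later releases use sqlContext.read.parquet(...)
data = sqlContext.load("/home/rxin/ints.parquet")
data.groupBy("a").agg(col("a"), avg("num")).collect()  # average of "num" per distinct "a"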
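
The gist shows the query being benchmarked but not how it was timed. One way to measure it is sketched below. The helper `time_runs` and its `repeats` parameter are hypothetical names, not part of the original gist or of Spark, and the sketch assumes Python 3 for `time.perf_counter`.

```python
import time


def time_runs(action, repeats=5):
    """Call `action` `repeats` times; return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        action()
        durations.append(time.perf_counter() - start)
    return durations


# Hypothetical usage with the query above (needs a live sqlContext and `data`):
# timings = time_runs(lambda: data.groupBy("a").agg(col("a"), avg("num")).collect())
# print("best: %.3fs, mean: %.3fs" % (min(timings), sum(timings) / len(timings)))
```

Reporting the minimum alongside the mean helps separate steady-state cost from first-run overhead such as JVM warm-up and file-system caching.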