@invkrh
Last active July 10, 2019 16:09
Spark Puzzles
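// Puzzle 1: RDDs (run in spark-shell, where `sc` is predefined)
val rdd1 = sc.makeRDD(Seq((1, 2), (1, 2), (1, 2), (1, 2))).cache()
rdd1.count() // action: materializes rdd1 in the cache
val rdd2 = rdd1.map(_._1 + 1).cache()
rdd2.count() // action: materializes rdd2 in the cache
rdd1.unpersist() // rdd2 is still in storage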
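// Puzzle 2: DataFrames (spark-shell imports spark.implicits._ for toDF and 'col)
val df1 = Seq((1, 2), (1, 2), (1, 2), (1, 2)).toDF("key", "value").cache()
df1.count() // action: materializes df1 in the cache
val df2 = df1.select('key, 'value + 1 as "inc").cache()
df2.count() // action: materializes df2 in the cache
df1.unpersist() // no DataFrame left in storage ???
// Likely cause (an assumption, version-dependent): before Spark 2.4, Dataset.unpersist()
// also invalidated cached plans built on df1, so df2 was dropped along with it.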