@bruckhaus
Created May 30, 2015 22:58
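import org.apache.spark.HashPartitioner

// UserID, UserInfo, and LinkInfo are assumed to be defined elsewhere.
// Load the user table once, hash-partition it by key into 100 partitions,
// and persist it so the same partitioning is reused by every join.
val userData = sc.sequenceFile[UserID, UserInfo]("hdfs://...")
  .partitionBy(new HashPartitioner(100))
  .persist()

// Join new log events against the pre-partitioned user table.
def processNewLogs(logFileName: String): Unit = {
  val events = sc.sequenceFile[UserID, LinkInfo](logFileName)
  val joined = userData.join(events) // only `events` is shuffled
}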