@dgadiraju
Created November 23, 2017 11:26
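// Read the orders data as an RDD[String], one element per line of text
val orders = sc.textFile("/public/retail_db/orders")
// Previewing data
orders.first                        // first record
orders.take(10).foreach(println)    // print the first 10 records
orders.count                        // total number of records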
// Use collect with care.
// It copies every record of the distributed RDD into a single local Array
// in the driver, so calling it on a large dataset can cause out-of-memory errors.
orders.collect.foreach(println)
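
// A minimal sketch of lighter-weight alternatives to collect, assuming the
// same spark-shell session (sc and orders as defined above). The sizes 10
// and 100 are arbitrary values chosen for illustration, not from the original.

// takeSample returns a fixed-size random sample as a local Array.
orders.takeSample(withReplacement = false, num = 10).foreach(println)

// toLocalIterator brings data to the driver one partition at a time,
// so only the partitions needed for the first 100 records are fetched.
orders.toLocalIterator.take(100).foreach(println)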