@dgadiraju
Created February 4, 2018 22:50
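# Assumes a PySpark shell, where `sc` (the SparkContext) is already defined.
# Build an RDD from an in-memory collection (the integers 1 through 9999).
l = range(1, 10000)
lRDD = sc.parallelize(l)

# Read a local file into a list of lines, then distribute that list as an RDD.
with open("/data/retail_db/products/part-00000") as f:
    productsRaw = f.read().splitlines()
type(productsRaw)  # a plain Python list of strings
productsRDD = sc.parallelize(productsRaw)
type(productsRDD)  # an RDD
productsRDD.first()  # first record in the RDD
for i in productsRDD.take(10):  # preview the first 10 records
    print(i)
productsRDD.count()  # total number of records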