@lakshay-arora
Created October 26, 2020 19:53
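from pyspark import SparkContext
# Reuse the running SparkContext (e.g. in the pyspark shell) or create one
sc = SparkContext.getOrCreate()
# 1. Parallelizing an in-memory collection
my_list = [1, 2, 3, 4, 5]
my_list_rdd = sc.parallelize(my_list)
# 2. Referencing an external data file
file_rdd = sc.textFile("path_of_file")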
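
To build intuition for what parallelizing a collection means, here is a minimal plain-Python sketch that splits the same list into contiguous partitions. It does not call Spark and needs no Spark installation. The helper name `split_into_partitions` and its even-slicing rule are assumptions made for illustration, not a statement of how Spark partitions data internally.

```python
# Illustration only: plain Python, no Spark involved.
# split_into_partitions is a hypothetical helper, not part of PySpark.
def split_into_partitions(items, num_partitions):
    """Split items into num_partitions contiguous chunks of near-equal size."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    n = len(items)
    # Chunk i covers indices [i*n // k, (i+1)*n // k), so chunks never overlap
    # and together cover the whole list in order.
    return [
        items[(i * n) // num_partitions : ((i + 1) * n) // num_partitions]
        for i in range(num_partitions)
    ]

my_list = [1, 2, 3, 4, 5]
partitions = split_into_partitions(my_list, 2)
print(partitions)  # [[1, 2], [3, 4, 5]]
```

In a real Spark session, each such chunk would be processed by a separate task; the number of partitions can be passed as the optional second argument to `sc.parallelize`.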