Created
May 16, 2020 14:36
# Filesystem schemes and URIs
|=====================================|
| Filesystem     | URI Structure      |
|----------------|--------------------|
| Local FS       | file:///path       |
| HDFS           | hdfs://hdfs_path   |
| S3             | s3://bucket/object |
|=====================================|
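# How the three URI shapes split into parts
# A minimal sketch using only the Python standard library (no Spark needed), to show
# which piece of each URI is the scheme, the authority, and the path. The helper name
# split_uri is hypothetical and is not part of PySpark.

```python
from urllib.parse import urlparse


def split_uri(uri):
    """Return (scheme, authority, path) for a filesystem URI (hypothetical helper)."""
    parts = urlparse(uri)
    return parts.scheme, parts.netloc, parts.path


# The three example URIs from the table above.
for uri in ("file:///path", "hdfs://hdfs_path", "s3://bucket/object"):
    print(uri, "->", split_uri(uri))
```

# The local URI has three slashes because its authority is empty. For S3 the authority
# is the bucket name, and in a full HDFS URI that position usually holds the namenode
# host and port, followed by the path.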
# Loading a file into an RDD (one record per line)
rdd = sc.textFile("file:///filename")
# Loading every file in a directory into an RDD
rdd = sc.textFile("file:///dir/")
# Loading a directory with multiple files
# as tuples of the form ('filename', 'contents')
rdd = sc.wholeTextFiles("file:///dir/")
# Loading all the CSVs from a directory
rdd = sc.textFile("file:///dir/*.csv")
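# Record shapes: per-line versus per-file
# A plain-Python analogy, not Spark code, sketching the difference between the two
# record shapes: one record per line versus one (filename, contents) record per file,
# plus the effect of a *.csv glob. Both function names are hypothetical, and the
# sketch keys each pair by bare file name, whereas Spark uses the file's full path.

```python
import tempfile
from pathlib import Path


def lines_like_text_file(directory, pattern="*"):
    """One record per line across every matching file (analogy for textFile)."""
    records = []
    for path in sorted(Path(directory).glob(pattern)):
        if path.is_file():
            records.extend(path.read_text().splitlines())
    return records


def pairs_like_whole_text_files(directory, pattern="*"):
    """One (filename, contents) record per file (analogy for wholeTextFiles)."""
    return [
        (path.name, path.read_text())
        for path in sorted(Path(directory).glob(pattern))
        if path.is_file()
    ]


# Build a throwaway directory with one CSV and one non-CSV file.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.csv").write_text("1,2\n3,4\n")
    Path(tmp, "b.txt").write_text("ignored\n")
    csv_lines = lines_like_text_file(tmp, "*.csv")  # only a.csv matches the glob
    all_pairs = pairs_like_whole_text_files(tmp)    # every file, one pair each

print(csv_lines)
print(all_pairs)
```

# The per-file form suits inputs where a record spans several lines (for example a
# JSON or XML document per file). The per-line form suits line-oriented data such as CSV.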