@kovid-r
Created June 10, 2020 15:08
Reading CSV - PySpark Cheatsheet
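from pyspark.sql import SparkSession
# the pyspark shell already provides `spark`; a standalone script has to create it (the app name is a placeholder)
spark = SparkSession.builder.appName('reading-csv').getOrCreate()
# set the file_path variable at the beginning of the file,
# or parameterize it if your Spark application interacts with other applications
file_path = '/Users/kovid-r/datasets/moviedb/movies_metadata.csv'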
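# method 1 for reading a CSV file: without inferSchema, every column is read as a string
df = spark.read.csv(file_path, header=True)
# method 2 for reading a CSV file: the generic reader with the built-in 'csv' format;
# inferSchema makes Spark take an extra pass over the data to guess column types
df = spark.read.format('csv').options(header='true', inferSchema='true').load(file_path)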
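The comment about parameterizing `file_path` could be sketched roughly as follows, using only the standard library. The `--file-path` flag, the `MOVIES_CSV_PATH` environment variable, and the `resolve_file_path` helper are hypothetical names chosen for illustration; they are not part of the original gist or of PySpark.

```python
import argparse
import os

# Fallback used when neither the flag nor the environment variable is set.
DEFAULT_PATH = '/Users/kovid-r/datasets/moviedb/movies_metadata.csv'


def resolve_file_path(argv=None, env=None):
    """Return the CSV path from --file-path, else MOVIES_CSV_PATH, else the default.

    Both the flag name and the environment variable name are assumptions.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description='Read a CSV file with PySpark')
    parser.add_argument(
        '--file-path',
        default=env.get('MOVIES_CSV_PATH', DEFAULT_PATH),
        help='location of the CSV file to load',
    )
    # parse_known_args ignores any extra arguments passed by a launcher
    args, _unknown = parser.parse_known_args(argv)
    return args.file_path


# An empty list skips the real command line; pass None to read sys.argv instead.
file_path = resolve_file_path([])
```

The resulting `file_path` can then be handed to either of the two read methods unchanged, so the path lives in one place and can be overridden per environment.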