@fyyying
Last active June 28, 2020 20:15
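from pyspark import SparkFiles
from pyspark.sql import SparkSession

# Reuse the active session if there is one (e.g. in a notebook), else create it
spark = SparkSession.builder.getOrCreate()

# Raw URL of the dataset on gist
path = "https://gist.githubusercontent.com/fyyying/4aa5b471860321d7b47fd881898162b7/raw/e8606de9a82e13ca6215b340ce260dad60469cba/titanic_dataset.csv"
# Option 1: read from a local copy of the file
df = spark.read.csv("titanic_dataset.csv", header=True, inferSchema=True)
# Option 2: read from the URL
# spark.read.csv cannot open an HTTP URL directly, so first register the URL
# with addFile (which downloads it), then look up the local copy by file name
spark.sparkContext.addFile(path)
df = spark.read.csv(SparkFiles.get("titanic_dataset.csv"), header=True, inferSchema=True)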
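The name passed to `SparkFiles.get` has to match the last path segment of the URL given to `addFile`, which the snippet above hard-codes. A minimal sketch of deriving it from the URL instead, using only the standard library; `file_name_from_url` is a hypothetical helper and not part of PySpark:

```python
import posixpath
from urllib.parse import urlparse


def file_name_from_url(url: str) -> str:
    """Return the last path segment of a URL, ignoring any query string.

    Hypothetical helper for illustration; not part of PySpark.
    """
    return posixpath.basename(urlparse(url).path)


path = "https://gist.githubusercontent.com/fyyying/4aa5b471860321d7b47fd881898162b7/raw/e8606de9a82e13ca6215b340ce260dad60469cba/titanic_dataset.csv"
file_name = file_name_from_url(path)
print(file_name)

# With an active SparkSession named `spark`, the read would then be:
# spark.sparkContext.addFile(path)
# df = spark.read.csv(SparkFiles.get(file_name), header=True, inferSchema=True)
```

This keeps the URL and the file name from drifting apart if the dataset is later moved or renamed.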