@dhesse
Created May 29, 2017 20:01
from pyspark.sql import Row

# Assume a Spark context is given as sc
# and a Spark SQL context as spark.
# Each line of data.csv is expected to look like "name;age".
rdd = (sc.textFile('data.csv')
       .map(lambda x: x.split(';'))
       .map(lambda x: Row(name=x[0],
                          age=int(x[1]))))
df = spark.createDataFrame(rdd)