Last active September 16, 2020 02:30
How to create a Spark DataFrame from Pandas or NumPy with Arrow
Thanks @aschmu! The variable spark is the default SparkSession. I forgot to mention that you should be running Jupyter with a PySpark kernel. I put a sample script showing how I do this at https://gist.github.com/BryanCutler/b7f10167c4face19e03330a07b24ce21 in case it helps. Thanks for the feedback!
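For anyone running the gist outside a PySpark kernel, where no spark variable is predefined, here is a minimal sketch of one way to get an equivalent session. The helper name get_spark, the app name, and the choice to set the Arrow flag at session creation are assumptions for illustration and are not taken from the gist. The config key shown is the Spark 3.x name; Spark 2.3 and 2.4 used spark.sql.execution.arrow.enabled instead.

```python
# Hypothetical helper, not part of the gist: builds (or reuses) a SparkSession
# with Arrow-based pandas conversion switched on.

# Spark 3.x config key. On Spark 2.3/2.4 the key was
# "spark.sql.execution.arrow.enabled".
ARROW_CONF_KEY = "spark.sql.execution.arrow.pyspark.enabled"


def get_spark(app_name="arrow-gist-example"):
    """Return the active SparkSession, creating one if none exists."""
    # Imported lazily so this module can be loaded without PySpark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)              # assumed name, purely illustrative
        .config(ARROW_CONF_KEY, "true")
        .getOrCreate()                  # reuses an existing session if present
    )
```

In a plain Python script or a non-PySpark notebook, calling spark = get_spark() should then stand in for the spark variable that the gist relies on, provided PySpark and a compatible Java runtime are installed.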
Hi! There seem to be some missing imports.
For example, where does the spark object come from?
Otherwise, nice gist!