Last active
September 16, 2020 02:30
How to create a Spark DataFrame from Pandas or NumPy with Arrow
Thanks @aschmu! The variable `spark` is the default SparkSession. I forgot to mention that you should be running Jupyter with a PySpark kernel. I put a sample script showing how I do this at https://gist.github.com/BryanCutler/b7f10167c4face19e03330a07b24ce21 in case it helps. Thanks for the feedback!
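For anyone running outside a PySpark kernel, where no default session is provided, a minimal sketch of creating one explicitly might look like the following. The helper name `get_spark` and the app name are placeholders of mine, not part of the gist, and the Arrow setting shown is the Spark 3.x key, so adjust it for your version.

```python
def get_spark(app_name="arrow-example"):
    """Return a SparkSession, reusing the active one if it exists.

    `get_spark` and `app_name` are hypothetical names used for illustration.
    """
    # Imported lazily so this module loads even where PySpark is not installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        # Spark 3.x key; Spark 2.3/2.4 used "spark.sql.execution.arrow.enabled".
        .config("spark.sql.execution.arrow.pyspark.enabled", "true")
        .getOrCreate()
    )


if __name__ == "__main__":
    import pandas as pd

    spark = get_spark()
    pdf = pd.DataFrame({"x": [1.0, 2.0, 3.0]})
    # createDataFrame accepts a pandas DataFrame; Arrow is used for the
    # conversion when the option set above is enabled.
    df = spark.createDataFrame(pdf)
    df.show()
    spark.stop()
```

In a PySpark kernel or the `pyspark` shell this is unnecessary, since `spark` already exists there; `getOrCreate()` simply returns that session if one is active.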
Hi! There seem to be some missing imports.
E.g., where does the spark object come from?
Otherwise, nice gist!