How to create a Spark DataFrame from Pandas or NumPy with Arrow
@aschmu

commented Jul 9, 2019

Hi! There seem to be some missing imports.
E.g., where does the spark object come from?
Otherwise, nice gist!

@BryanCutler

Owner Author

commented Jul 10, 2019

Thanks @aschmu! The variable spark is a default SparkSession. I forgot to mention you should be running Jupyter with a PySpark kernel. I put a sample script on how I do this here https://gist.github.com/BryanCutler/b7f10167c4face19e03330a07b24ce21 in case it could be of help. Thanks for the feedback!!
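
For anyone running outside a PySpark kernel, here is a minimal sketch of where a spark object could come from in a plain Python script. It is not the gist's original code. The helper names (arrow_conf_key, main), the local[*] master, the app name, and the sample data are all assumptions made for illustration. The Arrow config key was renamed between Spark 2.x and 3.x, so the sketch picks the key from the running version.

```python
def arrow_conf_key(spark_major_version):
    """Return the Arrow-enabling config key for a Spark major version.

    Hypothetical helper: Spark 2.3/2.4 use "spark.sql.execution.arrow.enabled";
    Spark 3.x uses "spark.sql.execution.arrow.pyspark.enabled".
    """
    if spark_major_version >= 3:
        return "spark.sql.execution.arrow.pyspark.enabled"
    return "spark.sql.execution.arrow.enabled"


def main():
    # Imported lazily so arrow_conf_key can be used without Spark installed.
    import pandas as pd
    from pyspark.sql import SparkSession

    # Build the session explicitly; a PySpark kernel would predefine `spark`.
    spark = (
        SparkSession.builder
        .master("local[*]")          # assumption: run locally on all cores
        .appName("arrow-example")    # assumption: arbitrary app name
        .getOrCreate()
    )

    # Enable Arrow-based conversion for the detected Spark version.
    major = int(spark.version.split(".")[0])
    spark.conf.set(arrow_conf_key(major), "true")

    # Small illustrative pandas DataFrame (made-up values).
    pdf = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": [4.0, 5.0, 6.0]})

    # With Arrow enabled, createDataFrame can use Arrow for the transfer.
    df = spark.createDataFrame(pdf)
    df.show()

    spark.stop()
```

Calling main() runs the example end to end; it needs pyspark, pandas, and pyarrow installed. In a Jupyter notebook with a PySpark kernel, the session-building step can be skipped because spark already exists.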
