@velotiotech
Created January 19, 2022 07:31
from pyspark.sql import SparkSession
## create a SparkSession, the entry point to the Spark application;
## enableHiveSupport() lets the session read and write Hive tables
sparkSession = SparkSession\
    .builder\
    .appName('Write_table_to_hive')\
    .enableHiveSupport()\
    .getOrCreate()
## read the dataset "Video_Games_5.json" into a DataFrame
GamesReviewDataFrame = sparkSession.read \
    .format("json") \
    .option("path", "/home/velotio/Downloads/UnstructuredData/Video_Games_5.json") \
    .load()
## transform the DataFrame here if the table should differ from the raw data
## preview the first 20 rows
GamesReviewDataFrame.show()
## write the DataFrame "GamesReviewDataFrame" as a table in Hive
## (with the default save mode this raises an error if the table already exists)
GamesReviewDataFrame.write.saveAsTable("GameReviewTable")
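## optional sanity check (a sketch, not part of the original gist):
## read the table back through the same session and print its schema and
## row count; assumes the saveAsTable call above succeeded.
## "savedTable" is a name introduced here for illustration only.
savedTable = sparkSession.table("GameReviewTable")
savedTable.printSchema()
print("Rows written:", savedTable.count())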
sparkSession.stop()