@walteryu
Last active October 8, 2018 15:21
hw3_p3.py
# CSCI E-63 HW3 - Problem 3
# Author: Walter Yu
# Description: PySpark script to connect to MySQL database, register table and display row count.
from pyspark import SparkContext
from pyspark.sql import SQLContext
sc = SparkContext.getOrCreate()  # reuses the shell's context if one already exists
# Create context and connect to MySQL:
sqlContext = SQLContext(sc)
dfm = (
    sqlContext.read.format("jdbc")
    .option("url", "jdbc:mysql://localhost/retail_db")
    .option("driver", "com.mysql.jdbc.Driver")
    .option("dbtable", "departments")
    .option("user", "xxxxx")
    .option("password", "xxxxx")
    .load()
)
# Verify schema, create view and display row count:
dfm.printSchema()
dfm.registerTempTable("departments")
print(dfm)
print(dfm.count())
print(dfm.collect())
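
The script above hard-codes the JDBC user and password as placeholders. As a minimal sketch only, the same reader options could be collected in a dictionary with the credentials taken from the environment. The helper name `jdbc_options` and the variable names `MYSQL_USER` and `MYSQL_PASSWORD` are assumptions made for illustration and do not appear in the original script.

```python
import os


def jdbc_options(table, env=os.environ):
    """Build JDBC reader options, taking credentials from the environment.

    MYSQL_USER and MYSQL_PASSWORD are hypothetical variable names chosen
    for this sketch; a missing variable raises KeyError.
    """
    return {
        "url": "jdbc:mysql://localhost/retail_db",
        "driver": "com.mysql.jdbc.Driver",
        "dbtable": table,
        "user": env["MYSQL_USER"],
        "password": env["MYSQL_PASSWORD"],
    }
```

With a live context, the dictionary could then be passed to the reader as `sqlContext.read.format("jdbc").options(**jdbc_options("departments")).load()`, which keeps the credentials out of the script itself.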