@amalgjose
Created October 13, 2019 07:35
Code snippet for demonstrating Delta Lake
%python
# Create a temporary dataset (ids 0 to 49) and save it as a Delta table
data = spark.range(0, 50)
data.write.format("delta").save("/tmp/myfirst-delta-table")
# Read the data
df = spark.read.format("delta").load("/tmp/myfirst-delta-table")
df.show()
# Overwrite the table with new data (ids 51 to 99); Delta records this as a new version
data = spark.range(51, 100)
data.write.format("delta").mode("overwrite").save("/tmp/myfirst-delta-table")
# Read the data
df = spark.read.format("delta").load("/tmp/myfirst-delta-table")
df.show()
# Read the original data (version 0) using Delta time travel
df = spark.read.format("delta").option("versionAsOf", 0).load("/tmp/myfirst-delta-table")
df.show()
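The time-travel read above can be wrapped in a small helper to compare versions side by side. This is a minimal sketch, not part of the original gist: the function names `read_delta_version` and `row_counts_by_version` are hypothetical, and it assumes a `spark` session with Delta Lake available, as in the cell above.

```python
def read_delta_version(spark, path, version):
    """Load one historical version of a Delta table (hypothetical helper)."""
    return (
        spark.read.format("delta")
        .option("versionAsOf", version)  # same option used in the snippet above
        .load(path)
    )


def row_counts_by_version(spark, path, versions):
    """Return a dict mapping each requested version to its row count."""
    return {v: read_delta_version(spark, path, v).count() for v in versions}


# Usage in the notebook (assumes the table was created by the cell above):
# counts = row_counts_by_version(spark, "/tmp/myfirst-delta-table", [0, 1])
# print(counts)
```

If the table was freshly created by the steps above, version 0 should hold the 50 rows from `spark.range(0, 50)` and version 1 the 49 rows from `spark.range(51, 100)`.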