@alonisser
Last active November 7, 2021 15:40
Code for Medium post
# Create the read stream from the Delta table
dataitemsAppendDf = (
    spark.readStream.format("delta")
    .option("maxFilesPerTrigger", 25)  # cap on new files considered per micro-batch
    .table(f"{database_name}.{table_name}")
)

# Initiate the write stream
(
    dataitemsAppendDf.writeStream
    .trigger(processingTime="15 seconds")  # note: other trigger modes are also available
    .option("checkpointLocation", f"{checkpoint_name}")
    .foreachBatch(processRawStreamBatch)  # callback that handles each micro-batch
    .start()
)
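
The snippet above passes `processRawStreamBatch` to `foreachBatch` without showing its body. Spark calls such a callback with two arguments: the micro-batch as a regular DataFrame and an integer batch id. A minimal sketch of what it might look like follows. The deduplication step, the append to Delta, and the `TARGET_TABLE` name are assumptions made for illustration, not the original author's logic.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical target table name; it does not appear in the original gist.
TARGET_TABLE = "my_database.processed_dataitems"


def processRawStreamBatch(batch_df, batch_id):
    """Handle one micro-batch; Spark passes (DataFrame, batch id)."""
    # Assumed processing step: drop exact duplicate rows within this batch.
    deduped = batch_df.dropDuplicates()

    logger.info("Writing micro-batch %s to %s", batch_id, TARGET_TABLE)

    # Inside foreachBatch the DataFrame is a normal batch DataFrame,
    # so the ordinary batch writer API applies.
    deduped.write.format("delta").mode("append").saveAsTable(TARGET_TABLE)
```

The function has to be defined before the write stream is started. Because a micro-batch may be replayed after a failure, any real implementation would probably want its writes to be idempotent; this sketch does not address that.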