@NT-D
Last active August 4, 2020 01:45
Consume streaming data with the Azure Event Hubs connector on Azure Databricks
# Requires the Maven library com.microsoft.azure:azure-eventhubs-spark_2.11:2.3.16 to be installed on the cluster
# Tested on Databricks Runtime 6.6 (includes Apache Spark 2.4.5, Scala 2.11)
## Set up the Azure IoT Hub (Event Hub-compatible endpoint) connection information
EVENTHUB_CONNECTION_STRING = "Your connection string" # Format: Endpoint=sb://aaa.windows.net/;SharedAccessKeyName=iothubowner;SharedAccessKey=xxx;EntityPath=bbb
CONSUMER_GROUP = "Your consumer group" # Use "$Default" if you have not created a consumer group
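## Configure the Event Hubs connector
# The connector expects the connection string in encrypted form, so pass it through EventHubsUtils.encrypt
event_hub_config = {
    "eventhubs.connectionString": spark._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(EVENTHUB_CONNECTION_STRING),
    "eventhubs.consumerGroup": CONSUMER_GROUP
}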
## Load and visualize streaming data
from pyspark.sql.functions import col
# The message payload arrives in the binary "body" column; cast it to a string so it is readable
raw_streaming_df = (spark.readStream
    .format("eventhubs")
    .options(**event_hub_config)
    .load()
    .withColumn("body", col("body").cast("string"))
)
display(raw_streaming_df)
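
The configuration above does not say where in the stream to start reading. As a minimal sketch, assuming the connector's `eventhubs.startingPosition` option (which takes a JSON-encoded event position), the starting point could be set as shown below. The helper name `starting_position` and the dictionary `event_hub_config_with_start` are hypothetical names introduced only for this illustration, and the `"-1"` (start of stream) and `"@latest"` (end of stream) offsets should be checked against the documentation for your connector version.

```python
import json


def starting_position(offset="-1"):
    """Build the JSON string for the eventhubs.startingPosition option.

    Hypothetical helper for illustration. offset "-1" is assumed to mean
    the start of the stream and "@latest" the end of the stream.
    """
    position = {
        "offset": offset,
        "seqNo": -1,           # not used when an offset is given
        "enqueuedTime": None,  # not used when an offset is given
        "isInclusive": True,
    }
    return json.dumps(position)


# Example settings that would be merged into event_hub_config before calling .options(...)
event_hub_config_with_start = {
    "eventhubs.consumerGroup": "$Default",
    "eventhubs.startingPosition": starting_position("-1"),
}
```

In the notebook, the same key could be added to the existing dictionary with `event_hub_config["eventhubs.startingPosition"] = starting_position("@latest")` before the stream is loaded.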