@timvw
Created October 30, 2018 08:42
Get the latest message from a Kafka topic with Spark Structured Streaming (batch)
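import spark.implicits._
import org.apache.spark.sql.functions._

// Assumes `spark` (a SparkSession), `bootstrapServers` and `topic` are already defined, e.g. in spark-shell.
val ds = spark.read // batch read (read, not readStream)
  .format("kafka")
  .option("kafka.bootstrap.servers", bootstrapServers)
  .option("subscribe", topic)
  .option("startingOffsets", "earliest")
  .load()
  // Newest record first; note that this sorts every record in the topic.
  .orderBy(desc("timestamp"))
  .selectExpr("topic", "partition", "offset", "timestamp", "CAST(value AS STRING)")
// Print only the first row, i.e. the record with the latest timestamp.
ds.show(numRows = 1, truncate = false)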
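
The snippet above returns the single record with the newest timestamp across the whole topic. If "latest" should instead mean the highest offset in each partition, a window function is one way to express that. The following is a minimal sketch under stated assumptions: it runs against a small in-memory DataFrame rather than Kafka, the sample rows and the topic name `events` are hypothetical, and `timestamp` is a plain `Long` here for brevity (the Kafka source exposes it as a timestamp column).

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, desc, row_number}

// Local session for the sketch; in spark-shell a session named `spark` already exists.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("latest-per-partition")
  .getOrCreate()

// Hypothetical rows that mimic the columns selected from the Kafka source.
val sample = spark.createDataFrame(Seq(
  ("events", 0, 0L, 1000L, "a"),
  ("events", 0, 1L, 1005L, "b"),
  ("events", 1, 0L, 1002L, "c"),
  ("events", 1, 1L, 1001L, "d") // higher offset but older timestamp
)).toDF("topic", "partition", "offset", "timestamp", "value")

// Rank rows within each topic partition by descending offset and keep the top one.
val byOffset = Window.partitionBy("topic", "partition").orderBy(desc("offset"))
val latest = sample
  .withColumn("rn", row_number().over(byOffset))
  .where(col("rn") === 1)
  .drop("rn")
  .orderBy("partition")

latest.show(truncate = false)
```

With real data, `sample` would be replaced by the DataFrame loaded from Kafka. Ranking by offset rather than timestamp is a design choice: offsets always increase within a partition, whereas producer-assigned timestamps are not guaranteed to.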