Skip to content

Instantly share code, notes, and snippets.

What would you like to do?
import org.apache.spark.sql.SparkSession
object StructuredNetworkWordCount extends App {
val spark = SparkSession
.config("spark.sql.shuffle.partitions", 8)
import spark.implicits._
// Create DataFrame representing the stream of input lines from connection to localhost:9999
val lines = spark.readStream
.option("host", "localhost")
.option("port", 9999)
// Split the lines into words
val words =[String].flatMap(_.split(" "))
// Generate running word count
val wordCounts = words.groupBy("value").count()
// Start running the query that prints the running counts to the console
val query = wordCounts.writeStream
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment