This shows how to create a streaming DataFrame over atomically written CSV files.
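from pyspark.sql.types import StructType, StructField, StringType

# Assumes an existing SparkSession named `spark`. Each record lands in one string column.
file_schema = StructType([StructField("record_str", StringType())])
file_stream_df = spark.readStream.option("sep", "\n")\
    .option("header", "false").schema(file_schema)\
    .csv(input_dir)  # assumption: the original snippet is cut off here; input_dir is a placeholder for the watched directory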
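The snippet relies on files appearing in the watched directory atomically, so the stream never reads a half-written file. A minimal sketch of one way a producer might do that, assuming a POSIX-like filesystem where the staging and watched directories are on the same volume: write to a temporary file first, then rename it into place. The function name `write_lines_atomically` and both directory parameters are hypothetical and are not part of the original gist.

```python
import os
import tempfile


def write_lines_atomically(staging_dir, watched_dir, file_name, lines):
    """Write lines to a temp file in staging_dir, then rename it into watched_dir.

    Hypothetical helper: both directories are assumed to live on the same
    filesystem, otherwise the rename is not atomic (and may fail).
    """
    os.makedirs(staging_dir, exist_ok=True)
    os.makedirs(watched_dir, exist_ok=True)

    # Stage the content outside the watched directory so a reader
    # listing watched_dir cannot see the file while it is being written.
    fd, tmp_path = tempfile.mkstemp(dir=staging_dir, suffix=".tmp")
    with os.fdopen(fd, "w", newline="") as tmp_file:
        for line in lines:
            tmp_file.write(line + "\n")
        tmp_file.flush()
        os.fsync(tmp_file.fileno())  # push the data to disk before publishing

    # os.replace renames in a single step; the file appears fully written.
    final_path = os.path.join(watched_dir, file_name)
    os.replace(tmp_path, final_path)
    return final_path
```

A producer would call something like `write_lines_atomically("/data/staging", "/data/incoming", "batch_001.csv", rows)`, with the streaming reader pointed at the second directory; the paths here are placeholders.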