@bepcyc
Created June 19, 2020 15:39
Get JSON field schema in Spark
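import spark.implicits._ // already in scope in spark-shell; needed elsewhere for $"..." and .as[String]

val path = "s3://some/dir"
val df = spark.read.parquet(path)
val df2 = df.select($"value") // assumes `value` is a string column holding JSON documents
val ds = df2.as[String] // Dataset[String], one JSON document per row
val dsj = spark.read.json(ds) // scans the data to infer a schema covering all rows
val schema = dsj.schema // the inferred schema of the JSON field
println(schema.json) // schema serialized as JSON
println(schema.toDDL) // schema as a DDL string (requires Spark 2.4+)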
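
Once the schema is known, one possible next step is to parse the string column with it and to keep the DDL form so that later runs can skip inference. The following is a minimal sketch, not part of the original snippet: the local session, the sample rows, and the `parsed` and `restored` names are assumptions made for illustration, and it is meant to be run in spark-shell or as a Scala script.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.StructType

// Local session for illustration only; in spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("json-schema-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample rows standing in for the Parquet `value` column.
val df = Seq(
  """{"id": 1, "name": "a"}""",
  """{"id": 2, "tags": ["x"]}"""
).toDF("value")

// Infer the schema from the JSON strings, as in the snippet above.
val schema = spark.read.json(df.select($"value").as[String]).schema

// Parse the string column into a struct column using the inferred schema.
val parsed = df.withColumn("parsed", from_json(col("value"), schema))
parsed.printSchema()

// Rebuild the schema from its DDL string, e.g. after storing it in a config file.
val restored = StructType.fromDDL(schema.toDDL)
```

Inference reads all of the JSON strings, so on large inputs it may be worth running it once and reusing the stored DDL string afterwards.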