Skip to content

Instantly share code, notes, and snippets.

@prodeezy
Last active August 30, 2019 11:33
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save prodeezy/001cf155ff0675be7d307e9f842e1dac to your computer and use it in GitHub Desktop.
Save prodeezy/001cf155ff0675be7d307e9f842e1dac to your computer and use it in GitHub Desktop.
Test Struct based Filter on Iceberg
bash-3.2$ cat people.json
{"name":"Michael"}
{"name":"Andy", "age":30, "friends": {"Josh": 10, "Biswa": 25}, "location": { "lat": 101.123, "lon": 50.324 } }
{"name":"Justin", "age":19, "friends": {"Kannan": 75, "Sanjay": 100}, "location": { "lat": 175.926, "lon": 20.524 } }
spark-shell
import org.apache.spark.sql.types._ ;
import org.apache.iceberg.hadoop.HadoopTables;
import org.apache.iceberg.Schema;
import org.apache.iceberg.spark.SparkSchemaUtil
val schema = new StructType().add("age", IntegerType).add("name", StringType).add("friends", MapType(StringType, IntegerType)).add("location", new StructType().add("lat", DoubleType).add("lon", DoubleType))
val json = spark.read.schema(schema).json("people.json")
json.write.format("iceberg").mode("append").save("iceberg-people-struct")
val tables = new HadoopTables()
val iceSchema = SparkSchemaUtil.convert(schema)
val iceTable = tables.create(iceSchema, "iceberg-people-complex")
iceTable.schema
val iceTable = tables.create(iceSchema, "iceberg-people-struct")
val iceDf = spark.read.format("iceberg").load("iceberg-people-struct")
iceDf.createOrReplaceTempView("iceberg_people_struct")
spark.sql("select location.lat from iceberg_people_struct where location.lat = 101.123 ").show()
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment