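// Row type for the example; Integer (java.lang.Integer) keeps age and grp nullable in the schema
case class ns(name: String, age: Integer, grp: Integer)
// Build an empty DataFrame just to read off its schema (needs sqlContext.implicits._, which spark-shell pre-imports)
sc.parallelize(List[ns]()).toDF.schema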
With Spark 1.6.1 it's not possible to aggregate with collect_set or collect_list. This is going to be fixed in 2.0. See http://stackoverflow.com/questions/36963616/spark-scala-dataframes-windowing-to-have-a-wrappedarraystring-accumulative-set
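
For reference, here is a minimal sketch of what such an aggregation could look like once you are on Spark 2.x. It assumes a `SparkSession` is available and uses a plain `groupBy`, not the window-based variant from the linked question. `Person`, `CollectExample` and `collectPerGroup` are hypothetical names introduced for illustration only.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{collect_list, collect_set}

// Hypothetical row type with the same fields as the case class above
case class Person(name: String, age: Int, grp: Int)

object CollectExample {
  // Group rows by `grp`; collect_list keeps duplicates, collect_set drops them
  def collectPerGroup(df: DataFrame): DataFrame =
    df.groupBy("grp").agg(
      collect_list("name").as("names"),
      collect_set("age").as("ages"))

  def main(args: Array[String]): Unit = {
    // Assumption: Spark 2.x running locally
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("collect-example")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(Person("a", 30, 1), Person("b", 30, 1), Person("c", 41, 2)).toDF()
    collectPerGroup(df).show(false)
    spark.stop()
  }
}
```

Note that the order of elements inside the collected arrays is not guaranteed, so don't rely on it.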