@ppillay
Created June 16, 2017 04:21
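import org.apache.spark.sql.expressions.scalalang.typed.{avg => typedAvg}
import spark.implicits._ // assumes a SparkSession named `spark`; needed for .as[...] and $"..."

case class FlightDetails(year: Int, month: Int, day_of_month: Int, airline_Id: Int, origin: String, dest: Int, delay: Double)

// View the untyped DataFrame `df` as a typed Dataset of FlightDetails.
val ds = df.as[FlightDetails]

// Typed aggregator: mean of `delay`, exposed as the column "average_delay".
def averageDelay = typedAvg[FlightDetails](_.delay).name("average_delay")

// Average positive delay per (origin, dest) route, longest delays first.
val resultDS = ds.filter(_.delay > 0)
  .groupByKey(x => (x.origin, x.dest))
  .agg(averageDelay)
  .sort($"average_delay".desc)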