@onastavchuk
Created November 2, 2017 16:38
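import org.apache.spark.sql.functions.col
import org.apache.spark.sql.functions.dayofmonth
import org.apache.spark.sql.functions.month
import scala.collection.JavaConversions

// `df` and `days` are defined elsewhere; `days` presumably holds one row per distinct host per (month, day).
// Number of hosts seen on each day.
val dailyHosts = days.groupBy("month", "day").count()

// Total number of requests on each day.
val totalReqPerDay = df.select(
    month(col("time")).alias("month"),
    dayofmonth(col("time")).alias("day")
).groupBy("month", "day").count()

// Join keys as a Scala Seq, the type Dataset.join(right, usingColumns) expects.
val grouping = JavaConversions.asScalaBuffer(listOf("month", "day")).toSeq()

// Average requests per host for each day, highest average first.
totalReqPerDay.join(dailyHosts, grouping)
    .select(
        col("month"),
        col("day"),
        totalReqPerDay.col("count").divide(dailyHosts.col("count")).alias("avg")
    ).sort(col("avg").desc()).show()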