@rzykov
Last active October 6, 2021 12:48
DataAnalysisIntro3.scala
//CODE:
// OS families to keep (despite the variable name, these are operating systems, not browsers)
val interestedBrowsers = List("Android", "OS X", "iOS", "Linux", "Windows")
val osAov = dataAov
  .filter(x => interestedBrowsers.contains(x.osFamily)) // keep only the OS families of interest
  .filter(_.categoryId == 128) // keep only category 128
  .map(x => (x.osFamily, (x.aov, 1.0))) // pair each record with (amount, count) so the average can be computed
  .reduceByKey((x, y) => (x._1 + y._1, x._2 + y._2)) // sum amounts and counts per OS family
  .map { case (osFamily, (revenue, orders)) => (osFamily, revenue / orders) } // average = total amount / number of orders
  .collect()
//OUT
// The output is an array of (OS family, average purchase amount) tuples:
Array(
(OS X,4859.827586206897),
(Linux,3730.4347826086955),
(iOS,3964.6153846153848),
(Android,3670.8474576271187),
(Windows,3261.030993042378))
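// SKETCH (not part of the original gist):
// The same (sum, count) averaging can be tried without a Spark cluster, using plain Scala collections.
// Assumptions: the Order case class, the AovByOs object and the sample values are hypothetical;
// only the field names osFamily, categoryId and aov come from the snippet above.

```scala
// Hypothetical record type; field names mirror those used in the Spark snippet.
case class Order(osFamily: String, categoryId: Int, aov: Double)

object AovByOs {
  // Average `aov` per OS family, restricted to the given OS families and category.
  def averageByOs(orders: Seq[Order], osFamilies: Set[String], categoryId: Int): Map[String, Double] =
    orders
      .filter(o => osFamilies.contains(o.osFamily) && o.categoryId == categoryId)
      .groupBy(_.osFamily)
      .map { case (os, group) =>
        // Accumulate (total amount, number of orders), as reduceByKey does per key
        val (revenue, count) = group.foldLeft((0.0, 0.0)) {
          case ((sum, n), o) => (sum + o.aov, n + 1.0)
        }
        (os, revenue / count)
      }

  def main(args: Array[String]): Unit = {
    // Made-up sample data, for illustration only
    val sample = Seq(
      Order("iOS", 128, 100.0),
      Order("iOS", 128, 300.0),
      Order("Linux", 128, 50.0),
      Order("Linux", 7, 999.0),    // different category, filtered out
      Order("Symbian", 128, 10.0)  // OS family not in the list, filtered out
    )
    val result = averageByOs(sample, Set("Android", "OS X", "iOS", "Linux", "Windows"), 128)
    result.foreach { case (os, avg) => println(s"$os: $avg") }
  }
}
```

// On an RDD, reduceByKey merges the (sum, count) pairs within each partition before shuffling,
// which is why the Spark version carries the count alongside the sum instead of grouping first.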