|/ __/__ ___ _____/ /__|
|_\ \/ _ \/ _ `/ __/ '_/|
|/___/ .__/\_,_/_/ /_/\_\ version 1.1.0|
|Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_65)|
|Type in expressions to have them evaluated.|
|Type :help for more information.|
|2014-12-02 08:40:25.812 java[2479:1607] Unable to load realm mapping info from SCDynamicStore|
|14/12/02 08:40:25 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable|
|Spark context available as sc.|
|scala> val babyNamesCSV = sc.parallelize(List(("David", 6), ("Abby", 4), ("David", 5), ("Abby", 5)))|
|babyNamesCSV: org.apache.spark.rdd.RDD[(String, Int)] = ParallelCollectionRDD at parallelize at <console>:12|
|scala> babyNamesCSV.reduceByKey((n,c) => n + c).collect|
|res0: Array[(String, Int)] = Array((Abby,9), (David,11))|
|scala> babyNamesCSV.aggregateByKey(0)((k,v) => v.toInt+k, (v,k) => k+v).collect|
|res1: Array[(String, Int)] = Array((Abby,9), (David,11))|
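For anyone wondering why `aggregateByKey` takes two functions where `reduceByKey` takes one, here is a rough sketch that emulates the idea on plain Scala collections, so it runs without Spark. The helper `aggregateByKeyLocal` and the names `twoPartitions` and `sums` are made up for illustration and are not part of the Spark API. The two-element chunks are only a stand-in for how Spark might split the data across partitions.

```scala
// Local emulation of the aggregateByKey idea on plain collections (no Spark needed).
// Hypothetical helper: `partitions` stands in for the slices of an RDD.
def aggregateByKeyLocal[K, V, U](partitions: Seq[Seq[(K, V)]], zero: U)(
    seqOp: (U, V) => U,  // folds one value into a partition-local accumulator
    combOp: (U, U) => U  // merges accumulators coming from different partitions
): Map[K, U] = {
  // Step 1: inside each partition, fold the values per key, starting from `zero`.
  val perPartition: Seq[Map[K, U]] = partitions.map { part =>
    part.foldLeft(Map.empty[K, U]) { case (accs, (key, value)) =>
      accs.updated(key, seqOp(accs.getOrElse(key, zero), value))
    }
  }
  // Step 2: merge the per-partition accumulators key by key.
  perPartition.foldLeft(Map.empty[K, U]) { (merged, accs) =>
    accs.foldLeft(merged) { case (m, (key, acc)) =>
      m.updated(key, m.get(key).map(existing => combOp(existing, acc)).getOrElse(acc))
    }
  }
}

val babyNames = List(("David", 6), ("Abby", 4), ("David", 5), ("Abby", 5))
// Pretend the data sits in two partitions of two records each.
val twoPartitions = babyNames.grouped(2).toList

val sums = aggregateByKeyLocal(twoPartitions, 0)(
  (acc, value) => acc + value, // same role as (k,v) => v.toInt+k in the transcript
  (acc1, acc2) => acc1 + acc2  // same role as (v,k) => k+v in the transcript
)
```

With a zero value of `0` and addition in both positions, this gives the same totals as the `reduceByKey` call in the transcript. The two functions only need to differ when the accumulator type is not the value type.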
@MikeC711 - hopefully you already know how to do this by now. If not, here is a code snippet that computes the average per name. I came here looking for the same thing as you and found it in one of the lectures of Spark Fundamentals I from BigDataUniversity.
scala> babyNamesCSV.aggregateByKey((0, 0))(
         (acc, value) => (acc._1 + value, acc._2 + 1),
         (acc1, acc2) => (acc1._1 + acc2._1, acc1._2 + acc2._2)).
         mapValues(sumCount => 1.0 * sumCount._1 / sumCount._2).
         collect
res1: Array[(String, Double)] = Array((Abby,4.5), (David,5.5))
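To make the roles of the two lambdas easier to see, here is a small sketch that pulls them out into named functions and checks them on plain Scala collections, without Spark. The names `SumCount`, `addValue`, `mergeAccs`, `davidMerged` and `averages` are my own and purely illustrative. The `groupBy` is only a local stand-in for what Spark does per key.

```scala
// (sum, count) accumulator for the per-key average; the alias name is illustrative.
type SumCount = (Int, Int)

// seqOp: fold one value into the accumulator within a partition.
val addValue: (SumCount, Int) => SumCount =
  (acc, value) => (acc._1 + value, acc._2 + 1)

// combOp: merge two accumulators built on different partitions.
val mergeAccs: (SumCount, SumCount) => SumCount =
  (acc1, acc2) => (acc1._1 + acc2._1, acc1._2 + acc2._2)

val babyNames = List(("David", 6), ("Abby", 4), ("David", 5), ("Abby", 5))

// Two "David" records seen on separate partitions, then merged.
val davidMerged: SumCount = mergeAccs(addValue((0, 0), 6), addValue((0, 0), 5))

// Plain-collections stand-in for aggregateByKey followed by mapValues.
val averages: Map[String, Double] =
  babyNames.groupBy(_._1).map { case (name, pairs) =>
    val (sum, count) = pairs.map(_._2).foldLeft((0, 0))(addValue)
    name -> (sum.toDouble / count)
  }
```

In the Spark shell the same two functions should slot straight into the call above, as in `babyNamesCSV.aggregateByKey((0, 0))(addValue, mergeAccs)`.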
NB: This is my $0.02. I've written a fair amount of coursework over the years & I'm just trying to help here.
You might consider using better naming to more clearly illustrate things to newbies. See my fork at https://gist.github.com/matthewadams/b107599a08719b166400; in particular, https://gist.github.com/matthewadams/b107599a08719b166400/revisions. Elaboration follows.
Lastly, the name
I want to count the number of products under each category id, and the price under each category. When I try to print the result with countandtotal.take(2).foreach(println), it throws a NumberFormatException. I even changed the initial value from 0.0 to 0.0f. Please help.
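Without seeing the code this is only a guess. A NumberFormatException normally comes from a String-to-number conversion such as `toInt` or `toFloat` hitting a field that is not numeric (a header row or a blank field, say), not from the zero value. Spark transformations are lazy, so the error only shows up once an action like `take` runs. Below is a rough sketch of defensive parsing plus a (count, total) accumulator, on plain Scala collections so it runs without Spark. The sample lines, the two-column layout and all names (`parseLine`, `addPrice`, `mergeAccs`, `countAndTotal`) are assumptions for illustration, not your actual data or code.

```scala
import scala.util.Try

// Hypothetical input: "categoryId,price" lines, with a header and one bad row.
val lines = List("category_id,price", "2,10.5", "2,4.25", "3,abc", "3,10.0")

// Parse defensively: rows with non-numeric fields are dropped instead of throwing.
def parseLine(line: String): Option[(Int, Float)] = {
  val fields = line.split(",")
  if (fields.length != 2) None
  else Try((fields(0).trim.toInt, fields(1).trim.toFloat)).toOption
}

val parsed: List[(Int, Float)] = lines.flatMap(line => parseLine(line))

// seqOp: add one price to a (count, total) accumulator.
val addPrice: ((Int, Float), Float) => (Int, Float) =
  (acc, price) => (acc._1 + 1, acc._2 + price)

// combOp: merge two (count, total) accumulators.
val mergeAccs: ((Int, Float), (Int, Float)) => (Int, Float) =
  (a, b) => (a._1 + b._1, a._2 + b._2)

// Plain-collections stand-in for aggregateByKey((0, 0.0f))(addPrice, mergeAccs).
val countAndTotal: Map[Int, (Int, Float)] =
  parsed.groupBy(_._1).map { case (category, rows) =>
    category -> rows.map(_._2).foldLeft((0, 0.0f))(addPrice)
  }
```

If the real data has a header line or rows with missing prices, filtering them out this way (or logging them instead of dropping them) before the aggregation is one way to get past the exception.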