Skip to content

Instantly share code, notes, and snippets.

@etorreborre
Created November 10, 2011 00:14
Show Gist options
  • Save etorreborre/1353640 to your computer and use it in GitHub Desktop.
Save etorreborre/1353640 to your computer and use it in GitHub Desktop.
Using different Monoids to extract the max value / max cumulated value on some records, for a given key
/**
* The methods below can be used to extract significant values in log records. For example, we might want to get:
*
* - the maximum execution time per method call
* - the maximum execution time per server
* - the maximum memory consumption per method call
* - the maximum memory consumption per server
*
* - the maximum cumulated execution time per method call
* - the maximum cumulated execution time per server
* - the maximum cumulated memory consumption per method call
* - the maximum cumulated memory consumption per server
*
* To do that, we just need to provide:
*
* - a key: T => K (per method call, per server)
* - a Monoid[T] (to get a measure on elements of type T: execution time, memory,...)
* - an Ordering[T] (to get the max, but it could be the min)
*/
/**
* @return a seq of records where the measure, provided on the Monoid on T is cumulated for each key K
*/
def reduceByKey[T : Monoid, K](records: Seq[T], key: T => K): Seq[T] =
records.foldMap(r => Map(key(r) -> r)).map(_._2)
/**
* @return a seq of records where we keep only the max for each key
*
* this function does a reduceByKey but using a "max Monoid" as a way to reduce the value
*/
def maxByKey[T : Monoid : Ordered , K](records: Seq[T], key: T => K): Seq[T] =
reduceByKey(records, key)(maxIsMonoid)
/**
* @return a seq of records where we keep only the max cumulated value for each key
*
* this function does a first reduceByKey to accumulate values for a given key, according to the Monoid for T
* then it uses reduceByKey with a "max Monoid" to keep only the maximums
*/
def reduceMaxByKey[T : Monoid : Ordering, K](records: Seq[T], key: T => K): Seq[T] =
reduceByKey(reduceByKey(records, key), key)(maxIsMonoid)
/**
* SUPPORT METHODS TO GET A MONOID FROM AN ORDERING
*/
/**
* if there's an Ordering defined on T, we can derive a Semigroup for T elements so that
* "adding" 2 elements only keeps the maximum
*/
def maxIsSemigroup[T : Ordering] = new Semigroup[T] {
def append(t1: T, t2 : =>T): T = implicitly[Ordering[T]].max(t1, t2)
}
/**
* @return a Monoid from a Zero and a Semigroup
*/
def zeroAndSemigroupIsMonoid[T : Semigroup](z: T) = new Monoid[T] {
val zero = z
def append(t1: T, t2 : =>T): T = implicitly[Semigroup[T]].append(t1, t2)
}
/**
* provided that we have an Ordering on elements of type T, and if T has a Zero, then we can
* define a Monoid on elements of type T so that only the max element is kept after an "addition"
*/
def maxIsMonoid[T : Zero : Ordering] = {
zeroAndSemigroupIsMonoid(implicitly[Zero[T]].zero)(maxIsSemigroup(implicitly[Ordering[T]]))
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment