As described in devdocs/expr.md, metrictank resets the consolidation function when it passes a "special" function: a function that changes the nature of the data on a fundamental level, such as summarize, perSecond, derivative, or integral. We explain the reasoning for this behavior with a few examples.

example 1

summarize(consolidateBy(statsd.count,"sum"),"24h","sum") 
  • the inputs to summarize(), if we need to consolidate them, can be summed, because we'll sum the data anyway
  • but if we want to see daily sums, and maxDataPoints dictates that we must consolidate at runtime, we don't want the points summed together into buckets of some arbitrary N-day size. That would lead to inaccurate results that jump in value as you zoom in and out. Instead, the results should be averaged, so that you still see daily sums as requested.
  • thus the consolidateBy defined at the input side of summarize() should not have an effect at the output side of summarize()
  • you can also turn this around: put consolidateBy(...,"avg") around the summarize call, because the output of summarize should be averaged, but the input should not be. So a consolidateBy() setting should not propagate through a "special" function in either direction (see the sketch below).
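
To make the consolidation issue concrete, here is a sketch with made-up numbers: suppose summarize() produced four daily sums, and maxDataPoints forces runtime consolidation down to two points.

daily sums from summarize():        100  120   90  110
runtime consolidation with "sum":     220       200     <- values inflate as you zoom out
runtime consolidation with "avg":     110       100     <- each point still reads as a daily sum

And the turned-around form from the last bullet looks like this:

consolidateBy(summarize(statsd.count,"24h","sum"),"avg")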

example 2

A counter should ideally be rolled up, and normalized, with last, because that's how you get the most up-to-date datapoints. (Alternatively, max would result in the same behavior.) E.g. you would use perSecond(consolidateBy(counter,"max")) or perSecond(consolidateBy(counter,"last")). However, if the consolidateBy setting leaks to the other side (the output side) of the perSecond() call and gets applied during runtime consolidation (to honor maxDataPoints), then you get the max rate, or the last rate, within each timeframe that gets consolidated. Neither of these is likely what you wanted to see; avg is a much better default.
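
To illustrate with made-up numbers: suppose perSecond() yields four rates, and maxDataPoints forces runtime consolidation down to two points.

rates from perSecond():             1.0  5.0  1.2  0.9
runtime consolidation with "max":      5.0       1.2    <- a brief spike is presented as the norm
runtime consolidation with "avg":      3.0       1.05   <- preserves the average rate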

Likewise, perhaps you want to compare the minima of rates to the maxima of rates, so you plot both of these:

consolidateBy(perSecond(counter),"min")
consolidateBy(perSecond(counter),"max")

If these min/max settings get applied to the normalization and reading of the input data, they will effectively select counter values at different points in time, and the two series won't really line up (see the sketch below).
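
A hypothetical sketch, assuming a counter sampled every 10s that must be normalized to 20s buckets (on an increasing counter, min picks the first value of each bucket and max picks the last):

raw counter (10s):        100  110  125  130
read with "min" (20s):      100       125
read with "max" (20s):           110       130

The "min" series derives its rates from readings at the start of each bucket, and the "max" series from readings at the end, so the two perSecond() results describe time windows shifted by roughly 10s rather than true minima and maxima of the same rate.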
