@bitprophet
Last active December 16, 2015 20:39
Descartes and external data sources

Cluster graphing

  • It's frequently useful to graph a metric over a cluster of hosts, e.g. "show me the number of requests/s being handled by all of my load balancers".
  • Doing this in vanilla Graphite is easy - it honors both glob expressions (lb*) and brace expressions ({a,b,c,d}).
  • But how do we generate these for clusters whose hostnames don't glob well, and/or whose members change over time?
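As a rough sketch of the brace-expression form above, a target for an arbitrary host list can be assembled mechanically. The helper name `cluster_target` and the `servers.<host>.<metric>` path layout are assumptions for illustration, not part of Graphite or Descartes.

```python
def cluster_target(hosts, metric, prefix="servers"):
    """Build one Graphite target string covering every host in `hosts`.

    Hypothetical helper: the prefix.host.metric layout is an assumption.
    """
    # De-duplicate and sort so the same cluster always yields the same string.
    unique = sorted(set(hosts))
    if not unique:
        raise ValueError("cluster has no hosts")
    if len(unique) == 1:
        # A single host needs no brace expression.
        return prefix + "." + unique[0] + "." + metric
    return prefix + ".{" + ",".join(unique) + "}." + metric


print(cluster_target(["lb1", "lb2", "app04"], "varnish.requests"))
# prints: servers.{app04,lb1,lb2}.varnish.requests
```

Building the string is the easy part; the open question is where the host list comes from and how it stays current.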

Descartes

  • Generally flippin' awesome.
  • Has a database of Metrics, Graphs composing 1+ Metrics, and Dashboards composing 1+ Graphs.
  • Metrics are statically defined Graphite target strings plus other metadata.
    • For example, servers.s0200.jvm.memory_used, servers.{lb1,lb2,lb3}.varnish.requests or servers.lb*.varnish.requests.
  • Metric creation is straightforward: via Web or API, you create new Metric objects with new (static) target string values.

Truth databases

  • Think Chef Server, Hiera, Clusto, ZooKeeper or any other dynamic database concerned with collections of servers.
  • Responsible for organizing servers into logical groups like "the production load balancers" or "HBase cluster #5".
  • Frequently used to drive configuration management (e.g. generation of monitoring config files) and orchestration tools (e.g. "go Chef the Web app worker cluster").

The problem

  • Tools like Descartes work best with "vanilla" Graphite targets as mentioned above - leveraging globbing or brace expressions to arrive at a static target string.
    • If your servers' hostnames line up this way, you're good - think of a cluster-oriented metric, jot down the metric + glob-expr you need, and you're done.
  • When one's environment requires the use of truth databases, hand-entered metric paths no longer work. This happens, for example, when you have lots of multi-tenant servers whose hostnames (which are what metric paths capture) can't reflect everything running on them, or when you have legacy naming concerns.
    • E.g. if your Varnish pool is no longer expressible as lb* but needs to be {lb1,lb2,app04,app15,s0233}, multiply that by all your clusters and you have a huge amount of extra work when manually generating Graphite targets.
    • Cluster memberships may also change frequently - having to redo all graphs related to your LB cluster when you drop or add a node is simply untenable.
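To make the maintenance cost concrete, here is a small sketch that compares the hosts baked into a hand-written target against a cluster's current membership. The function names are hypothetical, and the current host list is hard-coded where a real setup would query the truth DB.

```python
import re


def hosts_in_target(target):
    """Extract host names from the first {a,b,c} group in a target string."""
    match = re.search(r"\{([^{}]*)\}", target)
    if match is None:
        return set()
    return {host for host in match.group(1).split(",") if host}


def membership_drift(target, current_hosts):
    """Return (missing, stale) host sets for a hand-written target.

    missing: hosts in the cluster that the target does not graph.
    stale:   hosts in the target that have left the cluster.
    """
    baked_in = hosts_in_target(target)
    current = set(current_hosts)
    return current - baked_in, baked_in - current


target = "servers.{lb1,lb2,app04,app15,s0233}.varnish.requests"
# Assumed current membership: s0233 was dropped and lb9 was added.
missing, stale = membership_drift(target, ["lb1", "lb2", "app04", "app15", "lb9"])
print(sorted(missing), sorted(stale))
# prints: ['lb9'] ['s0233']
```

Every graph that embeds the old host list has the same drift, which is why hand-maintained targets don't scale here.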

Potential solutions

  • Use a statically configured tool (e.g. GDash), leveraging your truth DB + config management to construct the right target paths.
    • E.g. ERb templates that look like servers.{<%= lbs.join(',') %>}.varnish.requests, where lbs comes from a truth DB query.
    • This is sub-optimal: you're not using Descartes' existing/planned features :(
    • Engineers must round-trip all dashboarding changes through config management, etc.
  • Run a sync script periodically which polls your truth DB, then creates or edits Descartes Metric, Graph and Dash objects via Descartes' API.
    • Within the realm of possibility, but sync is generally difficult, messy and error-prone.
    • Users may try editing objects in Descartes between syncs, only to see their changes overwritten at sync time.
    • Background sync jobs add operational complexity, can fail without causing obvious errors (silent failure - bad!), etc.
  • Extend Descartes so it understands the idea of splicing external query results into target paths at runtime.
    • E.g. configure a truth DB query endpoint, parameterizable by cluster name or other search token.
    • Allow interpolation syntax within Metric target fields, e.g. servers.<clustername>.varnish.requests.
      • Or servers.*.varnish.requests with an additional per-Metric config option saying to replace the Nth asterisk with a query for clustername. (The other method feels cleaner/more usable, but this is similar to how Graphite's own aliasByNode() addresses path segments by position.)
    • Result is that, on request, display of that Metric hits up the truth DB and results in a final Graphite target of servers.{a,b,c,d}.varnish.requests.
      • Could be cached, within reason.
  • Could maybe even extend further to allow dynamically generated parts of the URI tree?
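The third option (splicing query results into target paths at request time, with caching) could look roughly like the sketch below. Everything here is an assumption for illustration: the `<name>` placeholder syntax, the `TargetResolver` class, and the in-memory dict standing in for a truth DB query endpoint. None of it is Descartes' actual API.

```python
import re
import time

# Assumed placeholder syntax: <clustername> inside a Metric target string.
PLACEHOLDER = re.compile(r"<([A-Za-z0-9_-]+)>")


class TargetResolver:
    """Expand <cluster> placeholders into Graphite brace expressions."""

    def __init__(self, lookup, ttl_seconds=60.0, clock=time.monotonic):
        self._lookup = lookup    # callable: cluster name -> iterable of hosts
        self._ttl = ttl_seconds  # how long a lookup result is reused
        self._clock = clock
        self._cache = {}         # cluster name -> (expiry time, expression)

    def _members(self, cluster):
        now = self._clock()
        cached = self._cache.get(cluster)
        if cached is not None and cached[0] > now:
            return cached[1]
        hosts = sorted(set(self._lookup(cluster)))
        if not hosts:
            # Fail loudly instead of emitting an empty {} target.
            raise LookupError("no hosts found for cluster %r" % cluster)
        if len(hosts) == 1:
            expr = hosts[0]
        else:
            expr = "{" + ",".join(hosts) + "}"
        self._cache[cluster] = (now + self._ttl, expr)
        return expr

    def resolve(self, template):
        """Return the final Graphite target for a templated target string."""
        return PLACEHOLDER.sub(lambda m: self._members(m.group(1)), template)


# Stand-in for a real truth DB query, keyed by cluster name.
FAKE_TRUTH_DB = {"varnish": ["lb1", "lb2", "app04"]}
resolver = TargetResolver(lambda name: FAKE_TRUTH_DB.get(name, []))
print(resolver.resolve("servers.<varnish>.varnish.requests"))
# prints: servers.{app04,lb1,lb2}.varnish.requests
```

The cache keeps graph rendering from hitting the truth DB on every request, at the cost of membership changes taking up to one TTL to show up. Raising on an empty cluster is a deliberate choice so a failed lookup can't silently produce an empty graph.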
@obfuscurity

FWIW Descartes handles "hand-entered metric paths" just fine as long as they're imported via Graphite URL. It accepts any valid Graphite URL.
