@alexdean
Created October 13, 2010 15:51
These are some user stories I scraped together to help guide the rewrite of ganglia's web application.
Overview
========
Most of the time, I want to just quickly have a look at the overall status of our systems, and verify that everything is
running smoothly. I tend to browse around the app, not necessarily looking for anything specific. I just want to be able
to see that everything looks 'normal'. If we receive a problem report (usually something vague like 'the web site is slow'),
ganglia is often the first place I look for any sign of trouble.
Here, I find the most value in being able to visualize lots of data at once, and in being able to quickly navigate from
one view to another. I'm most interested in what's happening right now, and in the very recent past.
Investigation
=============
When we know something is wrong, it's often the case that several systems will show unusual metrics. It can be unclear which
is a cause and which is an effect. In these cases, I use ganglia to start building a timeline of what started going wrong
when. This often helps point us in the direction of a root cause.
In this case, I am much more focused on making comparisons between metrics (often comparing one cluster to another) to see how
various issues correlate in time.
In this mode, being able to quickly compare things is very valuable. I tend to have several browser windows open at once,
all watching different clusters or nodes. I sometimes want to look several hours or days into the past, to see when an issue
may have started to crop up. Comparing recent data to longer timeframes is also helpful, to get a better sense of what a
'normal' value is for a given metric. When a picture starts to come together, it's helpful to take screenshots to communicate
to others what I think is going on.
Integration
===========
Many people want ganglia to expose its data to external applications through standard web interfaces such as REST. That data
could then be integrated into custom dashboards or other monitoring systems like Nagios.
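As a rough illustration of what such an integration layer might do, here is a minimal sketch that turns a ganglia-style XML report into JSON that a REST endpoint could serve. The sample XML is made up and only approximates the shape of what gmond reports, and the function names (``metrics_by_host``, ``to_json``) are hypothetical. None of this is part of ganglia itself.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample, loosely shaped like a gmond XML report.
# Cluster, host, and metric values are invented for illustration.
SAMPLE_XML = """<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
  <CLUSTER NAME="web" LOCALTIME="1286985060">
    <HOST NAME="web01" IP="10.0.0.1" REPORTED="1286985050">
      <METRIC NAME="load_one" VAL="0.42" TYPE="float" UNITS=""/>
      <METRIC NAME="cpu_user" VAL="12.5" TYPE="float" UNITS="%"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""


def metrics_by_host(xml_text):
    """Return {cluster: {host: {metric_name: value_string}}} from the XML."""
    root = ET.fromstring(xml_text)
    result = {}
    for cluster in root.iter("CLUSTER"):
        hosts = result.setdefault(cluster.get("NAME"), {})
        for host in cluster.iter("HOST"):
            # Values are kept as strings, exactly as they appear in the XML.
            hosts[host.get("NAME")] = {
                metric.get("NAME"): metric.get("VAL")
                for metric in host.iter("METRIC")
            }
    return result


def to_json(xml_text):
    """Serialize the nested metric mapping as JSON for an HTTP response body."""
    return json.dumps(metrics_by_host(xml_text), sort_keys=True)
```

A dashboard or a Nagios check could then request this JSON over HTTP and read a single value, for example ``data["web"]["web01"]["load_one"]``, without having to parse the XML itself.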