The only open source visualization project that comes to my mind right now is Twitter Ambrose. You might want to have a look at Ambrose first. It supports the following features in its web UI:
- A table view of all the associated jobs, along with their current state
- Chord and graph diagrams to visualize job dependencies and current state
- An overall script progress bar
Apart from that, my personal experience has been with offerings from commercial vendors. To name but two of them: Cloudera Manager and MapR's dashboard.
Both products come with an API that allows you to extend them and integrate them with your own Ops tool set. Cloudera Manager requires an evaluation license whereas MapR's Dashboard is available in the free M3 distribution if you want to give it a spin. As usual there are pros and cons for each of them.
That said, you can also configure stock Hadoop to send its metrics to a monitoring tool such as Ganglia (see the live demo at the UC Berkeley Grid). Basically, you just dump metrics into Ganglia and the latter takes care of visualizing/plotting the various metrics. There are several online guides available that describe how to configure Ganglia for a small Hadoop cluster. If you are running Hadoop 2.x, have a look at What is Hadoop Metrics2 for how the metrics system in next-gen Hadoop works in general.
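As a rough sketch, on Hadoop 2.x the metrics2 system is wired up to Ganglia through `hadoop-metrics2.properties` on each node; the host name and daemon list below are placeholders you would adapt to your cluster:

```properties
# hadoop-metrics2.properties -- ship metrics to Ganglia via the metrics2 system.
# "gmetad-host" is a placeholder for your Ganglia collector (gmond/gmetad) host.
*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
*.sink.ganglia.period=10

# One line per daemon whose metrics you want plotted:
namenode.sink.ganglia.servers=gmetad-host:8649
datanode.sink.ganglia.servers=gmetad-host:8649
resourcemanager.sink.ganglia.servers=gmetad-host:8649
nodemanager.sink.ganglia.servers=gmetad-host:8649
```

Note that `GangliaSink31` targets the Ganglia 3.1 wire format; older Ganglia versions need a different sink class.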
Finally, albeit a bit unrelated to your direct question, you can also write custom monitors against Hadoop's Java API. It is usually straightforward to write these custom monitors in a way that is compatible with other Ops infrastructure tools such as Nagios. For instance, one of our custom monitors connects to the JobTracker to detect any MapReduce jobs that have been running for longer than 24 hours (which in 99% of cases is a tell-tale sign that the job is broken one way or another). Depending on the tool you dump the metrics into, you will get visualizations/graphs for free (cf. the Ganglia example above).
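To give you an idea, a minimal version of such a long-running-job monitor might look like the sketch below, using the old `org.apache.hadoop.mapred` API (which is what talks to a JobTracker). The host name and the exact alerting output are assumptions for illustration, not our production code:

```java
import java.io.IOException;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobStatus;

public class LongRunningJobMonitor {

    // 24 hours, expressed in milliseconds.
    static final long THRESHOLD_MS = 24L * 60 * 60 * 1000;

    // Pure threshold check, kept separate so it is easy to unit-test.
    static boolean isLongRunning(long startTimeMs, long nowMs) {
        return nowMs - startTimeMs > THRESHOLD_MS;
    }

    public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf();
        // "jobtracker.example.com:9001" is a placeholder for your JobTracker address.
        conf.set("mapred.job.tracker", "jobtracker.example.com:9001");
        JobClient client = new JobClient(conf);

        long now = System.currentTimeMillis();
        // jobsToComplete() returns the jobs that are still pending or running.
        for (JobStatus job : client.jobsToComplete()) {
            if (isLongRunning(job.getStartTime(), now)) {
                // Print a Nagios-style warning line; a real plugin would also
                // exit with the matching Nagios status code (1 = WARNING).
                System.out.println("WARNING: job " + job.getJobID()
                        + " has been running for more than 24 hours");
            }
        }
    }
}
```

The threshold check is deliberately factored out of the JobTracker plumbing, so the alerting logic can be tested without a live cluster.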
Hope this helps, Michael