Riemann is a quite simple but powerful monitoring system. In its core it is an event processing framework allowing a user to handle an incoming event stream. Events are small objects that originate from a host, that has a service and metric value associated with it. Events may also have tags and a free text description. This simple data structure enables an application transmit both exception tracebacks and application metric data using the same mechanism.
Riemann has a few problems though. First, it is a single-machine application, only allowing vertical scaling. Secondly, there's no redundancy or availability solution. In this post we'll try to address the first issue.