What Is Observability?
Based on my reading and listening, observability is the ability to answer a wide range of questions about a system's behavior based on previously captured data. Ideally, it lets you see how a system is performing for various use cases and users in real time, and watch how that changes as new code goes into production.
Observability folks like to describe it as "testing in production", adding that everyone does this, like it or not, because only in production can we see the kinds of edge cases that arise from real data, real traffic, real network conditions, etc. Observability's goal is that when we test in production, we get much more detailed information than "it works" or "it doesn't work", and can thus find and fix problems much more easily.
For example, a user emails and says "doing X in the system is slow for me this morning." With poor observability, you might be able to look at the system's overall latency, or the overall CPU load of the se