Redesign of logging and metrics infrastructure
- Lack of centralized tools for monitoring and alerting
- High cost (BigQuery) and slow performance (CloudWatch/Stackdriver) of existing tools
- Lack of detailed instrumentation required by analysts
- Complex architecture for collecting and processing logs
The top three items increase the time it takes to investigate problems and respond to urgent alerts.