Troubleshooting Intro

The incident management steps I keep in mind when I am on call and get an alert are:

- Verify the issue
- Triage
- Communicate and escalate if needed
- Mitigate
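
The steps above can be sketched as a small ordered checklist, which is handy if you want a script or bot to remind you what comes next during an alert. This is only an illustrative sketch: the names `IncidentStep` and `next_step` are hypothetical and not part of any tool mentioned here, and it assumes the steps are worked through in the order listed.

```python
from enum import Enum


class IncidentStep(Enum):
    """The four on-call steps, in the order they are listed (assumed order)."""
    VERIFY = "Verify the issue"
    TRIAGE = "Triage"
    COMMUNICATE = "Communicate and escalate if needed"
    MITIGATE = "Mitigate"


def next_step(completed):
    """Return the first step not yet in `completed`, or None when all are done.

    `completed` is any collection of IncidentStep members (hypothetical helper).
    """
    # Enum members iterate in definition order, so this walks the checklist top to bottom.
    for step in IncidentStep:
        if step not in completed:
            return step
    return None


if __name__ == "__main__":
    done = {IncidentStep.VERIFY}
    upcoming = next_step(done)
    print(upcoming.value if upcoming else "All steps done")
```

In practice the steps often overlap (for example, communicating while mitigating), so treat the strict ordering here as a simplification.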