@RafaelChefe
Created September 5, 2018 00:02
the outage document

Check ec2 instance monitoring for any anomalies

  • log in to the aws console: https://flippa.signin.aws.amazon.com/console

  • go to the ec2 section

  • on the ec2 dashboard, click instances

  • start by checking key apps, including proxy, rails, search, mfe and workers. Use these search terms to find the desired instances:

    • proxy-production-v1 (proxy)
    • marketplace-webapp-production-v3 (flippa rails)
    • search-production-v17 (search)
    • frontend-marketplace-production-v10 (marketplace frontend)
    • marketplace-worker-production-v3 (flippa rails workers)
  • click the monitoring tab

  • look for steep increases in cpu, network in and other relevant metrics
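
The same instance lookup can be sketched from the command line. This is a rough sketch, not an established part of this runbook: it assumes the AWS CLI is installed and configured, and that each instance carries a Name tag containing the search term listed above. The function name describe_instances_command is hypothetical.

```python
import shlex

# Search terms copied from the list above; the keys are just readable labels.
SEARCH_TERMS = {
    "proxy": "proxy-production-v1",
    "flippa rails": "marketplace-webapp-production-v3",
    "search": "search-production-v17",
    "marketplace frontend": "frontend-marketplace-production-v10",
    "flippa rails workers": "marketplace-worker-production-v3",
}


def describe_instances_command(search_term):
    """Build an `aws ec2 describe-instances` call for one search term.

    Assumption: instances have a Name tag that contains the search term.
    """
    return [
        "aws", "ec2", "describe-instances",
        "--filters",
        f"Name=tag:Name,Values=*{search_term}*",
        "Name=instance-state-name,Values=running",
        # Only print the instance ids of the matches.
        "--query", "Reservations[].Instances[].InstanceId",
        "--output", "text",
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing touches AWS here.
    for app, term in SEARCH_TERMS.items():
        print(f"# {app}")
        print(shlex.join(describe_instances_command(term)))
```

The printed commands can be pasted into a shell; the instance ids they return can then be looked up on the monitoring tab.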

Check kibana proxy logs for any significant spike in traffic or any weird requests

  • log in to kibana: credentials can be found in 1password as flippa elasticsearch
  • select cwl-proxy-production-* on the dropdown to the left
  • select a period of time on the top bar
  • look for spikes on the chart; click on a bar to narrow the time period
  • look at the full requests at the bottom of the page
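
If the kibana UI is slow during an outage, a similar per-minute request count could be requested from elasticsearch directly. The sketch below only builds the query body. It assumes the log documents have an @timestamp field, which is not confirmed by this document, and the function name is hypothetical. Elasticsearch 7.2 and later name the bucket size fixed_interval, while earlier releases call it interval, so the key is a parameter.

```python
import json


def traffic_histogram_query(start, end, bucket="1m", interval_key="fixed_interval"):
    """Build a search body that counts log entries per time bucket.

    start and end are ISO 8601 timestamps. Assumption: the documents
    carry an "@timestamp" field.
    """
    return {
        "size": 0,  # only the bucket counts are needed, not the documents
        "query": {"range": {"@timestamp": {"gte": start, "lte": end}}},
        "aggs": {
            "requests_over_time": {
                "date_histogram": {
                    "field": "@timestamp",
                    # Pass interval_key="interval" for older clusters.
                    interval_key: bucket,
                }
            }
        },
    }


if __name__ == "__main__":
    body = traffic_histogram_query("2018-09-05T00:00:00Z", "2018-09-05T01:00:00Z")
    print(json.dumps(body, indent=2))
```

The body would be sent to the _search endpoint of the cwl-proxy-production-* index pattern; buckets with unusually high counts correspond to the tall bars on the kibana chart.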

Check bugsnag for any spike in specific errors

  • log in to bugsnag: https://app.bugsnag.com
  • on the top left corner, choose the project you want to check
  • click on Timeline
  • on stage, select production
  • click and drag on the chart to specify a period of time
  • at the bottom there will be a list of all the events that occurred in the selected period

Check cloudflare for DDoS attacks

  • log in to cloudflare: https://dash.cloudflare.com/login
  • click on flippa.com
  • click on analytics
  • check the chart for large spikes in traffic
  • click threats
  • check for any spikes in threats
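
What counts as a "large spike" is a judgment call. One possible rule of thumb, sketched below, is to flag any time bucket whose request count is several times the median of the period. The factor of 3 and the function name are assumptions chosen for illustration, not values from this runbook.

```python
from statistics import median


def find_spikes(counts, factor=3.0):
    """Return the indexes of buckets whose count exceeds factor x the median."""
    if not counts:
        return []
    baseline = median(counts)
    return [i for i, count in enumerate(counts) if count > factor * baseline]


if __name__ == "__main__":
    # Hypothetical per-minute request counts; the fifth bucket stands out.
    print(find_spikes([100, 110, 95, 105, 900, 100]))  # prints [4]
```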

If you find certain IPs are being malicious or DDoSing us, block them:

  • click on firewall at the top of the page
  • scroll down to the Access Rules section, just under Challenge Passage
  • click IP Firewall
  • enter the offending IP address in the text field
  • select "block"
  • select "all organization websites"
  • optionally, add a note with the reason for the block
  • click "Add"
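
When several IPs need blocking, the steps above could be scripted against the Cloudflare API instead of the dashboard. The sketch below only builds the request and does not send it. It assumes the account-level IP Access Rules endpoint and an API token with firewall permissions; account_id, api_token and the function name are hypothetical placeholders, and the endpoint should be checked against the current Cloudflare API documentation before use.

```python
import ipaddress
import json
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4"


def build_block_request(account_id, api_token, ip, note=""):
    """Build (but do not send) a request that adds a "block" access rule."""
    address = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    # Cloudflare distinguishes IPv4 ("ip") and IPv6 ("ip6") targets.
    target = "ip6" if address.version == 6 else "ip"
    payload = {
        "mode": "block",
        "configuration": {"target": target, "value": str(address)},
        "notes": note,  # the optional reason for the block
    }
    return urllib.request.Request(
        f"{API_BASE}/accounts/{account_id}/firewall/access_rules/rules",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # 203.0.113.7 is a documentation-only address used as an example.
    request = build_block_request("ACCOUNT_ID", "API_TOKEN", "203.0.113.7", "outage")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the returned request to urllib.request.urlopen would actually create the rule, so it is worth printing and reviewing it first.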

Check Sidekiq

  • check the queue in the admin page

Check marvel dashboard for search issues

If you have identified an issue with elasticsearch ...add information about marvel dashboard (on repo readme)

Cycling instances

If the issue doesn't appear to be related to traffic or malicious requests, try cycling instances ....include info about asg

The login

https://flippa.signin.aws.amazon.com/console


This should live in a place that's easy to access when outages are happening (e.g. include a link in opsgenie notifications)