dcramer/gist:4773531

## gistfile1.txt
Track last notified
===================

Alert model
-> binds to many users
- alert_date

Alert on Threshold
==================

Signifies an increase of N% from a non-zero (>0) value.

If an alert is generated, it then sets a key signifying that a potential alert has happened.

- That key expires in 60 seconds???
- The key is bound to the alert params (e.g. team-wide alert) and the normalized minute
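
A minimal sketch of how such a key could be built. The key layout, the
``potential_alert_key`` and ``normalize_minute`` names, and the ``scope`` /
``scope_id`` parameters are all hypothetical; the notes only say the key is
bound to the alert params and the normalized minute.

```python
from datetime import datetime, timezone


def normalize_minute(ts: datetime) -> int:
    """Truncate a timestamp to the start of its minute, as epoch seconds."""
    epoch = int(ts.timestamp())
    return epoch - (epoch % 60)


def potential_alert_key(scope: str, scope_id: int, ts: datetime) -> str:
    # Hypothetical layout: alert scope (e.g. "team"), its id, then the
    # normalized minute, so all calls within one minute share a key.
    return f"alert:potential:{scope}:{scope_id}:{normalize_minute(ts)}"
```

Because every call within the same minute maps to the same key, a single get
is enough to answer "have we tried to alert" for that minute.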

When a potential alert happens, we check for a baseline (15 minutes of history?) to ensure that
the anomaly really is something meaningful. The primary purpose is to look for gaps here.

e.g. if our 15 minute interval is [..., 100, 100, 100, 0, 100] we don't want to alert when it
went from 0 to 100 since it was sitting around 100 for most of the time.
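
One way the baseline check could look, as a sketch only. Using the median of
the history as the baseline is an assumption (the notes do not say how the
baseline is computed); it is chosen here because a single gap does not move
it. ``is_meaningful_spike`` and ``threshold_pct`` are hypothetical names.

```python
from statistics import median


def is_meaningful_spike(history, current, threshold_pct):
    """Return True if `current` is at least `threshold_pct` percent above the baseline."""
    if not history:
        return False  # nothing to compare against
    # Assumption: the median stands in for the baseline, so an isolated 0
    # in an otherwise steady series is ignored.
    baseline = median(history)
    if baseline <= 0:
        return False  # the rule is "increase from >0"
    increase_pct = (current - baseline) / baseline * 100
    return increase_pct >= threshold_pct
```

With the example above, ``is_meaningful_spike([100, 100, 100, 0], 100, 50)``
compares 100 against a baseline of 100 rather than against the 0 gap, so no
alert is raised.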

Costs:

- No additional writes on incr calls
- Two additional gets per incr (one for the "have we tried to alert" and one for "previous value")
- Entire range call to check on an alert

We could avoid doing this in real time by checking periodically instead, which removes the two additional
gets per incr call:

- Store a sorted set per project
- Each sorted set contains the number of events seen in the interval (1 minute)
  - An additional set contains the number of unique events seen
- Every minute we iterate this sorted set (we can exploit the queue just like buffers to avoid crons)
  - We clear the results immediately to no-op any concurrent tasks that might try to run
  - The task fires off a set of subtasks that individually check each project
    - Each project's value is compared to the historic value in the last N minutes (15m for Redis counters or
      a period of time using the SQL counters)
    - We only alert if an alert has not been seen on this condition in the last N minutes
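
The periodic flow above can be sketched in memory as follows. This is not
the Redis implementation: plain dicts stand in for the sorted sets, the
subtasks are a simple loop, and every name (``drain_counts``,
``should_alert``, ``run_periodic_check``, ``threshold_pct``, ``cooldown``)
is hypothetical. The median baseline is also an assumption.

```python
from statistics import median


def drain_counts(pending):
    """Snapshot the per-project counts and clear the source immediately."""
    snapshot = dict(pending)
    # Clearing right away means a concurrent run finds nothing to do.
    # (With Redis the read and the clear would need to happen atomically.)
    pending.clear()
    return snapshot


def should_alert(count, past_counts, threshold_pct):
    """Compare this interval's count to the baseline of the last N minutes."""
    if not past_counts:
        return False
    baseline = median(past_counts)  # assumption: median as the baseline
    if baseline <= 0:
        return False
    return (count - baseline) / baseline * 100 >= threshold_pct


def run_periodic_check(pending, history, last_alerted, now,
                       threshold_pct=100, cooldown=15 * 60):
    """Check each project once; suppress repeats within `cooldown` seconds."""
    alerts = []
    for project_id, count in drain_counts(pending).items():
        if not should_alert(count, history.get(project_id, []), threshold_pct):
            continue
        last = last_alerted.get(project_id)
        if last is not None and now - last < cooldown:
            continue  # already alerted on this condition recently
        last_alerted[project_id] = now
        alerts.append(project_id)
    return alerts
```

The sketch leaves appending each interval's count to ``history`` to the
caller, since the notes keep that history in the existing Redis or SQL
counters.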

Notes:

- Nydus optimizes out multiple writes/gets, so it's not as expensive as it looks
- Values that are not set need to be treated as missing data, and we either need to ignore them or normalize them to the
  average of the points before and after
- Celery has ``expires=datetime.now() + timedelta(days=1)`` on tasks
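
For the missing-data note above, a small sketch of the "normalize to the
average" option. ``fill_gaps`` is a hypothetical helper, and representing an
unset value as ``None`` is an assumption.

```python
def fill_gaps(values):
    """Replace None entries with the average of the nearest set neighbours."""
    filled = list(values)
    for i, value in enumerate(values):
        if value is not None:
            continue
        # Nearest set value before and after the gap, looking only at the
        # original data so filled values never feed into other gaps.
        before = next((values[j] for j in range(i - 1, -1, -1)
                       if values[j] is not None), None)
        after = next((values[j] for j in range(i + 1, len(values))
                      if values[j] is not None), None)
        neighbours = [n for n in (before, after) if n is not None]
        if neighbours:
            filled[i] = sum(neighbours) / len(neighbours)
    return filled
```

The "ignore them" option is simpler: drop the unset entries before computing
the baseline, e.g. ``[v for v in values if v is not None]``.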