@kylebrandt
Last active April 12, 2021 17:31
Alert Migration Notes

List of things the migration does or will need to do

  • Load the dashboard alerts from the alert table and create rules in the alert_rule table
    • Translate conditions and queries into SSE for the alert_rule row
    • Translate other settings like NoData, FOR, Interval, alertRuleTags (notifications?) for the alert_rule row
    • Match Permissions of dashboard alert (create folders as needed):
      • Alerts will not have permissions, Folders will
      • If the alert's dashboard has permissions, create a corresponding folder with matching permissions
      • If the dashboard has no permissions but is in a folder, link the alert_rule to that existing folder
      • If the dashboard has no permissions and no folder (General/root), put the rule in a new "General Alert Rules ..." folder with no permissions of its own, so it inherits the default permissions
  • Modify the dashboard with some pointer to the migrated alert rule
  • Somehow make sure annotations can still be viewed?
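
The folder rules above can be written down as a small decision function. This is only a sketch of the three cases: the types, the `alerts-` prefix and the `general-alert-rules` UID are hypothetical placeholders, not names from Grafana.

```go
package main

import "fmt"

// dashAlert is a hypothetical, trimmed-down view of a dashboard alert.
type dashAlert struct {
	DashboardUID    string
	DashboardHasACL bool   // the dashboard has its own permissions
	FolderUID       string // "" means General (root)
}

// placement says which folder the migrated rule should be linked to.
type placement struct {
	FolderUID string
	Create    bool // create the folder if it does not exist yet
	CopyACL   bool // copy the dashboard's permissions onto the folder
}

// generalAlertFolder is a placeholder UID for the "General Alert Rules ..." folder.
const generalAlertFolder = "general-alert-rules"

// placeRule applies the three cases from the list, in order.
func placeRule(a dashAlert) placement {
	switch {
	case a.DashboardHasACL:
		// Dashboard permissions win: a dedicated folder mirrors them.
		return placement{FolderUID: "alerts-" + a.DashboardUID, Create: true, CopyACL: true}
	case a.FolderUID != "":
		// No dashboard permissions, but a folder exists: reuse it.
		return placement{FolderUID: a.FolderUID}
	default:
		// General/root: shared folder with no ACL, so defaults are inherited.
		return placement{FolderUID: generalAlertFolder, Create: true}
	}
}

func main() {
	fmt.Printf("%+v\n", placeRule(dashAlert{DashboardUID: "d1", DashboardHasACL: true, FolderUID: "f1"}))
	fmt.Printf("%+v\n", placeRule(dashAlert{DashboardUID: "d2", FolderUID: "f1"}))
	fmt.Printf("%+v\n", placeRule(dashAlert{DashboardUID: "d3"}))
}
```

Checking dashboard permissions first means a dashboard with its own ACL never lands in its existing folder, which matches the ordering of the bullets.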

For notifications, something like the following (ganesh/josh on this):

for each dpa := DashPanelAlert:
    ruleUid, ... := createRule(dpa)

    message := dpa.Message
    templ := createTemplate(message)

    for each ch := dpa.Notifications:
        chConfig := ch.getChannelConfig()
        recName, ... := createReceiver(dpa.UID, chConfig, templ)
        createRoute(dpa.UID, recName)
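
A compilable version of the loop above might look like the following. It is a sketch under assumptions: every type here is a hypothetical stand-in, the receiver naming scheme is invented for illustration, and `createTemplate` just passes the message through because the real template translation is not decided yet.

```go
package main

import "fmt"

// channel is a stand-in for an old notification channel.
type channel struct {
	Name   string
	Config map[string]string
}

// dashPanelAlert is a stand-in for a dashboard panel alert.
type dashPanelAlert struct {
	UID           string
	Message       string
	Notifications []channel
}

type receiver struct {
	Name     string
	Config   map[string]string
	Template string
}

type route struct {
	AlertUID string
	Receiver string
}

// alertConfig collects what would end up in alert_configuration.
type alertConfig struct {
	Receivers []receiver
	Routes    []route
}

// createTemplate is a placeholder: it returns the message unchanged.
func createTemplate(msg string) string { return msg }

// migrateNotifications creates one receiver and one route per
// (alert, channel) pair, keyed by the alert UID as in the pseudocode.
func migrateNotifications(alerts []dashPanelAlert) alertConfig {
	var cfg alertConfig
	for _, dpa := range alerts {
		templ := createTemplate(dpa.Message)
		for _, ch := range dpa.Notifications {
			recName := fmt.Sprintf("%s-%s", dpa.UID, ch.Name) // assumed naming scheme
			cfg.Receivers = append(cfg.Receivers, receiver{Name: recName, Config: ch.Config, Template: templ})
			cfg.Routes = append(cfg.Routes, route{AlertUID: dpa.UID, Receiver: recName})
		}
	}
	return cfg
}

func main() {
	cfg := migrateNotifications([]dashPanelAlert{
		{UID: "a1", Message: "disk full", Notifications: []channel{{Name: "slack"}, {Name: "email"}}},
		{UID: "a2", Message: "no channels"},
	})
	fmt.Println(len(cfg.Receivers), len(cfg.Routes))
}
```

One consequence of this shape is that a channel shared by many alerts becomes many receivers; whether to deduplicate is left open here, as it is in the notes.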

Tables

Old (keep for rollback; can we add a column and still roll back, or not?)

  • alert
  • alert_notification
  • alert_notification_state
  • alert_rule_tag (What is this for?)

New

  • alert_configuration
  • alert_rule / alert_rule_version
  • alert_instance

Contents Modified

  • dashboard
  • dashboard_acl
  • dashboard_version ?
  • annotations (Maybe, but not thinking about this one right now so head no explode all over walls)

Code Notes

  • This will be an implementation of the CodeMigration interface
  • Examples of CodeMigrations can be found by using "go to implementations" in the editor, for example the AddMissingUserSaltAndRandsMigration
  • The CodeMigration's Exec is called from (mg *Migrator) exec, which is called from (mg *Migrator) Start() and is wrapped in mg.inTransaction. So all of the session SQL called within CodeMigration.Exec() will be executed in a single transaction (I think).
  • A migration cannot change after it has been merged to master.
    • Therefore, the code used in the migration generally cannot change either.
    • So other services generally cannot be referenced, because referencing them would mean they could no longer be changed. I can, however, copy / duplicate types and certain code into my migration.
  • This is going to be far bigger than any other migration. If I can put it in a subpackage under migrations, that would likely be cleaner.
  • The order of migrations changes over time (see grafana/grafana#11090) and is an order per table. This is a problem for this migration, as I will need to touch other tables and expect them to be in a certain state in SQL. (Related: the service-based migrations for alerting and for the library panels feature are also problematic, see grafana/grafana#32912)
  • Service (e.g. systemd) startup timeouts could be an issue if the migration runs long. Will need to document this aspect and make sure cloud is aware
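
To make the transaction point concrete, here is a toy model of a code migration whose Exec runs inside an inTransaction wrapper. The `session`, `db` and `codeMigration` types are hypothetical stand-ins written for this note, not Grafana's actual types or signatures, and the SQL strings are placeholders.

```go
package main

import (
	"errors"
	"fmt"
)

// session is a stand-in for a SQL session; it only buffers statements.
type session struct{ pending []string }

func (s *session) Exec(stmt string) { s.pending = append(s.pending, stmt) }

// codeMigration mirrors the idea of a migration that runs arbitrary code.
type codeMigration interface {
	Exec(sess *session) error
}

// db records the statements that were actually committed.
type db struct{ applied []string }

// inTransaction commits the buffered statements only if fn returns nil.
func (d *db) inTransaction(fn func(sess *session) error) error {
	sess := &session{}
	if err := fn(sess); err != nil {
		return fmt.Errorf("rolled back: %w", err) // nothing from sess is applied
	}
	d.applied = append(d.applied, sess.pending...)
	return nil
}

func run(d *db, m codeMigration) error { return d.inTransaction(m.Exec) }

// alertMigration touches two tables; failAfterRules simulates a mid-way error.
type alertMigration struct{ failAfterRules bool }

func (m alertMigration) Exec(sess *session) error {
	sess.Exec("INSERT INTO alert_rule ...")
	if m.failAfterRules {
		return errors.New("could not translate condition")
	}
	sess.Exec("UPDATE dashboard ...")
	return nil
}

func main() {
	d := &db{}
	fmt.Println(run(d, alertMigration{failAfterRules: true}), len(d.applied))
	fmt.Println(run(d, alertMigration{}), len(d.applied))
}
```

If the real wrapper behaves like this, a failure halfway through leaves neither the alert_rule rows nor the dashboard changes behind, which is what Plan A below relies on.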

General Plan

  • Create a Code Migration, target merge X days before ~beta1
  • Stretch Goal: The migration should leave old stuff around (e.g. alert rules in dashboards, alert table) for downgrade/rollback to 7.
    • If this is done though, without manual intervention any new dashboard alerts created after the downgrade will not be migrated when re-upgrading to 8.
  • Plan A) It looks like the entire CodeMigration is a transaction, so it should just work?
  • Plan B)
    • Work against temporary tables for "Contents Modified", do not write to old tables (except maybe a "migrated to" marker, although we could instead put a "migrated from" marker on the new tables)
    • If everything has worked, swap temp tables to actual table, else attempt cleanup
      • I'm not sure about swapping tables vs copying data. I think perhaps the copy can be done in transaction, but perhaps the table swap can't?