Skip to content

Instantly share code, notes, and snippets.

@DrTom
Last active January 13, 2021 10:23
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save DrTom/1119a596239b828b42bf97387e0aa61f to your computer and use it in GitHub Desktop.
Save DrTom/1119a596239b828b42bf97387e0aa61f to your computer and use it in GitHub Desktop.

New Audits for leihs

Problems "Existing Audits"

  • based on Model lifecycle methods, aka Hooks, in leihs legacy

    • works only in leihs legacy
    • works only when using certain methods of the models
    • reliability questionable => more like some effort logging, nothing like audits
  • does not work for other apps procure, admin, my, new-borrow

  • huge data consumption

  • nice UI but queries have become unusable slow

  • privacy problem (leihs admin vs system admin)

Why Now

Collecting some data since almost two years. Bun no UI and Proof of Concept. If we want to remove leihs-legacy we need a audit alternative that goes at least 2 years back.

Goals

  • indisputable reliability
  • conceptual framework which works for all applications and technologies
  • works for large data, efficiency

Non goals: super UI, accessible for everybody. This is an expert feature.

Design, Technical Implementation

Part Database

Record changes via DB operations INSERT, UPDATE, and DELETE via triggers in the audited_changes table.

                          Table "public.audited_changes"
   Column   |           Type           | Collation | Nullable |      Default
------------+--------------------------+-----------+----------+--------------------
 id         | uuid                     |           | not null | uuid_generate_v4()
 txid       | uuid                     |           | not null | txid()
 tg_op      | text                     |           | not null |
 table_name | text                     |           | not null |
 changed    | jsonb                    |           |          |
 created_at | timestamp with time zone |           |          | now()
 pkey       | text                     |           |          |
CREATE OR REPLACE FUNCTION audit_change()
RETURNS TRIGGER AS $$
  DECLARE
    changed JSONB;
    j_new JSONB := '{}'::JSONB;
    j_old JSONB := '{}'::JSONB;
    pkey TEXT;
    pkey_col TEXT := (
                SELECT attname
                FROM pg_index
                JOIN pg_attribute ON
                    attrelid = indrelid
                    AND attnum = ANY(indkey)
                WHERE indrelid = TG_RELID AND indisprimary);
BEGIN
  IF (TG_OP = 'DELETE') THEN
    j_old := row_to_json(OLD)::JSONB;
    pkey := j_old ->> pkey_col;
  ELSIF (TG_OP = 'INSERT') THEN
    j_new := row_to_json(NEW)::JSONB;
    pkey := j_new ->> pkey_col;
  ELSIF (TG_OP = 'UPDATE') THEN
    j_old := row_to_json(OLD)::JSONB;
    j_new := row_to_json(NEW)::JSONB;
    pkey := j_old ->> pkey_col;
  END IF;
  changed := jsonb_changed(j_old, j_new);
  if ( changed <> '{}'::JSONB ) THEN
    INSERT INTO audited_changes (tg_op, table_name, changed, pkey)
      VALUES (TG_OP, TG_TABLE_NAME, changed, pkey);
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE 'plpgsql';
CREATE OR REPLACE FUNCTION jsonb_changed(jold JSONB, jnew JSONB)
RETURNS JSONB AS $$
DECLARE
  result JSONB := '{}'::JSONB;
  k TEXT;
  v_new JSONB;
  v_old JSONB;
BEGIN
  FOR k IN SELECT * FROM jsonb_object_keys(jold || jnew) LOOP
    if jnew ? k
      THEN v_new := jnew -> k;
      ELSE v_new := 'null'::JSONB; END IF;
    if jold ? k
      THEN v_old := jold -> k;
      ELSE v_old := 'null'::JSONB; END IF;
    IF k = 'updated_at' THEN CONTINUE; END IF;
    IF v_new = v_old THEN CONTINUE; END IF;
    result := result || jsonb_build_object(k, jsonb_build_array(v_old, v_new));
  END LOOP;
  RETURN result;
END;
$$ LANGUAGE plpgsql;

Part 2: Connecting Changes with Requests, Users and HTTP

                    Table "public.audited_requests"
   Column   |           Type           | Collation | Nullable | Default
------------+--------------------------+-----------+----------+---------
 txid       | uuid                     |           | not null | txid()
 user_id    | uuid                     |           |          |
 path       | text                     |           |          |
 method     | text                     |           |          |
 created_at | timestamp with time zone |           |          | now()
 http_uid   | text                     |           |          |

Every changing request runs within a db transaction: do all or nothing.

CREATE OR REPLACE FUNCTION txid()
RETURNS uuid AS $$
BEGIN
  RETURN uuid_generate_v5(uuid_nil(), current_date::TEXT || ' ' || txid_current()::TEXT);
END;
$$ LANGUAGE plpgsql;
  1. Browser Request: PATCH leihs:/admin/users/123
  2. reverse proxy: logging, HTTP_UID header, see also mod_unique_id
  3. leihs-admin: creates transaction; saves entry in audited_requests
  4. in leihs-admin changes happen somewhere, DB trigger fire and write to audited_changes, setting also txid
  5. leihs-admin: closes transaction; saves entry in audited_responses
  6. reverse proxy: returns to Browser

Implemented for:

  • clojure RESTful Services leihs-admin, leihs-my
  • rails, leihs-legacy by Matus
  • open: clojure GraphQL Servies leihs-procure, leihs-new-borrow

Size Optimization of Audited Data

            relation            | total_size
--------------------------------+------------
 public.audits                  | 5093 MB
 public.audited_requests        | 2534 MB
 public.audited_changes         | 2328 MB
 public.audited_responses       | 521 MB


            relation            | total_size
--------------------------------+------------
 public.audits                  | 5263 MB
 public.audited_changes         | 455 MB
 public.audited_requests        | 231 MB
 public.audited_responses       | 147 MB

# data collection based on snapshots a few weeks apart


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment