Skip to content

Instantly share code, notes, and snippets.

@dhermes
Last active April 12, 2023 21:25
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save dhermes/f4cb351d7e6b94abd3f35912dbe05f02 to your computer and use it in GitHub Desktop.
Save dhermes/f4cb351d7e6b94abd3f35912dbe05f02 to your computer and use it in GitHub Desktop.
[2023-04-12] The problem with "as-of-now" computed columns

The problem with "as-of-now" computed columns

The below illustrates why only actually IMMUTABLE functions should be used for PostgreSQL generated columns.

Basic status

Using a function that incorporates the current time, we can generate a status based on the value of a given timestamp:

CREATE FUNCTION foo.before_or_after(when_ TIMESTAMPTZ) RETURNS TEXT
     LANGUAGE plpgsql IMMUTABLE
     AS $$
 BEGIN
   IF (when_ < NOW()) THEN
     RETURN 'before';
   ELSE
     RETURN 'after';
   END IF;
 END
 $$;

Note we have labeled this function as IMMUTABLE but it is not, it relies on NOW(), which is not an input

IMMUTABLE indicates that the function cannot modify the database and always returns the same result when given the same argument values

Generated column

Using this function, we can capture both a when_ timestamp and immediately keep the generated status:

CREATE TABLE foo.bar (
  id INTEGER PRIMARY KEY NOT NULL,
  when_ TIMESTAMPTZ NOT NULL,
  status TEXT GENERATED ALWAYS AS (foo.before_or_after(when_)) STORED NOT NULL
);

Issues

However, the foo.bar.status column is only generated / computed on insert and update to the table. Since it uses NOW(), this means it can go from being valid to being invalid as time passes. Inserting some data:

INSERT INTO
  foo.bar (id, when_)
VALUES
  (1, NOW() - INTERVAL '2 seconds'),
  (2, NOW() + INTERVAL '2 seconds');

and then immediately querying it, we see the generate column remains static even if the current time has moved past when_:

baz=> SELECT id, when_, status, NOW() FROM foo.bar;
 id |             when_             | status |              now
----+-------------------------------+--------+-------------------------------
  1 | 2023-04-12 21:17:28.300036+00 | before | 2023-04-12 21:17:31.194887+00
  2 | 2023-04-12 21:17:32.300036+00 | after  | 2023-04-12 21:17:31.194887+00
(2 rows)

baz=>
baz=> SELECT id, when_, status, NOW() FROM foo.bar;
 id |             when_             | status |              now
----+-------------------------------+--------+-------------------------------
  1 | 2023-04-12 21:17:28.300036+00 | before | 2023-04-12 21:17:35.143072+00
  2 | 2023-04-12 21:17:32.300036+00 | after  | 2023-04-12 21:17:35.143072+00
(2 rows)

"Fixing" it with a VIEW

If the status column were part of a view (instead of a generated column), the value would be recomputed on every query:

CREATE VIEW foo.bar_view AS
  SELECT
    id,
    when_,
    status,
    foo.before_or_after(when_) AS status_view
  FROM
    foo.bar;

and querying this data we do see a difference:

baz=> SELECT * FROM foo.bar_view;
 id |             when_             | status | status_view
----+-------------------------------+--------+-------------
  1 | 2023-04-12 21:17:28.300036+00 | before | before
  2 | 2023-04-12 21:17:32.300036+00 | after  | before
(2 rows)

the actual column in foo.bar only changes if the rows get modified:

baz=> UPDATE foo.bar SET when_ = when_;
UPDATE 2
baz=> SELECT * FROM foo.bar;
 id |             when_             | status
----+-------------------------------+--------
  1 | 2023-04-12 21:17:28.300036+00 | before
  2 | 2023-04-12 21:17:32.300036+00 | before
(2 rows)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment