Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save agalazis/f3c3d651076602be7877941e14c09faa to your computer and use it in GitHub Desktop.
Save agalazis/f3c3d651076602be7877941e14c09faa to your computer and use it in GitHub Desktop.
Short explanation of the issues faced when trying to create a PostgreSQL index using the date() function and how to resolve it.

Given a table...

CREATE TABLE foo (
  id SERIAL PRIMARY KEY,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
  ...
);

...you realize that you have to perform complex queries with the following type of condition:

WHERE ...
  AND DATE(created_at) = DATE('2013-01-01') ...

The table is big and other indexes are not helping, so you decide to create an index on DATE(created_at):

postgres=> CREATE INDEX ON foo (DATE(created_at));
ERROR:  functions in index expression must be marked IMMUTABLE

It turns out that the function DATE(TIMESTAMP WITH TIME ZONE) is mutable, due to the time zone. On the other hand DATE(TIMESTAMP WITHOUT TIME ZONE) is immutable, as shown in the following example:

postgres=> CREATE TABLE test (created_at TIMESTAMP WITHOUT TIME ZONE DEFAULT NOW());
CREATE TABLE
postgres=> CREATE INDEX ON test (date(created_at));
CREATE INDEX

So how do we handle the case where we have a time zone? Well, since the DATE() function may give different results based on the time zone, we have to decide which time zone is applicable. In this example we use UTC:

postgres=> CREATE INDEX ON foo (DATE(created_at AT TIME ZONE 'UTC'));
CREATE INDEX

That worked. Now how do get the query planner to use it? I was under the impression that I simply had to use the same function I used to create the index in the WHERE condition. Let's see what the query planner says:

postgres=> EXPLAIN SELECT * FROM foo WHERE DATE(created_at AT TIME ZONE 'UTC') = DATE('2013-01-01');
                                    QUERY PLAN                                    
----------------------------------------------------------------------------------
 Seq Scan on foo  (cost...)
   Filter: (date(timezone('UTC'::text, created_at)) = '2013-01-01'::date)

The index is not used. Let's go look at the index definition:

postgres=> \d foo_date_idx 
       Index "public.foo_date_idx"
 Column | Type |               Definition                
--------+------+-----------------------------------------
 date   | date | date(timezone('UTC'::text, created_at))
btree, for table "public.foo"

Let's try to use the definition as stored by the database. Note that this is (and should be) functionally equivalent to DATE(created_at AT TIME ZONE 'UTC'), since the documentation says:

The function timezone(zone, timestamp) is equivalent to the SQL-conforming construct timestamp AT TIME ZONE zone.
postgres=> EXPLAIN SELECT * FROM foo WHERE DATE(TIMEZONE('UTC'::text, created_at)) = DATE('2013-01-01');
                                         QUERY PLAN                                         
--------------------------------------------------------------------------------------------
 Bitmap Heap Scan on foo (cost...)
   Recheck Cond: (date(timezone('UTC'::text, created_at)) = '2013-01-01'::date)
   ->  Bitmap Index Scan on foo_date_idx  (cost...)
         Index Cond: (date(timezone('UTC'::text, created_at)) = '2013-01-01'::date)

This seems to have done the trick. It is strange though that PostgreSQL rewrites the function used to create the index to a canonical form, but does not seem to do the same when the function is used in the WHERE clause (in order to match the index function).

Footnote: PostgreSQL 9.1.9 was used.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment