Skip to content

Instantly share code, notes, and snippets.

@RJ
Created May 4, 2012 09:35
Show Gist options
  • Save RJ/2593629 to your computer and use it in GitHub Desktop.
Save RJ/2593629 to your computer and use it in GitHub Desktop.
Postgresql (9.1) partitions query plans with constraint exclusion
*** CLICK "RAW" LINK ABOVE TO GET IT LINE WRAPPED ETC ***
\d+ ircevents
Column | Type | Modifiers
-----------+---------+------------------------
buffer | integer | not null
id | bigint | not null <---------------- unixtime in microsecs (since epoch*1,000,000)
type | text | not null
highlight | boolean | not null default false
json | text | not null
...
Child tables: ircevents_201008,
ircevents_201009,
...
ircevents_201211,
ircevents_201212
\d+ ircevents_201204
Table "public.ircevents_201204"
Column | Type | Modifiers | Storage | Description
-----------+---------+------------------------+----------+-------------
buffer | integer | not null | plain |
id | bigint | not null | plain |
type | text | not null | extended |
highlight | boolean | not null default false | plain |
json | text | not null | extended |
Indexes:
"ircevents_201204_idx" UNIQUE, btree (buffer, id)
"ircevents_201204_highlight_idx" btree (highlight) WHERE highlight = true
Check constraints:
"ircevents_201204_id_check" CHECK (id >= (date_part('epoch'::text, '2012-04-01 00:00:00'::timestamp without time zone)::bigint * 1000000) AND id < (date_part('ep
och'::text, '2012-05-01 00:00:00'::timestamp without time zone)::bigint * 1000000))
Inherits: ircevents
This query shows that constraint exclusion is working, in so far as tables
holding earlier data to this query are omitted from the plan:
explain analyze select * from ircevents where
id > date_part('epoch', '2012-04-02'::timestamp without time zone)::bigint * 1000000
and buffer = 116780 limit 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..3.83 rows=1 width=139) (actual time=0.047..0.047 rows=1 loops=1)
-> Result (cost=0.00..95774.18 rows=25028 width=139) (actual time=0.045..0.045 rows=1 loops=1)
-> Append (cost=0.00..95774.18 rows=25028 width=139) (actual time=0.043..0.043 rows=1 loops=1)
-> Seq Scan on ircevents (cost=0.00..0.00 rows=1 width=77) (actual time=0.001..0.001 rows=0 loops=1)
Filter: ((id > 1333317600000000::bigint) AND (buffer = 116780))
-> Bitmap Heap Scan on ircevents_201204 ircevents (cost=1026.99..95708.02 rows=25019 width=139) (actual time=0.041..0.041 rows=1 loops=1)
Recheck Cond: ((buffer = 116780) AND (id > 1333317600000000::bigint))
-> Bitmap Index Scan on ircevents_201204_idx (cost=0.00..1020.74 rows=25019 width=0) (actual time=0.027..0.027 rows=65 loops=1)
Index Cond: ((buffer = 116780) AND (id > 1333317600000000::bigint))
-> Index Scan using ircevents_201205_idx on ircevents_201205 ircevents (cost=0.00..8.27 rows=1 width=232) (never executed)
Index Cond: ((buffer = 116780) AND (id > 1333317600000000::bigint))
-> Index Scan using ircevents_201206_idx on ircevents_201206 ircevents (cost=0.00..8.27 rows=1 width=77) (never executed)
Index Cond: ((buffer = 116780) AND (id > 1333317600000000::bigint))
-> Index Scan using ircevents_201207_idx on ircevents_201207 ircevents (cost=0.00..8.27 rows=1 width=77) (never executed)
Index Cond: ((buffer = 116780) AND (id > 1333317600000000::bigint))
-> Index Scan using ircevents_201208_idx on ircevents_201208 ircevents (cost=0.00..8.27 rows=1 width=77) (never executed)
Index Cond: ((buffer = 116780) AND (id > 1333317600000000::bigint))
-> Index Scan using ircevents_201209_idx on ircevents_201209 ircevents (cost=0.00..8.27 rows=1 width=77) (never executed)
Index Cond: ((buffer = 116780) AND (id > 1333317600000000::bigint))
-> Index Scan using ircevents_201210_idx on ircevents_201210 ircevents (cost=0.00..8.27 rows=1 width=77) (never executed)
Index Cond: ((buffer = 116780) AND (id > 1333317600000000::bigint))
-> Index Scan using ircevents_201211_idx on ircevents_201211 ircevents (cost=0.00..8.27 rows=1 width=77) (never executed)
Index Cond: ((buffer = 116780) AND (id > 1333317600000000::bigint))
-> Index Scan using ircevents_201212_idx on ircevents_201212 ircevents (cost=0.00..8.27 rows=1 width=77) (never executed)
Index Cond: ((buffer = 116780) AND (id > 1333317600000000::bigint))
Total runtime: 0.144 ms
(26 rows)
Let's use a 20-day range spanning only one month:
ircevents=# select date_part('epoch', '2012-04-02'::timestamp without time zone)::bigint * 1000000;
?column?
------------------
1333317600000000
(1 row)
ircevents=# select date_part('epoch', '2012-04-22'::timestamp without time zone)::bigint * 1000000;
?column?
------------------
1335045600000000
(1 row)
ircevents=# explain analyze SELECT id, type, json FROM ircevents WHERE buffer = 116780 AND id BETWEEN 1333317600000000 AND 1335045600000000;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=0.00..79715.45 rows=20692 width=134) (actual time=0.074..0.274 rows=40 loops=1)
-> Append (cost=0.00..79715.45 rows=20692 width=134) (actual time=0.072..0.263 rows=40 loops=1)
-> Seq Scan on ircevents (cost=0.00..0.00 rows=1 width=72) (actual time=0.001..0.001 rows=0 loops=1)
Filter: ((id >= 1333317600000000::bigint) AND (id <= 1335045600000000::bigint) AND (buffer = 116780))
-> Bitmap Heap Scan on ircevents_201204 ircevents (cost=914.36..79715.45 rows=20691 width=134) (actual time=0.070..0.252 rows=40 loops=1)
Recheck Cond: ((buffer = 116780) AND (id >= 1333317600000000::bigint) AND (id <= 1335045600000000::bigint))
-> Bitmap Index Scan on ircevents_201204_idx (cost=0.00..909.18 rows=20691 width=0) (actual time=0.053..0.053 rows=40 loops=1)
Index Cond: ((buffer = 116780) AND (id >= 1333317600000000::bigint) AND (id <= 1335045600000000::bigint))
Compare with querying partition directly:
ircevents=# explain analyze SELECT id, type, json FROM ircevents_201204 WHERE buffer = 116780 AND id BETWEEN 1333317600000000 AND 1335045600000000;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on ircevents_201204 (cost=914.36..79715.45 rows=20691 width=134) (actual time=0.040..0.093 rows=40 loops=1)
Recheck Cond: ((buffer = 116780) AND (id >= 1333317600000000::bigint) AND (id <= 1335045600000000::bigint))
-> Bitmap Index Scan on ircevents_201204_idx (cost=0.00..909.18 rows=20691 width=0) (actual time=0.025..0.025 rows=40 loops=1)
Index Cond: ((buffer = 116780) AND (id >= 1333317600000000::bigint) AND (id <= 1335045600000000::bigint))
Total runtime: 0.122 ms
(5 rows)
This is the most common kind of query i do, "last n" in a buffer:
explain analyze SELECT id, type, json FROM ircevents WHERE buffer = 116780 AND id BETWEEN 1333317600000000 AND 1335045600000000 ORDER BY id DESC limit 100;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=80506.28..80506.53 rows=100 width=134) (actual time=0.179..0.196 rows=40 loops=1)
-> Sort (cost=80506.28..80558.01 rows=20692 width=134) (actual time=0.178..0.185 rows=40 loops=1)
Sort Key: public.ircevents.id
Sort Method: quicksort Memory: 33kB
-> Result (cost=0.00..79715.45 rows=20692 width=134) (actual time=0.039..0.121 rows=40 loops=1)
-> Append (cost=0.00..79715.45 rows=20692 width=134) (actual time=0.037..0.111 rows=40 loops=1)
-> Seq Scan on ircevents (cost=0.00..0.00 rows=1 width=72) (actual time=0.001..0.001 rows=0 loops=1)
Filter: ((id >= 1333317600000000::bigint) AND (id <= 1335045600000000::bigint) AND (buffer = 116780))
-> Bitmap Heap Scan on ircevents_201204 ircevents (cost=914.36..79715.45 rows=20691 width=134) (actual time=0.035..0.103 rows=40 loops=1)
Recheck Cond: ((buffer = 116780) AND (id >= 1333317600000000::bigint) AND (id <= 1335045600000000::bigint))
-> Bitmap Index Scan on ircevents_201204_idx (cost=0.00..909.18 rows=20691 width=0) (actual time=0.023..0.023 rows=40 loops=1)
Index Cond: ((buffer = 116780) AND (id >= 1333317600000000::bigint) AND (id <= 1335045600000000::bigint))
Total runtime: 0.243 ms
(13 rows)
Compare the cost of that vs. specifying the table manually:
ircevents=# explain analyze SELECT id, type, json FROM ircevents_201204 WHERE buffer = 116780 AND id BETWEEN 1333317600000000 AND 1335045600000000 ORDER BY id DESC l
imit 100;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..404.02 rows=100 width=134) (actual time=0.024..0.071 rows=40 loops=1)
-> Index Scan Backward using ircevents_201204_idx on ircevents_201204 (cost=0.00..83595.11 rows=20691 width=134) (actual time=0.023..0.062 rows=40 loops=1)
Index Cond: ((buffer = 116780) AND (id >= 1333317600000000::bigint) AND (id <= 1335045600000000::bigint))
Total runtime: 0.102 ms
(4 rows)
...And it's the same story if the date range spans multiple tables - the plan is way better if i select the tables myself and do a UNION ALL query over them:
Let's use a 60-day range spanning three months:
ircevents=# select date_part('epoch', '2012-01-02'::timestamp without time zone)::bigint * 1000000;
1325458800000000
ircevents=# select date_part('epoch', '2012-03-02'::timestamp without time zone)::bigint * 1000000;
1330642800000000
ircevents=# explain analyze SELECT id, type, json FROM ircevents WHERE buffer = 116780 AND id BETWEEN 1325458800000000 AND 1330642800000000 ORDER BY id DESC LIMIT 100;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------
Limit (cost=68691.31..68691.56 rows=100 width=135) (actual time=44.719..44.747 rows=100 loops=1)
-> Sort (cost=68691.31..68775.50 rows=33677 width=135) (actual time=44.718..44.731 rows=100 loops=1)
Sort Key: public.ircevents.id
Sort Method: top-N heapsort Memory: 46kB
-> Result (cost=0.00..67404.20 rows=33677 width=135) (actual time=0.064..44.382 rows=364 loops=1)
-> Append (cost=0.00..67404.20 rows=33677 width=135) (actual time=0.063..44.307 rows=364 loops=1)
-> Seq Scan on ircevents (cost=0.00..0.00 rows=1 width=72) (actual time=0.001..0.001 rows=0 loops=1)
Filter: ((id >= 1325458800000000::bigint) AND (id <= 1330642800000000::bigint) AND (buffer = 116780))
-> Index Scan using ircevents_201201_idx on ircevents_201201 ircevents (cost=0.00..8811.15 rows=2181 width=133) (actual time=0.061..1.125 rows
=276 loops=1)
Index Cond: ((buffer = 116780) AND (id >= 1325458800000000::bigint) AND (id <= 1330642800000000::bigint))
-> Index Scan using ircevents_201202_idx on ircevents_201202 ircevents (cost=0.00..54963.83 rows=30613 width=135) (actual time=42.668..43.080
rows=88 loops=1)
Index Cond: ((buffer = 116780) AND (id >= 1325458800000000::bigint) AND (id <= 1330642800000000::bigint))
-> Index Scan using ircevents_201203_idx on ircevents_201203 ircevents (cost=0.00..3629.22 rows=882 width=134) (actual time=0.028..0.028 rows=
0 loops=1)
Index Cond: ((buffer = 116780) AND (id >= 1325458800000000::bigint) AND (id <= 1330642800000000::bigint))
Total runtime: 44.809 ms
(15 rows)
Now we build a UNION query over the partitions ourselves:
EXPLAIN ANALYZE
SELECT id, type, json FROM
(SELECT id, type, json FROM ircevents_201201 WHERE buffer = 116780 AND id BETWEEN 1325458800000000 AND 1330642800000000 ORDER BY id DESC LIMIT 100) as e01
UNION ALL
SELECT id, type, json FROM
(SELECT id, type, json FROM ircevents_201202 WHERE buffer = 116780 AND id BETWEEN 1325458800000000 AND 1330642800000000 ORDER BY id DESC LIMIT 100) as e02
UNION ALL
SELECT id, type, json FROM
(SELECT id, type, json FROM ircevents_201203 WHERE buffer = 116780 AND id BETWEEN 1325458800000000 AND 1330642800000000 ORDER BY id DESC LIMIT 100) as e03
ORDER BY id DESC LIMIT 100;
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------
Limit (cost=0.02..334.53 rows=100 width=134) (actual time=0.066..0.196 rows=100 loops=1)
-> Result (cost=0.02..1003.54 rows=300 width=134) (actual time=0.064..0.174 rows=100 loops=1)
-> Merge Append (cost=0.02..1003.54 rows=300 width=134) (actual time=0.063..0.146 rows=100 loops=1)
Sort Key: ircevents_201201.id
-> Limit (cost=0.00..404.00 rows=100 width=133) (actual time=0.033..0.045 rows=12 loops=1)
-> Index Scan Backward using ircevents_201201_idx on ircevents_201201 (cost=0.00..8811.15 rows=2181 width=133) (actual time=0.032..0.042 rows=
12 loops=1)
Index Cond: ((buffer = 116780) AND (id >= 1325458800000000::bigint) AND (id <= 1330642800000000::bigint))
-> Limit (cost=0.00..179.54 rows=100 width=135) (actual time=0.019..0.074 rows=88 loops=1)
-> Index Scan Backward using ircevents_201202_idx on ircevents_201202 (cost=0.00..54963.83 rows=30613 width=135) (actual time=0.019..0.054 row
s=88 loops=1)
Index Cond: ((buffer = 116780) AND (id >= 1325458800000000::bigint) AND (id <= 1330642800000000::bigint))
-> Limit (cost=0.00..411.48 rows=100 width=134) (actual time=0.009..0.009 rows=0 loops=1)
-> Index Scan Backward using ircevents_201203_idx on ircevents_201203 (cost=0.00..3629.22 rows=882 width=134) (actual time=0.009..0.009 rows=0
loops=1)
Index Cond: ((buffer = 116780) AND (id >= 1325458800000000::bigint) AND (id <= 1330642800000000::bigint))
Total runtime: 0.254 ms
(14 rows)
ircevents=#
@RJ
Copy link
Author

RJ commented May 4, 2012

This is postgresql 9.1

@RJ
Copy link
Author

RJ commented May 4, 2012

Quoth tom lane:
"I think the reason you're not getting the right
behavior is that you are missing this as-yet-unreleased patch:
http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=ef03b34550e3577c4be3baa25b70787f5646c57b
which means it can't figure out that the available index on the child
table produces the desired sort order. If you're in a position to
compile from source, a current nightly snapshot of the 9.1 branch
ought to work for you; otherwise, wait for 9.1.4."

And indeed, the 9.1 devel snapshot fixes it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment