OK. In an effort to learn this stuff (on a Saturday night, of course), I also built the tables and indexes and generated fake data.
Here is the big realization: the join is expensive and pointless; it is not needed for the heavy lifting (the counting and sorting). Split the work into two queries (or a subquery).
First, let's run it the original way:
# explain ANALYZE SELECT races.id, races.title, count(participants.id) AS participant_count
FROM races
INNER JOIN participants ON races.id=participants.race_id
GROUP BY races.id, races.title
ORDER BY participant_count DESC
LIMIT 5;
                                                                  QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=42872.35..42872.37 rows=5 width=18) (actual time=851.796..851.797 rows=5 loops=1)
   ->  Sort  (cost=42872.35..42884.85 rows=5000 width=18) (actual time=851.794..851.794 rows=5 loops=1)
         Sort Key: (count(participants.id))
         Sort Method: top-N heapsort  Memory: 25kB
         ->  HashAggregate  (cost=42739.31..42789.31 rows=5000 width=18) (actual time=851.393..851.593 rows=1000 loops=1)
               Group Key: races.id, races.title
               ->  Hash Join  (cost=149.50..35244.52 rows=999305 width=18) (actual time=6.069..511.539 rows=999199 loops=1)
                     Hash Cond: (participants.race_id = races.id)
                     ->  Seq Scan on participants  (cost=0.00..16358.05 rows=999305 width=8) (actual time=0.013..134.292 rows=999199 loops=1)
                     ->  Hash  (cost=87.00..87.00 rows=5000 width=14) (actual time=6.040..6.040 rows=5000 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 229kB
                           ->  Seq Scan on races  (cost=0.00..87.00 rows=5000 width=14) (actual time=0.010..2.221 rows=5000 loops=1)
 Planning time: 0.448 ms
 Execution time: 852.004 ms
OK. Now, the first thing we can do is make the hash smaller. If we remove races.title from the query, things speed up just a little bit:
# explain ANALYZE SELECT races.id, count(participants.race_id) AS participant_count
FROM races
INNER JOIN participants ON races.id=participants.race_id
GROUP BY races.id
ORDER BY participant_count DESC
LIMIT 5;
                                                                  QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=40374.09..40374.10 rows=5 width=8) (actual time=709.217..709.217 rows=5 loops=1)
   ->  Sort  (cost=40374.09..40386.59 rows=5000 width=8) (actual time=709.215..709.215 rows=5 loops=1)
         Sort Key: (count(participants.race_id))
         Sort Method: top-N heapsort  Memory: 25kB
         ->  HashAggregate  (cost=40241.04..40291.04 rows=5000 width=8) (actual time=708.870..709.046 rows=1000 loops=1)
               Group Key: races.id
               ->  Hash Join  (cost=149.50..35244.52 rows=999305 width=8) (actual time=4.004..476.482 rows=999199 loops=1)
                     Hash Cond: (participants.race_id = races.id)
                     ->  Seq Scan on participants  (cost=0.00..16358.05 rows=999305 width=4) (actual time=0.012..129.991 rows=999199 loops=1)
                     ->  Hash  (cost=87.00..87.00 rows=5000 width=4) (actual time=3.975..3.975 rows=5000 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 176kB
                           ->  Seq Scan on races  (cost=0.00..87.00 rows=5000 width=4) (actual time=0.009..1.659 rows=5000 loops=1)
 Planning time: 0.447 ms
 Execution time: 709.316 ms
(14 rows)
But that's not good enough. Let's do only what's needed to get the expensive work done. That means we don't need the races table at all:
# explain analyze select participants.race_id, count(participants.race_id) as participant_count from participants group by race_id order by participant_count desc limit 5;
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=21381.05..21381.06 rows=5 width=4) (actual time=380.555..380.556 rows=5 loops=1)
   ->  Sort  (cost=21381.05..21383.54 rows=995 width=4) (actual time=380.553..380.553 rows=5 loops=1)
         Sort Key: (count(race_id))
         Sort Method: top-N heapsort  Memory: 25kB
         ->  HashAggregate  (cost=21354.58..21364.53 rows=995 width=4) (actual time=380.223..380.404 rows=1000 loops=1)
               Group Key: race_id
               ->  Seq Scan on participants  (cost=0.00..16358.05 rows=999305 width=4) (actual time=0.015..119.637 rows=999199 loops=1)
 Planning time: 0.139 ms
 Execution time: 380.620 ms
(9 rows)
From here we can do a select races.title from races..., which will cost next to nothing. Or, if you HAD to use just one query, you could feed the results of the heavy lifting into the races.title lookup as a subquery.
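For example, the single-query version might look something like this (a sketch; "top_races" is my name, not from the original post). The subquery does the heavy lifting first, so the join only ever touches the five surviving rows:

```sql
SELECT races.id, races.title, top_races.participant_count
FROM (
    SELECT race_id, count(race_id) AS participant_count
    FROM participants
    GROUP BY race_id
    ORDER BY participant_count DESC
    LIMIT 5
) AS top_races
INNER JOIN races ON races.id = top_races.race_id
ORDER BY top_races.participant_count DESC;
```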
Conclusion: we went from 852ms to 380ms, with roughly 1 million rows in participants. I'll let the sim generator run overnight to get to 5 million and see what the delta is then. I bet it will be bigger.
PS: Here are the table details:
# \d races;
                                     Table "public.races"
   Column   |            Type             |                     Modifiers
------------+-----------------------------+----------------------------------------------------
 id         | integer                     | not null default nextval('races_id_seq'::regclass)
 title      | character varying           |
 created_at | timestamp without time zone |
 updated_at | timestamp without time zone |
Indexes:
    "races_pkey" PRIMARY KEY, btree (id)
    "index_races_on_title" btree (title)
# \d participants;
                                  Table "public.participants"
   Column   |            Type             |                         Modifiers
------------+-----------------------------+-----------------------------------------------------------
 id         | integer                     | not null default nextval('participants_id_seq'::regclass)
 race_id    | integer                     | not null
 created_at | timestamp without time zone |
 updated_at | timestamp without time zone |
Indexes:
    "participants_pkey" PRIMARY KEY, btree (id)
    "index_participants_on_race_id" btree (race_id)
# select count(0) from races;
count
-------
5000
(1 row)
# select count(0) from participants;
count
--------
999199
(1 row)
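For anyone reproducing this: I haven't shown the sim generator itself, but a minimal sketch of generating data like this with generate_series (assuming races ids run contiguously from 1 to 5000) would be:

```sql
-- Hypothetical data generator, assuming the schema above.
INSERT INTO races (title, created_at, updated_at)
SELECT 'Race ' || n, now(), now()
FROM generate_series(1, 5000) AS n;

-- Random race_id in 1..5000 for each of ~1 million participants.
INSERT INTO participants (race_id, created_at, updated_at)
SELECT (random() * 4999)::int + 1, now(), now()
FROM generate_series(1, 1000000) AS n;
```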
I tried 2 versions and they came out pretty similar; I think Postgres is just rewriting them into our original query underneath.
I could have sworn it used the index when I first tried it, but now it doesn't, blah.
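If you want to see whether the planner can use index_participants_on_race_id for this aggregate at all, one trick (a standard Postgres planner knob, not something from the original session) is to discourage sequential scans in a test session and re-run EXPLAIN:

```sql
-- Planner experiment only; don't leave this set in production.
SET enable_seqscan = off;
EXPLAIN ANALYZE
SELECT race_id, count(race_id) AS participant_count
FROM participants
GROUP BY race_id
ORDER BY participant_count DESC
LIMIT 5;
RESET enable_seqscan;
```

If the plan still shows a Seq Scan after this, the planner has concluded the index genuinely can't help here; if it switches to an index scan, compare the actual times to see whether its original choice was right.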