This table has 1,000,000 rows, consisting of a UUIDv4 PK, a random int of range (1,1000000), and ~1 KiB of Lorem Ipsum text.
postgres=# \d+ uuid_pk
Table "public.uuid_pk"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
---------+---------+-----------+----------+---------+----------+-------------+--------------+-------------
id | uuid | | not null | | plain | | |
user_id | integer | | not null | | plain | | |
lorem | text | | not null | | extended | | |
Indexes:
"uuid_pk_pkey" PRIMARY KEY, btree (id)
Access method: heap
postgres=# SELECT
pg_size_pretty(index_size) AS index_size,
internal_pages, leaf_pages,
avg_leaf_density,
leaf_fragmentation
FROM pgstatindex('uuid_pk_pkey');
index_size | internal_pages | leaf_pages | avg_leaf_density | leaf_fragmentation
------------+----------------+------------+------------------+--------------------
38 MB | 24 | 4854 | 71.15 | 50.02
(1 row)
postgres=# SELECT
pg_size_pretty(table_len) AS table_size,
tuple_count,
tuple_percent,
pg_size_pretty(free_space) AS free_space,
free_percent
FROM pgstattuple('public.uuid_pk');
table_size | tuple_count | tuple_percent | free_space | free_percent
------------+-------------+---------------+------------+--------------
1116 MB | 1000000 | 91.86 | 78 MB | 7.03
(1 row)
The table was then clustered using the UUID PK:
postgres=# CLUSTER VERBOSE uuid_pk USING uuid_pk_pkey;
INFO: clustering "public.uuid_pk" using sequential scan and sort
INFO: "public.uuid_pk": found 0 removable, 1000000 nonremovable row versions in 142858 pages
DETAIL: 0 dead row versions cannot be removed yet.
CPU: user: 5.48 s, system: 10.02 s, elapsed: 49.93 s.
CLUSTER
postgres=# SELECT
pg_size_pretty(index_size) AS index_size,
internal_pages, leaf_pages,
avg_leaf_density,
leaf_fragmentation
FROM pgstatindex('uuid_pk_pkey');
index_size | internal_pages | leaf_pages | avg_leaf_density | leaf_fragmentation
------------+----------------+------------+------------------+--------------------
30 MB | 20 | 3832 | 90.03 | 0
(1 row)
The table was analyzed to pick up the new changes, then an EXPLAIN (ANALYZE, BUFFERS) SELECT COUNT(*)
was ran:
postgres=# EXPLAIN (ANALYZE, BUFFERS) SELECT COUNT(*) FROM uuid_pk;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=26621.03..26621.04 rows=1 width=8) (actual time=315.899..329.927 rows=1 loops=1)
Buffers: shared hit=47 read=3802
-> Gather (cost=26620.82..26621.03 rows=2 width=8) (actual time=315.495..329.878 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=47 read=3802
-> Partial Aggregate (cost=25620.82..25620.83 rows=1 width=8) (actual time=257.535..257.537 rows=1 loops=3)
Buffers: shared hit=47 read=3802
-> Parallel Index Only Scan using uuid_pk_pkey on uuid_pk (cost=0.42..24579.15 rows=416669 width=0) (actual time=0.221..200.480 rows=333333 loops=3)
Heap Fetches: 0
Buffers: shared hit=47 read=3802
Planning:
Buffers: shared hit=14
Planning Time: 1.130 ms
Execution Time: 330.379 ms
Next, the table was re-clustered using user_id
, which is a random int – this has the effect of randomly ordering the UUID.
To do this, a temporary index was created on the user_id
column.
postgres=# CREATE INDEX uuid_pk_int ON uuid_pk (user_id);
CREATE INDEX
Time: 1783.558 ms (00:01.784)
postgres=# CLUSTER VERBOSE uuid_pk USING uuid_pk_int;
INFO: clustering "public.uuid_pk" using sequential scan and sort
INFO: "public.uuid_pk": found 0 removable, 1000000 nonremovable row versions in 142858 pages
DETAIL: 0 dead row versions cannot be removed yet.
CPU: user: 5.06 s, system: 8.81 s, elapsed: 49.16 s.
CLUSTER
Following another analyze, the first SELECT COUNT(*)
preferred to use the index on the integer column.
Notably, it takes about 73% of the time to do the same as with the UUID PK, due to fewer pages being
read (2728 vs 3802).
postgres=# EXPLAIN (ANALYZE, BUFFERS) SELECT COUNT(*) FROM uuid_pk;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=22161.03..22161.04 rows=1 width=8) (actual time=233.789..239.188 rows=1 loops=1)
Buffers: shared hit=15 read=2728
-> Gather (cost=22160.82..22161.03 rows=2 width=8) (actual time=233.630..239.174 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=15 read=2728
-> Partial Aggregate (cost=21160.82..21160.83 rows=1 width=8) (actual time=186.978..186.980 rows=1 loops=3)
Buffers: shared hit=15 read=2728
-> Parallel Index Only Scan using uuid_pk_int on uuid_pk (cost=0.42..20119.15 rows=416669 width=0) (actual time=0.314..133.483 rows=333333 loops=3)
Heap Fetches: 0
Buffers: shared hit=15 read=2728
Planning:
Buffers: shared hit=27
Planning Time: 0.569 ms
Execution Time: 239.440 ms
That index was then dropped, and the query repeated. Here, despite nearly all pages being in shared_buffers
, due
to the massive fragmentation causing visibility map lookup misses, leading to 783770 (plus 3834) pages being read.
This took about 85% longer to run than with the integer index, and about 28% longer than when clustered.
postgres=# EXPLAIN (ANALYZE, BUFFERS) SELECT COUNT(*) FROM uuid_pk;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=26621.03..26621.04 rows=1 width=8) (actual time=410.897..422.809 rows=1 loops=1)
Buffers: shared hit=783770 read=3834
-> Gather (cost=26620.82..26621.03 rows=2 width=8) (actual time=410.493..422.775 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=783770 read=3834
-> Partial Aggregate (cost=25620.82..25620.83 rows=1 width=8) (actual time=360.653..360.655 rows=1 loops=3)
Buffers: shared hit=783770 read=3834
-> Parallel Index Only Scan using uuid_pk_pkey on uuid_pk (cost=0.42..24579.15 rows=416669 width=0) (actual time=0.305..303.873 rows=333333 loops=3)
Heap Fetches: 0
Buffers: shared hit=783770 read=3834
Planning:
Buffers: shared hit=6 dirtied=1
Planning Time: 0.435 ms
Execution Time: 422.962 ms
On a cold run (stop Postgres, drop OS caches, restart Postgres), it's understandably even worse:
postgres=# EXPLAIN (ANALYZE, BUFFERS) SELECT COUNT(*) FROM uuid_pk;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=26620.97..26620.98 rows=1 width=8) (actual time=760.739..771.814 rows=1 loops=1)
Buffers: shared hit=783756 read=3839
-> Gather (cost=26620.76..26620.97 rows=2 width=8) (actual time=760.466..771.794 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=783756 read=3839
-> Partial Aggregate (cost=25620.76..25620.77 rows=1 width=8) (actual time=744.637..744.639 rows=1 loops=3)
Buffers: shared hit=783756 read=3839
-> Parallel Index Only Scan using uuid_pk_pkey on uuid_pk (cost=0.42..24579.09 rows=416667 width=0) (actual time=2.078..683.930 rows=333333 loops=3)
Heap Fetches: 0
Buffers: shared hit=783756 read=3839
Planning:
Buffers: shared hit=40 read=21
Planning Time: 33.432 ms
Execution Time: 773.579 ms
In comparison, a cold run with the index on the integer column:
postgres=# EXPLAIN (ANALYZE, BUFFERS) SELECT COUNT(*) FROM uuid_pk;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=22161.03..22161.04 rows=1 width=8) (actual time=549.814..559.344 rows=1 loops=1)
Buffers: shared hit=10 read=2733
-> Gather (cost=22160.82..22161.03 rows=2 width=8) (actual time=549.634..559.323 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=10 read=2733
-> Partial Aggregate (cost=21160.82..21160.83 rows=1 width=8) (actual time=531.750..531.752 rows=1 loops=3)
Buffers: shared hit=10 read=2733
-> Parallel Index Only Scan using uuid_pk_int on uuid_pk (cost=0.42..20119.15 rows=416669 width=0) (actual time=1.832..462.815 rows=333333 loops=3)
Heap Fetches: 0
Buffers: shared hit=10 read=2733
Planning:
Buffers: shared hit=53 read=27 dirtied=2
Planning Time: 31.980 ms
Execution Time: 561.492 ms