Using PostgreSQL 10 on a spinning disk.
postgres=# \d+ t
                                                    Table "public.t"
  Column  |  Type   | Collation | Nullable |            Default            | Storage  | Stats target | Description
----------+---------+-----------+----------+-------------------------------+----------+--------------+-------------
 id       | integer |           | not null | nextval('t_id_seq'::regclass) | plain    |              |
 the_data | jsonb   |           |          |                               | extended |              |
Indexes:
    "t_pkey" PRIMARY KEY, btree (id)
    "t_to_tsvector_idx" gin (to_tsvector('english'::regconfig, the_data))
The jsonb values are ~500 bytes in size each.
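For context, the schema above can be reproduced with something like the following (a sketch; the actual DDL wasn't shown, so the load step is omitted):

```sql
-- Sketch of the table from the \d+ output above.
CREATE TABLE t (
    id       serial PRIMARY KEY,  -- produces the nextval('t_id_seq') default
    the_data jsonb                -- ~500-byte documents
);

-- Full-text GIN index over the jsonb contents; PostgreSQL 10+
-- accepts jsonb directly in to_tsvector. The auto-generated
-- name matches t_to_tsvector_idx above.
CREATE INDEX ON t USING gin (to_tsvector('english', the_data));
```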
postgres=# EXPLAIN ANALYZE SELECT COUNT(*) FROM t;
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=158885.61..158885.62 rows=1 width=8) (actual time=18941.636..18941.636 rows=1 loops=1)
   ->  Gather  (cost=158885.40..158885.61 rows=2 width=8) (actual time=18941.624..18941.631 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=157885.40..157885.41 rows=1 width=8) (actual time=18930.993..18930.993 rows=1 loops=3)
               ->  Parallel Seq Scan on t  (cost=0.00..155389.12 rows=998512 width=0) (actual time=1781.971..18877.031 rows=799319 loops=3)
 Planning time: 65.893 ms
 Execution time: 18943.667 ms
(8 rows)
Unfortunately I didn't use EXPLAIN (BUFFERS) when I ran this the first time, so we only have the ANALYZE output. Rerunning with buffer statistics is much faster, presumably because the first run pulled most of the table into cache:
postgres=# EXPLAIN (BUFFERS, ANALYZE) SELECT COUNT(*) FROM t;
                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=158885.61..158885.62 rows=1 width=8) (actual time=216.548..216.548 rows=1 loops=1)
   Buffers: shared hit=1428 read=48220
   ->  Gather  (cost=158885.40..158885.61 rows=2 width=8) (actual time=216.539..216.544 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=1428 read=48220
         ->  Partial Aggregate  (cost=157885.40..157885.41 rows=1 width=8) (actual time=214.568..214.568 rows=1 loops=3)
               Buffers: shared hit=2387 read=143017
               ->  Parallel Seq Scan on t  (cost=0.00..155389.12 rows=998512 width=0) (actual time=8.865..167.778 rows=799319 loops=3)
                     Buffers: shared hit=2387 read=143017
 Planning time: 0.047 ms
 Execution time: 218.602 ms
postgres=# select count(*) from t;
  count
---------
 2397957
(1 row)
Time: 180.140 ms
postgres=# VACUUM (ANALYZE) t;
VACUUM
Time: 25379.988 ms (00:25.380)
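The speedup that follows comes from VACUUM updating the table's visibility map: once pages are marked all-visible, an index-only scan can answer COUNT(*) without fetching heap pages at all. One way to see how much of the table is all-visible (a sketch; pg_class keeps a per-table page count and an all-visible page count):

```sql
-- Fraction of the table's pages marked all-visible in the visibility map;
-- close to 1.0 means index-only scans rarely need heap fetches.
SELECT relpages,
       relallvisible,
       round(relallvisible::numeric / NULLIF(relpages, 0), 2) AS all_visible_frac
FROM pg_class
WHERE relname = 't';
```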
After that:
postgres=# EXPLAIN (BUFFERS, ANALYZE) SELECT COUNT(*) FROM t;
                                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=54822.69..54822.70 rows=1 width=8) (actual time=132.144..132.144 rows=1 loops=1)
   Buffers: shared hit=52 read=2164 written=404
   ->  Gather  (cost=54822.48..54822.69 rows=2 width=8) (actual time=132.133..132.140 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=52 read=2164 written=404
         ->  Partial Aggregate  (cost=53822.48..53822.49 rows=1 width=8) (actual time=129.863..129.863 rows=1 loops=3)
               Buffers: shared hit=155 read=6424 written=1238
               ->  Parallel Index Only Scan using t_pkey on t  (cost=0.43..51330.43 rows=996818 width=0) (actual time=0.023..84.308 rows=799319 loops=3)
                     Heap Fetches: 0
                     Buffers: shared hit=155 read=6424 written=1238
 Planning time: 0.114 ms
 Execution time: 134.729 ms
Now a Parallel Index Only Scan is used, and non-EXPLAIN timings (with \timing) have improved from 180 ms to 80 ms.
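To keep it that way, the visibility map has to stay current, which autovacuum normally handles. A quick way to check that vacuum is keeping up with the table (a sketch using the standard pg_stat_user_tables statistics view):

```sql
-- Last manual and automatic vacuum times, plus the current dead-tuple
-- estimate; a large n_dead_tup suggests the visibility map is going stale.
SELECT last_vacuum, last_autovacuum, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 't';
```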