Skip to content

Instantly share code, notes, and snippets.

@NikolayS
Last active February 18, 2019 03:28
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save NikolayS/bb5859f13647eeb4c26a49c2e8227e68 to your computer and use it in GitHub Desktop.
Save NikolayS/bb5859f13647eeb4c26a49c2e8227e68 to your computer and use it in GitHub Desktop.
Why index maintenance (e.g. with pg_repack) is an inevitable thing?
test=# \timing
Timing is on.
test=#
test=#
test=# create table t1 as select i from generate_series(1, 1000000) _(i);
SELECT 1000000
Time: 660.804 ms
test=# create unique index i_t1 on t1(i);
CREATE INDEX
Time: 301.463 ms
test=#
test=# \dt+ t1
                   List of relations
 Schema | Name | Type  |  Owner   | Size  | Description
--------+------+-------+----------+-------+-------------
 public | t1   | table | postgres | 35 MB |
(1 row)

test=# \di+ i_t1
                       List of relations
 Schema | Name | Type  |  Owner   | Table | Size  | Description
--------+------+-------+----------+-------+-------+-------------
 public | i_t1 | index | postgres | t1    | 21 MB |
(1 row)

test=#
test=# -- Now let's delete 50% of rows (every second one)
test=# delete from t1 where i % 2 = 0;
DELETE 500000
Time: 419.494 ms
test=#
test=# vacuum analyze t1;
VACUUM
Time: 249.512 ms
test=#
test=# -- Space is not reclaimed, as expected:
test=# \dt+ t1
                   List of relations
 Schema | Name | Type  |  Owner   | Size  | Description
--------+------+-------+----------+-------+-------------
 public | t1   | table | postgres | 35 MB |
(1 row)

test=# \di+ i_t1
                       List of relations
 Schema | Name | Type  |  Owner   | Table | Size  | Description
--------+------+-------+----------+-------+-------+-------------
 public | i_t1 | index | postgres | t1    | 21 MB |
(1 row)

test=#
test=# -- Now we insert new 500k rows
test=# -- Postgres will use the same space since VACUUM was applied.
test=# -- IMPORTANT!! If we inserted same values (2, 4, 6, etc),
test=# -- they would go to the same positions in the index, and
test=# -- it would NOT get bloated.
test=# insert into t1 select i from generate_series(1000001, 1500000) _(i);
INSERT 0 500000
Time: 964.676 ms
test=# \dt+ t1
                   List of relations
 Schema | Name | Type  |  Owner   | Size  | Description
--------+------+-------+----------+-------+-------------
 public | t1   | table | postgres | 35 MB |
(1 row)

test=# \di+ i_t1
                       List of relations
 Schema | Name | Type  |  Owner   | Table | Size  | Description
--------+------+-------+----------+-------+-------+-------------
 public | i_t1 | index | postgres | t1    | 32 MB |
(1 row)

test=# -- As we can see, table size hasn't changed – we re-used "gaps".
test=# -- But for the index, it's different. We inserted new values,
test=# -- they went into the "right" side of the B-tree, and
test=# -- the index size grown. "Gaps" remained untouched.
test=#
test=# -- So the index is 1.5 times bigger than it could be, 33% of it is bloat.
test=# -- VACUUM FULL proves it:
test=# vacuum full t1;
VACUUM
Time: 858.090 ms
test=# \dt+ t1
                   List of relations
 Schema | Name | Type  |  Owner   | Size  | Description
--------+------+-------+----------+-------+-------------
 public | t1   | table | postgres | 35 MB |
(1 row)

test=# \di+ i_t1
                       List of relations
 Schema | Name | Type  |  Owner   | Table | Size  | Description
--------+------+-------+----------+-------+-------+-------------
 public | i_t1 | index | postgres | t1    | 21 MB |
(1 row)

test=#
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment