Last active
August 24, 2018 18:03
-
-
Save mage2k/747674bb8a1a970007952adc283a857c to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Given a table bar that we need to compact while maximizing availability of the data in it and minimize load/IO during compaction. | |
First, if there are any tables with foriegn key columns referencing columns in bar those foreign keys need to be disabled, e.g: | |
`ALTER TABLE foo DISABLE TRIGGER fkey_bar_id` | |
Now we can work on compacting the data by copying it to a new table. | |
SQL to create a new, empty table and have the current table inherit from it: | |
``` | |
BEGIN; | |
ALTER TABLE bar RENAME TO bar_old; | |
CREATE TABLE bar (LIKE bar_old INCLUDING ALL); | |
ALTER TABLE bar_old INHERIT bar; | |
ALTER SEQUENCE bar_id_seq OWNED BY bar.id; | |
COMMIT; | |
``` | |
Now we have a new, empty table but queries against it will include all the existing data in bar_old thanks to the inheritance link. | |
Next, loop over batches (e.g. id ranges) from bar_old, deleting them and inserting them into bar (pseudocode): | |
``` | |
min_id := 0 | |
batch_size := 10000 | |
while true | |
-- This query will: | |
-- DELETE up to batch_size rows from bar_old ordered on id | |
-- INSERT the rows deleted from bar_old into bar | |
-- return a result set consisting of the ids of the rows inserted into bar | |
result = execute_sql('WITH del AS ( | |
DELETE FROM bar_old | |
WHERE id IN (SELECT id | |
FROM bar_old | |
WHERE id > #{min_id} | |
ORDER BY id | |
LIMIT #{batch_size}) | |
RETURNING *) | |
INSERT INTO bar | |
SELECT * FROM del | |
RETURNING id;') | |
break if result.count < batch_size | |
min_id := result.ids.max | |
``` | |
Confirm no more rows in bar_old, this should return nothing | |
`SELECT max(id) FROM bar_old;` | |
Drop the old table | |
`DROP TABLE bar_old;` | |
* Fix up index and foreign key names (if needed) on the new bar table | |
* Re-enable any foreign keys previously disabled. This is going to be doing a lot of reading while holdinng a lock on `foo` to check every value of `foo.bar_id` against the `bar.id` values but as long as both sides are properly indexed it shouldn't be too bad. | |
`ALTER TABLE foo ENABLE TRIGGER fkey_bar_id` | |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment