
@kitallis
Last active Sep 5, 2017
some notes on using pg_repack

so if the table is 350 GB, free space should be at least 700 GB (pg_repack builds a full new copy of the table alongside the old one)

given your current DB = 900 GB and your disk = 1.3 TB, you should bump it up to at least 1.6 TB

that will ensure ~20% free space plus room for a full additional copy of the table
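The sizing rule above can be sanity-checked with some quick shell arithmetic (the 350 GB table / 900 GB DB figures are the hypothetical numbers from these notes; substitute your own):

```shell
#!/bin/sh
# Rough disk sizing before a pg_repack run.
TABLE_GB=350   # size of the table being repacked (hypothetical)
DB_GB=900      # current total database size (hypothetical)

# pg_repack writes a full new copy of the table (plus indexes),
# so budget roughly 2x the table size in free space.
NEED_FREE_GB=$((TABLE_GB * 2))

# the disk must hold the existing DB plus that working copy
MIN_DISK_GB=$((DB_GB + NEED_FREE_GB))

echo "free space needed: ${NEED_FREE_GB} GB"
echo "minimum disk size: ${MIN_DISK_GB} GB"
```

That gives 700 GB of required free space and a ~1.6 TB disk, matching the figures above. The actual on-disk size of a table, including indexes and TOAST, can be read with `SELECT pg_size_pretty(pg_total_relation_size('your_table'));`.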

things to keep in mind before using repack:

1. response time during repack *will* increase – make sure you do it at night
2. all queries will error out during the final swap – expect downtime of a minute or two
3. your disk will fill up faster than usual post-repack: under normal circumstances Postgres reuses holes in existing pages, but after a repack (or VACUUM FULL) it has to allocate fresh pages; this normalizes once the table reaches its usual growth pattern
4. if your table/DB grows too fast, you will have to repeat this process soon – e.g. if you reclaim 50 GB and your table grows by 50 GB every 5 days, you're back to square one after 5 days
5. you can parallelize index rebuilds in repack (the --jobs flag)
6. it runs in transactions and creates a temporary log table, so exiting in the middle is clean
7. cpu *will* spike during repack
8. before running, do a test on a snapshot to estimate the swap time
9. do a dry run on the production table with debug logs on (the --dry-run flag in repack)
10. target table must have a PRIMARY KEY, or at least a UNIQUE total index on a NOT NULL column
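Putting points 5, 9, and 10 together, a cautious sequence might look like this sketch. The database and table names are hypothetical; `--dry-run`, `--jobs`, `--elevel`, and `--table` are real pg_repack options, but verify against your installed version:

```shell
#!/bin/sh
# Hypothetical names: database "app", table "public.events".
DB=app
TABLE=public.events

# 10. confirm the table has a PRIMARY KEY (or a unique, non-partial
#     index on NOT NULL columns) before anything else
psql -d "$DB" -c "\d $TABLE"

# 9. dry run with debug-level logging – no changes are made
pg_repack --dry-run --elevel=DEBUG --table="$TABLE" --dbname="$DB"

# 5. the real run, parallelizing index rebuilds across 4 workers
pg_repack --jobs=4 --table="$TABLE" --dbname="$DB"
```

Run the dry-run step against production well before the maintenance window, and the real run at night (per point 1).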

read this before doing it https://github.com/reorg/pg_repack/blob/master/doc/pg_repack.rst#details

this is how it replays logs on the main table: https://github.com/reorg/pg_repack/blob/master/lib/repack.c#L756

in my experience, repack hits diminishing returns very quickly on large, fast-updating tables – eventually you will have to shard
