Skip to content

Instantly share code, notes, and snippets.

@jwhiting
Created May 14, 2018 22:00
Show Gist options
  • Star 9 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save jwhiting/8d1c9f87466bb6e7df6d188bd0ea45ee to your computer and use it in GitHub Desktop.
Save jwhiting/8d1c9f87466bb6e7df6d188bd0ea45ee to your computer and use it in GitHub Desktop.
Warming up EBS volumes on RDS replicas/snapshots (ruby script)
#!/usr/bin/env ruby
require 'pg'
=begin
when creating a postgres db replica in RDS, or restoring from a snapshot, the
underlying EBS volume of the new instance must be initialized by reading every
block, otherwise the blocks will be lazy-initialized by production queries
which will be extremely latent. (i've seen normally 50ms queries take 30s in
some cases on the fresh EBS instance.) because RDS is managed, we do not have
filesystem access, so must use postgres to read the full volume and bring
the replica up to normal disk performance.
for this script to work, several preconditions exist:
1. the pg_prewarm extension must be created. so if you want to warm up a hot
standby replica, run the create extension command on the primary if it doesn't
have it yet. if this is a restore from a backup and the extension doesn't
exist in the backup, create it on the instance directly.
CREATE EXTENSION IF NOT EXISTS pg_prewarm;
2. if running the warmup on a hot standby replica, the replica must have
hot_standby_feedback enabled. because these warmup queries are very
long-running, without feedback from the replica the primary will vacuum pages
that the replica still needs to access, causing the query to be cancelled.
there are other solutions to preventing query conflicts on a replica but this
is the simplest IMO. in order to turn on hot_standby_feedback, you must create
a "parameter group" in RDS console that is based on the default postgres
parameter group, change the hot_standby_feedback to "1" (on), assign this
parameter group to the replica, then reboot the replica.
3. pg gem is installed.
once the replica is ready for the warmup queries, this script can be run
in the following way to set the appropriate env vars:
DB_HOST=<replica endpoint> \
DB_NAME=<db name> \
DB_USER=<username> \
DB_PASS=<password> \
bundle exec ./prewarm_db.rb
=end
puts "connecting: db=#{ENV['DB_NAME']} host=#{ENV['DB_HOST']} user=#{ENV['DB_USER']} password=(not shown)"
conn = PG.connect({
:dbname => ENV['DB_NAME'],
:host => ENV['DB_HOST'],
:user => ENV['DB_USER'],
:password => ENV['DB_PASS'],
})
all_start = Time.now
conn.exec("
SELECT c.oid, relkind, c.relname, c.relpages
FROM pg_class c
JOIN pg_user u ON u.usesysid = c.relowner
WHERE u.usename NOT IN (
'rdsadmin', 'rdsrepladmin', ' pg_signal_backend', 'rds_superuser',
'rds_replication')
ORDER BY c.relpages DESC
") do |res|
res.each do |rel|
start = Time.now
puts "warming up #{rel['relname']} (pages=#{rel['relpages']}, kind=#{rel['relkind']})"
conn.exec("select pg_prewarm(#{rel['oid'].to_i}::regclass)")
finish = Time.now
puts "completed #{rel['relname']} in #{(finish-start).to_i} seconds"
end
end
all_finish = Time.now
puts "all completed in #{((all_finish - all_start) / 60.0).to_i} minutes"
@brauliobo
Copy link

thanks!

@hazemkmammu
Copy link

Thank you for sharing this script!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment