@masonjm
Created October 8, 2014 17:35

PostgreSQL Replication

PostgreSQL takes the replication strategy of MySQL and breaks it into two pieces:

  • Streaming Replication
  • Log Shipping Replication

While MySQL combines both of those into a single process, PostgreSQL splits them apart and lets you use and configure one or both independently.

Streaming Replication

Streaming replication is superficially very similar to MySQL's replication. The slave database makes a normal connection to the master and receives live updates as they happen.
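A minimal streaming setup might look like the following (PostgreSQL 9.x-era syntax; the host, user, and password are placeholders, not values from this cluster):

```
# master: postgresql.conf
wal_level = hot_standby      # write enough WAL detail for a standby
max_wal_senders = 3          # allow streaming connections

# master: pg_hba.conf -- permit the slave's replication user
host  replication  replica_user  10.0.0.2/32  md5

# slave: recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=10.0.0.1 user=replica_user password=secret'
```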

While streaming replication is great for up-to-the-second updates, it doesn't work well if a server is offline for an extended period. By default, the master doesn't retain old transaction logs for a disconnected slave, so the slave can't use streaming replication alone to "catch up" after a restart.

Log Shipping (a.k.a. Transaction Log Archiving)

In place of MySQL's binary log, PostgreSQL uses a transaction log (or "xlog"). PostgreSQL is smarter about managing its xlogs: it automatically recycles old log segments to prevent ever-increasing disk usage.

To facilitate backup restoration and replication, PostgreSQL includes a provision to archive old xlogs when they are "flushed." The configuration contains a simple command template that the server will execute whenever an xlog file is ready to go. You can get as complicated as you want with the command, but a simple cp is enough to copy the xlog to an archive location. You could also set up scp or rsync to copy the files to a remote server.
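For example, the simple cp case can be wired up in postgresql.conf like this (%p and %f are expanded by the server to the xlog's full path and file name; the /pg_archive/VERSION path follows the layout used on these servers):

```
# postgresql.conf on the master
archive_mode = on
# Refuse to overwrite an existing archive file, then copy.
archive_command = 'test ! -f /pg_archive/VERSION/archive/%f && cp %p /pg_archive/VERSION/archive/%f'
```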

On the other end of the replication, the slave can be configured to read archived xlogs from a directory accessible to the server. If the slave has been offline for an extended period, it will apply the transactions from the archived logs until it is caught up. If streaming replication is configured, the slave will switch to that once it is caught up. If streaming replication has not been configured, it will continue reading new xlog files as they are added to the archive.
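On the slave, the archive directory is wired up through restore_command in recovery.conf; the primary_conninfo line is only needed if streaming replication should take over once the slave catches up (the connection details here are placeholders):

```
# recovery.conf on the slave
standby_mode = 'on'
restore_command = 'cp /pg_archive/VERSION/archive/%f %p'
# Optional: stream from the master after catching up from the archive.
primary_conninfo = 'host=master.example.com user=replica_user'
```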

Backup and Recovery

In addition to replication, xlog archiving is also helpful for restoring database backups. A full backup includes a snapshot of the database along with xlogs for all transactions applied between the start and completion of the backup. The restore process can also leverage any archived xlogs as a sort of "incremental backup" feature to catch up to a running master server.
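As a sketch, a base backup of this kind can be taken with pg_basebackup (the host, user, and output directory are placeholders; --xlog bundles the transaction logs written while the backup runs):

```
# Take a full base backup, including the xlogs generated during it.
pg_basebackup -h master.example.com -U replica_user \
    --pgdata=/pg_archive/VERSION/backups/$(date +%Y%m%d) --xlog
```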

WebApp PostgreSQL Configuration

The WebApp PostgreSQL servers leverage a shared NFS mount to facilitate production xlog archive replication, production database replication and QA database recreation.

All production servers in the cluster, as well as any QA servers that represent the cluster, have access to the same /pg_archive NFS mount. This mount contains the following folders:

  • /pg_archive/VERSION/backups: Two full database backups, taken weekly
  • /pg_archive/VERSION/archive: All archived xlog files since the oldest backup
  • /pg_archive/VERSION/dumps: Weekly SQL dumps of each separate database

This setup lets PostgreSQL read and write the xlog archive through "local" directories without having to copy files to remote servers. It also simplifies refreshing QA from a production snapshot, and the individual dump files make it easy to replicate a single production database in development.

The master database server has a backup management script running as a cron job that prevents "bloat" in the backup and archive folders by removing old data. The script is monitored by thresholds in ZenOSS that will alert if too many or too few backups are being kept.
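The script itself isn't shown here, but its retention logic could be sketched as a small shell function (the prune_archive name, the directory-layout argument, and the keep-two default are assumptions based on the description above, not the actual script):

```shell
# prune_archive DIR [KEEP]: keep the KEEP newest base backups under
# DIR/backups and delete archived xlogs older than the oldest kept backup.
prune_archive() {
    backup_dir="$1/backups"
    xlog_dir="$1/archive"
    keep="${2:-2}"

    # Remove all but the $keep newest backups (ls -t sorts newest first).
    ls -1t "$backup_dir" | tail -n +$((keep + 1)) | while read -r old; do
        rm -rf "$backup_dir/$old"
    done

    # Drop xlog files that are not newer than the oldest remaining backup.
    oldest=$(ls -1t "$backup_dir" | tail -n 1)
    if [ -n "$oldest" ]; then
        find "$xlog_dir" -type f ! -newer "$backup_dir/$oldest" -delete
    fi
}
```

A ZenOSS threshold on the backup count, as described above, catches the case where the cron job silently stops running.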
