Skip to content

Instantly share code, notes, and snippets.

@openfirmware
Last active January 17, 2024 11:15
Show Gist options
  • Save openfirmware/ffca92fc4e82ccbb3856 to your computer and use it in GitHub Desktop.
Save openfirmware/ffca92fc4e82ccbb3856 to your computer and use it in GitHub Desktop.
Restoring ZFS Snapshots of PostgreSQL Database

ZFS Snapshots and PostgreSQL

I recently set up a host running Ubuntu 14.04 LTS, ubuntu-zfs, and PostgreSQL 9.3. Using the ZFS snapshot feature I was able to make a snapshot of the file system holding the PostgreSQL database, write the snapshot to a compressed file, transfer it to another Ubuntu 14.04 LTS host running ubuntu-zfs, restore it, and launch PostgreSQL using that data directory with all the data (seemingly) intact.

The database is very low-traffic, so I do not know how effective this strategy would be for a busier database.

I take no responsibility if you attempt this setup and it causes data loss or other issues. Use it at your own risk; this document is meant to explain a potential setup and open discussion about limitations and potential improvements.

First Host Setup

I started by creating a pool to hold the PostgreSQL DB, on a simple block file. It isn't recommended to use file stores, as ZFS is better on actual devices, but this is just a quick test.

$ fallocate -l 10G poolfile.zfs
$ sudo zpool create -f -O compression=on -o ashift=12 -o listsnapshots=on -o recordsize=8k -o primarycache=metadata pg-pool /home/vagrant/poolfile.zfs

This creates a pool called pg-pool. We can see it using zpool:

$ sudo zpool list
NAME      SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
pg-pool  9.94G   820K  9.94G     0%  1.00x  ONLINE  -

Next, I installed PostgreSQL 9.3. If you are installing extension (e.g. PostGIS) then they can be added to this step.

$ sudo apt-get install postgresql-9.3 postgresql-contrib-9.3

Once installed, we will copy the default DB cluster to the ZFS pool and change the configuration.

$ sudo service postgresql stop
 * Stopping PostgreSQL 9.3 database server
 ...done.
$ sudo cp -r /var/lib/postgresql /pg-pool/postgresql

Then edit the PostgreSQL configuration to point to the new data directory:

$ sudo vim /etc/postgresql/9.3/main/postgresql.conf
Goto about line 41, and change data_directory:
data_directory = '/pg-pool/postgresql/9.3/main'

Once done and saved, try starting PostgreSQL.

$ sudo service postgresql start
 * Starting PostgreSQL 9.3 database server
 ...done.

Good, it is now running. Try creating some users and databases:

$ sudo -u postgres psql
psql (9.3.4)
Type "help" for help.

postgres=# CREATE ROLE alice;
CREATE ROLE
postgres=# CREATE ROLE bob;
CREATE ROLE
postgres=# CREATE DATABASE alice_db OWNER alice;
CREATE DATABASE
postgres=# CREATE DATABASE bob_db OWNER bob;
CREATE DATABASE
postgres=#\q

Next is creating a snapshot and saving it to a file. With ZFS, creating snapshots is instantaneous. For an example, I will use low Gzip compression on the file; you can use higher compression (-9) to speed up transferring it to another host.

$ sudo zfs snapshot pg-pool@snapshot-1
$ sudo zfs send pg-pool@snapshot-1 | gzip -1 > zfs-backup.gz

This results in a Gzip file we can then transfer to our second host.

Second Host Setup

As this host is for testing, we will set up a simple ZFS pool based on a file store.

$ fallocate -l 10G poolfile.zfs
$ sudo zpool create -f -O compression=on -o ashift=12 -o listsnapshots=on -o recordsize=8k -o primarycache=metadata pg-pool /home/vagrant/poolfile.zfs

Next, copy the compressed snapshot file from the first host to the second. I am using vagrant, so I placed it in the same directory as my Vagrantfile and it is automatically shared at /vagrant in my VM. We will restore it to a directory inside the Second Host's pool, rather than replacing the pool completely.

$ gunzip -c -d /vagrant/zfs-backup.gz | sudo zfs recv pg-pool/restore
$ ls /pg-pool/restore
postgresql

As we can see, the restore has succeeded in decompressing the snapshot and placing the data back onto a file system (even at a different path). Next, install PostgreSQL and change the data directory to the restored snapshot:

$ sudo apt-get install postgresql-9.3 postgresql-contrib-9.3
$ sudo service postgresql stop
 * Stopping PostgreSQL 9.3 database server
 ...done.
$ sudo vim /etc/postgresql/9.3/main/postgresql.conf
Goto about line 41, and change data_directory:
data_directory = '/pg-pool/restore/postgresql/9.3/main'

This should tell PostgreSQL to simply "resume" the data directory as if it were its own, and continue from where the PostgreSQL on the First Host left off at the time of the snapshot.

$ sudo service postgresql start
 * Starting PostgreSQL 9.3 database server
   ...done.
$ sudo -u postgres psql
psql (9.3.4)
Type "help" for help.

postgres=# \du
... role information ...

In the role information, it should list the roles we created on the first host.

postgres=# \l
... database information ...

The database information should also include the database from the first host.

And that is more or less my working setup. There is potential for improvements, such as incremental snapshots and automated snapshots to files.

Why not use PostgreSQL's streaming replication?

Replication is not a replacement for backups, as previous versions of the database are not kept. Simply replicating ZFS file systems from one host to another host is also susceptible to the same limitations. Instead, versioned backups (preferably stored on multiple hosts) are much more effective in allowing you to restore a database to any previous version.

There is also a risk of an errant zpool destroy command deleting all your snapshots in the ZFS filesystem. By keeping snapshots on multiple hosts or in compressed files, this risk can be mitigated.

@xor-gate
Copy link

You need to use pg_backup_start/stop and CHECKPOINT or else the buffers are not correcly flushed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment