Over the weekend, I decided to try running a Postgres database in my Homelab. In my current setup, the most convenient option for storage is NFS. However, NFS is especially tricky for databases. A misconfigured setup can lead to performance or data corruption issues.
After watching this amazing talk, I realized that as long as we can guarantee that WAL (Write-Ahead Log) buffers are written to the network storage, the database should be able to recover from errors after a failure. Armed with this knowledge, I made an attempt at storing Postgres data on an NFS share. In the rest of this post, I'll share more details about the requirements for safely running Postgres on NFS and the specific NFS options I used.
Starting with the official docs, there are two important considerations for NFS:
The only firm requirement for using NFS with PostgreSQL is that the file system is mounted using the hard option.
And
It is not necessary to use the sync mount option. The behavior of the async option is sufficient […] However, it is strongly recommended to use the sync export option on the NFS server on systems where it exists (mainly Linux).
Adding the hard option to the client mount options is straightforward, but the second recommendation requires a bit more attention and explanation.
The sync export option on the NFS server differs from the sync mount option on the client. When a client mounts an NFS share as async, it buffers writes and transmits them to the server with a delay (the timing of transmission depends on the underlying implementation). While this can enhance client write performance, it does so at the expense of data durability: if the client machine crashes before transmitting the writes to the server, some data may be lost. This scenario is not unique to NFS, as local file systems behave similarly. A database such as Postgres is designed to recover from such situations by replaying the Write-Ahead Log, provided it has been committed to stable storage. That's why Postgres invokes fsync on WAL files.
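To make the flush-then-fsync pattern concrete, here is a minimal sketch of what committing a WAL record to stable storage looks like at the system-call level. The helper name is my own; Postgres's actual WAL machinery is far more involved:

```python
import os


def durable_write(path, data):
    """Write data and force it to stable storage, mirroring the
    pattern Postgres uses for WAL records before acknowledging a
    commit."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # drain the userspace buffer to the kernel
        os.fsync(f.fileno())  # ask the kernel to commit to stable storage
```

On an NFS mount, that final fsync is exactly the call whose success must imply the data has reached the server's disks, which is what the sync export option guarantees.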
Enabling async on the server side allows it to respond to requests before committing changes to storage. However, in the event of an unexpected crash or power loss, data loss may occur. By enabling the sync option on the NFS server, we ensure that transmitted data is written to storage immediately. So as long as fsync calls succeed, we can be sure that the WAL files have reached the remote storage.
Various factors, including network failures, can lead to fsync call failures. As a safety measure, Postgres intentionally panics in response to these failures. fsync failures are therefore one of the many reasons to run Postgres in a highly available setup.
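The reasoning behind the panic can be sketched in a few lines: fsync reports failure via an error return, and after a failed fsync the kernel may already have dropped the dirty pages, so retrying and getting a "success" proves nothing. A hypothetical application-level handler might look like this:

```python
import os


def flush_or_die(fd):
    """Propagate an fsync failure instead of retrying. After a failed
    fsync the kernel may have discarded the dirty pages, so a retry
    that succeeds is not evidence the data is on disk; Postgres
    reacts to this situation by panicking."""
    try:
        os.fsync(fd)
    except OSError as exc:
        raise SystemExit(f"fsync failed, shutting down: {exc}")
```

Crashing and replaying the WAL from a known-good state is safer than continuing with writes that may silently have been lost.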
I’m using a Synology NAS, so enabling the sync option means leaving the Enable asynchronous box unchecked in the NFS rule of the share:

Which roughly translates to the following line in the /etc/exports file:
/path/to/share <IP_CIDR>(rw,sync,no_wdelay,no_root_squash,sec=sys)
And with the help of these two pages, I ended up with the following mount options for the client:
defaults,vers=4.1,proto=tcp,suid,rw,timeo=600,retrans=2,hard,fg,rsize=8192,wsize=8192,noatime,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0
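To make the mount persistent across reboots, the same options can go in /etc/fstab on the client. The server address and mount point below are placeholders for illustration:

```
# /etc/fstab entry (server address and paths are placeholders)
<NAS_IP>:/path/to/share  /mnt/pgdata  nfs  defaults,vers=4.1,proto=tcp,suid,rw,timeo=600,retrans=2,hard,fg,rsize=8192,wsize=8192,noatime,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0  0  0
```

Note that setting acregmin/acregmax/acdirmin/acdirmax to 0 effectively disables attribute caching, trading some metadata performance for consistency between clients.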
In this post we reviewed the requirements for safely running Postgres on NFS. We also touched upon fsync and Postgres's safeguards against fsync failures.
While our main focus was on NFS setup for Postgres, the information is transferable to other databases and storage systems.