@pwillis-els
Last active November 10, 2021 17:21
Simple Atomic File Locking in Linux

If you have access to a traditional programming language, there are many methods[1] to use[2] locks in Linux[3]. However, those methods are not necessarily available from within a shell script. In addition, locking across different kinds of filesystems (such as NFS) can behave inconsistently or expose bugs.

What if you just want a very simple form of locking that works on all filesystems? One answer is Maildir-style locking. Qmail's Maildir format is specific to mail files, so I'll break the idea down in a more general way below. You don't have to follow this method strictly; the general idea can be adapted.

There are three directories:

  • tmp
  • new
  • cur

The way the locking works is like this:

  1. A process 123 creates a new file A in directory tmp.

    A is a file name which should be unique. Only process 123 knows that this file exists. Once the process has finished writing to the file, it moves to step 2. (Finishing the write first prevents any other process from picking up a partially written file.)

  2. Process 123 creates a hard link from tmp/A into new/A, and then removes tmp/A.

    At this point it is expected that some other process may now "do something" with the file in the new directory, typically reading it. If a process needs exclusive access to the file, it should follow the next step.

  3. A new process 456 creates a hard link of new/A into cur/A.

    link() is an atomic operation, even over NFS, and it fails with EEXIST when the destination already exists, so we can be sure that the operation will fail if another process has already successfully link()ed the file. If the operation succeeds, our process 456 now has exclusive control of this file.

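    (Note: rename() is not a safe substitute: it silently replaces an existing destination instead of failing, and it is not reliably atomic over NFS, which is why we use link().)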

    At this point process 456 can safely remove new/A if it wants that file to no longer be picked up by other processes. Even if another process had already seen new/A and begun to lock it, its link() would fail because cur/A already exists.
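To see step 3 in action, the sketch below starts several background workers that all try to claim the same file. Only one `ln` should succeed. The `claim` function name, the worker count, and the temporary work directory are illustrative assumptions, not part of Maildir, and the demo assumes a local filesystem.

```shell
#!/usr/bin/env bash
# Sketch: several workers race to claim new/A; only one link can be created.
work=$(mktemp -d)            # scratch directory, all on one filesystem
mkdir "$work/tmp" "$work/new" "$work/cur"

# Steps 1 and 2: write the file privately, then publish it into new/.
echo "hello world" > "$work/tmp/A"
ln "$work/tmp/A" "$work/new/A" && rm -f "$work/tmp/A"

# claim NAME (hypothetical helper): step 3.
# Succeeds only if cur/NAME did not exist yet.
claim() {
    ln "$work/new/$1" "$work/cur/$1" 2>/dev/null
}

# Start five contenders; each winner records its id in its own marker file.
for id in 1 2 3 4 5; do
    ( claim A && : > "$work/won.$id" ) &
done
wait

# Count the marker files left behind by successful claims.
winners=$(ls "$work" | grep -c '^won\.')
echo "winners: $winners"
```

Each contender runs in its own subshell, so the marker files are the only shared state; the count of markers is the number of processes that believe they hold the lock.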

Using this method you can implement an atomic locking queue by linking files. You can also use filename suffixes or prefixes rather than directories. The important part is just relying on link() for an atomic operation, and respecting a failed link() call to avoid conflicts.
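As a rough illustration of the suffix variant, the sketch below keeps everything in one directory and treats `<file>.lock` as the lock. The `.lock` suffix and the `lock_file` / `unlock_file` helper names are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Sketch: same idea in one directory, using a ".lock" suffix instead of cur/.
dir=$(mktemp -d)
echo "payload" > "$dir/job1"

# lock_file PATH: try to take the lock by hard-linking PATH to PATH.lock.
# A non-zero exit status means the link already exists (someone holds the lock).
lock_file() {
    ln "$1" "$1.lock" 2>/dev/null
}

# unlock_file PATH: release the lock by removing the extra link.
unlock_file() {
    rm -f "$1.lock"
}

if lock_file "$dir/job1"; then first=ok; else first=busy; fi
if lock_file "$dir/job1"; then second=ok; else second=busy; fi   # already locked
unlock_file "$dir/job1"
if lock_file "$dir/job1"; then third=ok; else third=busy; fi     # free again
echo "$first $second $third"
```

The second attempt fails because the lock link already exists, and the third succeeds once the link has been removed. As with the directory layout, the lock is only as good as every participant's willingness to respect a failed link.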

So in practice it could look like this:

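#!/usr/bin/env bash
# -p: do not fail if the directories already exist
mkdir -p tmp new cur
# Step 1: write the file where no other process will look for it
echo "hello world" > tmp/A
# Step 2: publish the file; ln fails if new/A already exists
if ! ln tmp/A new/A ; then
    echo "Error: new/A already exists! Duplicate file name?" >&2
    exit 1
fi
rm -f tmp/A
# Step 3: claim the file; ln fails if another process already created cur/A
if ! ln new/A cur/A ; then
    echo "Error: cur/A already exists! File is locked!" >&2
    exit 1
fi
rm -f new/A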
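Building on the script above, a worker that drains the queue could loop over new/, try to claim each file, and skip any file it fails to link. This is only a sketch: `process_file` is a hypothetical placeholder for the real work, and the queue lives in a temporary directory so the example is self-contained.

```shell
#!/usr/bin/env bash
# Sketch: a worker that drains new/, claiming each file before touching it.
queue=$(mktemp -d)
mkdir "$queue/tmp" "$queue/new" "$queue/cur"

# Publish three example jobs using steps 1 and 2.
for name in A B C; do
    echo "job $name" > "$queue/tmp/$name"
    ln "$queue/tmp/$name" "$queue/new/$name" && rm -f "$queue/tmp/$name"
done

# Pretend another worker already holds job B.
ln "$queue/new/B" "$queue/cur/B"

# process_file PATH: hypothetical placeholder for the real work.
process_file() {
    cat "$1" > /dev/null
}

processed=""
skipped=""
for path in "$queue"/new/*; do
    [ -e "$path" ] || continue          # new/ was empty or the file vanished
    name=${path##*/}
    if ln "$path" "$queue/cur/$name" 2>/dev/null; then
        rm -f "$path"                   # claimed: stop offering it to others
        process_file "$queue/cur/$name"
        rm -f "$queue/cur/$name"        # done: release the lock
        processed="$processed$name "
    else
        skipped="$skipped$name "        # another worker owns it (or it is gone)
    fi
done
echo "processed: $processed"
echo "skipped: $skipped"
```

The worker never waits on a lock. A failed `ln` simply means "not mine", so several copies of this loop can run side by side without coordinating beyond the filesystem.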

There are three dangers with the above method:

  • it doesn't account for stale locks
  • a file with the same name may already exist
  • NFS may cache file names

To deal with the first two problems, Maildir format suggests a naming convention to include unique information such as the PID of a process, a timestamp, and potentially a random identifier. This prevents conflicting files from causing bugs, and allows a separate process to clean up stale files (assuming the PID referenced by a file no longer exists or belongs to a different program, or a timestamp has expired).
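A minimal sketch of such a naming scheme and a matching stale check might look like the following. The field order (`timestamp.pid.random.hostname`), the helper names `unique_name` and `is_stale`, and the one-hour age limit are assumptions chosen for illustration, not the exact Maildir convention. The PID test is only meaningful on the host that created the file.

```shell
#!/usr/bin/env bash
# Sketch: Maildir-like unique names plus a stale-lock check.

# unique_name: prints "<epoch seconds>.<pid>.<random>.<hostname>".
unique_name() {
    printf '%s.%s.%s.%s\n' "$(date +%s)" "$$" "$RANDOM" "$(uname -n)"
}

# is_stale NAME MAX_AGE: true when the PID encoded in NAME cannot be signalled
# from this host, or when the encoded timestamp is older than MAX_AGE seconds.
# Caveat: kill -0 also fails for a live process owned by another user.
is_stale() {
    local name=$1 max_age=$2 stamp rest pid now
    stamp=${name%%.*}          # first field: creation time
    rest=${name#*.}
    pid=${rest%%.*}            # second field: creator's PID
    now=$(date +%s)
    if ! kill -0 "$pid" 2>/dev/null; then
        return 0
    fi
    [ $(( now - stamp )) -gt "$max_age" ]
}

# A name created just now by this (live) shell.
mine=$(unique_name)
# A live PID, but a timestamp from 2001.
old="1000000000.$$.42.example"
# A recent timestamp, but a PID that has already exited.
sleep 0 & dead=$!
wait "$dead"
gone="$(date +%s).$dead.7.example"

for n in "$mine" "$old" "$gone"; do
    if is_stale "$n" 3600; then echo "$n -> stale"; else echo "$n -> fresh"; fi
done
```

A cleanup process could run a check like `is_stale` over cur/ and unlink whatever it reports. PID reuse and clock skew between hosts are the usual weak points of this approach.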


Notes
