@Robertof
Last active February 6, 2023 00:33
node-exporter prevents systemd automount/autofs expiry (TimeoutIdleSec, x-systemd.idle-timeout)

Do you have an automount configured using systemd (or another autofs-based daemon) with an expiry/timeout and it's not automatically unmounting? i.e., is x-systemd.idle-timeout or TimeoutIdleSec not working?

Do you use Prometheus' node_exporter?

The problem

If you answered yes to both questions, you're in luck. After being confused for a while with this problem and furiously Googling about it, I found this issue reported by @huww98 on the systemd repo: systemd/systemd#18445 "Support for autofs "strictexpire" option".

Thanks to their amazing debugging skills, they were able to trace the root cause of the issue to a statfs() syscall that node_exporter makes on every mountpoint. If the scrape interval of your Prometheus instance is shorter than the idle timeout, your mountpoint will never be unmounted. In their own words:

When using automount unit, I found the mount point is not unmounted when TimeoutIdleSec expires. I figured out that it is node-exporter filesystem collector that resets the last_used field in a statfs system call. This can be avoided by specifying "strictexpire" option.
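As a rough illustration of the condition described above, the following sketch compares a scrape interval with an idle timeout. The function names are hypothetical helpers, not part of systemd or node_exporter, and the model is deliberately simplified: it assumes node_exporter is the only thing touching the mountpoint and that strictexpire is not set.

```shell
# Hypothetical helper (an assumption for illustration only).
# Usage: idle_timeout_can_fire SCRAPE_INTERVAL_SEC IDLE_TIMEOUT_SEC
# Succeeds when the gap between two scrapes is longer than the idle timeout,
# i.e. when the timeout has a chance to elapse before last_used is reset.
idle_timeout_can_fire() {
    [ "$1" -gt "$2" ]
}

# Prints a one-line verdict for the same pair of values.
describe_expiry() {
    if idle_timeout_can_fire "$1" "$2"; then
        echo "ok: the mount can expire between scrapes"
    else
        echo "blocked: each scrape resets last_used before the timeout elapses"
    fi
}
```

For example, `describe_expiry 15 60` (a 15 s scrape interval against a 60 s idle timeout) reports the blocked case, which matches the situation described in the quote.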

The fix

As stated in the linked issue, the kernel provides a way to opt out of the mechanism that interacts poorly with node_exporter. The strictexpire flag disables the reset of the last_used field on path walks. This might not be the ideal solution if your automount is accessed frequently but sees little actual I/O, as it might cause frequent unmounts and re-mounts of the filesystem.

As of systemd 250, it is possible to pass extra mount options to autofs through the ExtraOptions= setting.
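To check whether your systemd is recent enough, you can inspect the first line of `systemctl --version`. The sketch below is a hypothetical helper (the function name is an assumption) and it assumes the usual output shape, e.g. `systemd 252 (252.5-2ubuntu3)`, where the second field is the major version.

```shell
# Hypothetical helper: succeeds if the given `systemctl --version` line
# reports systemd 250 or newer. Exits with status 2 if no version is found.
# The body runs in a subshell so no variables leak into the caller.
supports_extra_options() (
    version=$(printf '%s\n' "$1" | awk '{ print $2 }')
    version=${version%%[!0-9]*}   # keep only the leading digits
    [ -n "$version" ] || exit 2
    [ "$version" -ge 250 ]
)

# Usage:
#   supports_extra_options "$(systemctl --version | head -n 1)" \
#       && echo "ExtraOptions= should be available"
```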

If you have a manually created .automount unit

Add the following to the [Automount] section of your automount unit:

[Automount]
ExtraOptions=strictexpire

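Run systemctl daemon-reload and you're done! Since mount options are passed to autofs when the automount point is set up, an already-running automount unit will likely need a restart before the new option takes effect.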
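To confirm that the option was picked up, you can look at the options of the autofs mount itself. The sketch below assumes that the kernel lists strictexpire among the autofs mount options in /proc/self/mounts; the function name is a hypothetical helper, and the optional second argument exists only so the check can be pointed at another file in the same format.

```shell
# Hypothetical helper: succeeds if MOUNTPOINT is an autofs mount whose
# option list contains "strictexpire".
# Usage: autofs_has_strictexpire MOUNTPOINT [MOUNTS_FILE]
# Caveat: /proc/self/mounts escapes spaces in paths as \040, so this simple
# comparison only works for mountpoints without special characters.
autofs_has_strictexpire() {
    awk -v mp="$1" '
        $2 == mp && $3 == "autofs" && $4 ~ /(^|,)strictexpire(,|$)/ { found = 1 }
        END { exit !found }
    ' "${2:-/proc/self/mounts}"
}

# Usage:
#   autofs_has_strictexpire /mnt/your-mountpoint && echo "strictexpire is active"
```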

If you are using /etc/fstab

There is no way to set ExtraOptions from within an fstab-defined automount. Please manually define an automount unit. Here's a way to do it:

# Assuming you have `x-systemd.automount` specified in your filesystem in /etc/fstab...
MOUNTPOINT=/mnt/your-mountpoint # ** put your automounted filesystem mountpoint here **

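# 1. Save the contents of the current (generated) automount unit into the current directory
# `systemd-escape -p` turns the path into a unit name, e.g. /mnt/your-mountpoint -> mnt-your\x2dmountpoint.automount
unit_name="$(systemd-escape -p --suffix=automount "$MOUNTPOINT")"
systemctl cat "$unit_name" > "$unit_name"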

# 2. Remove `x-systemd.automount` and `x-systemd.idle-timeout` from your /etc/fstab
$EDITOR /etc/fstab

# 3. Reload units to remove the existing automount
systemctl daemon-reload

# 4. Add `ExtraOptions=strictexpire` to the unit file (this assumes that [Automount] is the last section, which is usually the case)
echo "ExtraOptions=strictexpire" >> "$unit_name"

# 5. Install and load the unit (quoting matters: escaped unit names can contain backslashes)
mv "$unit_name" /etc/systemd/system/
systemctl daemon-reload
systemctl start "$unit_name"

If you're using autofs

You can specify the strictexpire option in your master map entry. See man 5 auto.master: https://manpages.ubuntu.com/manpages/focal/man5/auto.master.5.html

Context

For the curious, here's a bit more context on the problem than is present in the issue linked above.

Behind the scenes, systemd uses the autofs kernel filesystem to handle all the magic of automounting and simply acts as a daemon "intermediary" for the kernel. autofs handles expiration of mountpoints by maintaining a last_used timestamp on individual directory entries and symlinks. For symlinks, the timestamp is simply updated every time the symlink is followed. For directory entries, there are two mechanisms which are used:

  • autofs is polled by the managing daemon for expiration of the managed automounts. When this is done, autofs checks if the mountpoint is busy and if so, updates the last_used timestamp as needed. Unless you have processes actually opening files and doing I/O on your automount (in which case you don't want the filesystem to be unmounted), this will never be a problem.
  • autofs defaults to resetting the last_used timestamp on path walks (i.e., every time a path to, or inside, the mountpoint is resolved by the kernel). The underlying reason makes sense: if you're roaming around a directory but not doing any I/O that would mark the mountpoint as busy, you don't want the filesystem to disappear from under you, so autofs conservatively prevents that from happening.

So where does node_exporter fit into this? As @huww98 discovered, node_exporter, when collecting filesystem metrics (which it does by default), calls statfs() to obtain information about every mountpoint. Since statfs() works with path names, it triggers a path walk, which in turn resets the last_used timestamp on the mountpoint. If your Prometheus scrape interval is shorter than your autofs expiry interval, node_exporter will effectively prevent your filesystem from ever being automatically unmounted.
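If you want to reproduce the effect without Prometheus, something like the following sketch may help. It assumes GNU coreutils, where `stat -f` queries the filesystem (via statfs()) rather than the file; the function name and its arguments are hypothetical. Calling it on your mountpoint at an interval shorter than the idle timeout should, in principle, keep the mount alive in the same way node_exporter does.

```shell
# Hypothetical helper: issue COUNT statfs() calls on PATH, INTERVAL_SEC apart,
# printing the filesystem type each time. Assumes GNU `stat` (on BSD/macOS,
# `-f` has a different meaning).
# Usage: statfs_poke PATH [COUNT] [INTERVAL_SEC]
# The body runs in a subshell so no variables leak into the caller.
statfs_poke() (
    path=$1
    count=${2:-1}
    interval=${3:-15}
    i=0
    while [ "$i" -lt "$count" ]; do
        # -f: report on the filesystem containing PATH; %T: its type name
        stat -f -c '%T' -- "$path" || exit 1
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
)

# Usage (20 calls, 15 seconds apart):
#   statfs_poke /mnt/your-mountpoint 20 15
```

While this runs, `systemctl status` on the corresponding .mount unit can be used to watch whether the filesystem stays mounted past its idle timeout.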

@raven-au added more insightful context in the original issue: the problem doesn't occur with stat() syscalls because the original autofs design accounted for them, whereas statfs() was overlooked. Since Linux almost never breaks userspace, this originally unplanned interaction has now become expected behavior!

@huww98
huww98 commented Feb 4, 2023

Do you mean statfs instead of statfd in the last paragraph?

@Robertof
Author

Robertof commented Feb 4, 2023

Do you mean statfs instead of statfd in the last paragraph?

Oops sorry, fixed!
