@kentonv kentonv/SCM_RIGHTS.md
Last active Apr 22, 2019

SCM_RIGHTS API quirks

As tested on Linux:

  • An SCM_RIGHTS ancillary message is "attached" to the range of data bytes sent in the same sendmsg() call.
  • However, as always, recvmsg() calls on the receiving end don't necessarily map 1:1 to sendmsg() calls. Messages can be coalesced or split.
  • The recvmsg() call that receives the first byte of the ancillary message's byte range also receives the ancillary message itself.
  • To prevent multiple ancillary messages from being delivered at once, the recvmsg() call that receives the ancillary data is artificially limited to read no further than the last byte in the range, even if more data is available in the buffer after that byte, and even if that later data is not associated with any ancillary message.
  • However, if the recvmsg() that received the first byte does not provide enough buffer space to read the whole message, the next recvmsg() is allowed to read past the end of the message range and even into a new ancillary message's range, returning the ancillary data for the later message.
  • Regular read()s show the same pattern of potentially ending early, even though they cannot receive ancillary messages at all. This can mess things up when using edge-triggered I/O if you assume that a short read() indicates no more data is available.
  • A single SCM_RIGHTS message may contain up to SCM_MAX_FD (253) file descriptors.
  • If the recvmsg() does not provide enough ancillary buffer space to fit the whole descriptor array, it will be truncated to fit, with the remaining descriptors being discarded and closed. You cannot split the list over multiple calls.
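
Given the short-read behavior described above, one way to stay safe with edge-triggered I/O is to keep reading until the kernel reports that the read would block, rather than stopping at the first short read. The following is a minimal sketch of that idea; the helper name `drain` and its parameters are illustrative and not part of any API.

```python
import socket
from typing import Tuple


def drain(sock: socket.socket, chunk_size: int = 4096) -> Tuple[bytes, bool]:
    """Read from a non-blocking socket until it would block or reaches EOF.

    Returns (data, eof). A short read is deliberately NOT treated as a sign
    that the buffer is empty, because a read may stop early at the end of a
    byte range that had an ancillary message attached.
    """
    parts = []
    while True:
        try:
            chunk = sock.recv(chunk_size)
        except BlockingIOError:
            # Only now do we know the kernel buffer is actually empty.
            return b"".join(parts), False
        if not chunk:
            # A zero-length read on a stream socket means the peer closed.
            return b"".join(parts), True
        parts.append(chunk)
```

With an edge-triggered poller, a loop like this would run once per readiness notification, so no buffered data is left behind waiting for an edge that never comes.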

kentonv commented Mar 31, 2019

More stuff:

  • The byte range that the ancillary message is attached to cannot have zero size.
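
Because the descriptors need a non-empty byte range to attach to, a sender can refuse an empty payload up front. This is a small sketch using Python's `socket.sendmsg()`; the helper name `send_fds_with_payload` is hypothetical.

```python
import array
import socket


def send_fds_with_payload(sock, payload, fds):
    """Send `payload` with `fds` attached as a single SCM_RIGHTS message."""
    if not payload:
        # The ancillary message must be attached to at least one data byte.
        raise ValueError("SCM_RIGHTS needs at least one data byte to attach to")
    fd_array = array.array("i", fds)  # file descriptors travel as C ints
    return sock.sendmsg(
        [payload],
        [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fd_array)],
    )
```

A one-byte payload is enough when the protocol has nothing else to say alongside the descriptors.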

DANGER DANGER DANGER

  • If you call recvmsg() and provide space to receive an ancillary message at all -- even if you DIDN'T expect SCM_RIGHTS -- you MUST check if you received an SCM_RIGHTS message and, if so, close the file descriptors. Otherwise, an attacker can fill up your file descriptor table with garbage, probably DoSing you.
  • The CMSG_SPACE() macro is intended to help you decide how much space to allocate to receive an ancillary message. It rounds up its calculation to the next word boundary. Unfortunately, on 64-bit systems, where a file descriptor is a 4-byte int and a word is 8 bytes, this means you will always end up with enough space for an even number of file descriptors. If you were expecting just one FD, you'll end up with enough buffer space to receive two. You MUST check whether you received two and close the second one; otherwise, again, an attacker can fill up your FD table.
  • A single recvmsg() can in fact receive multiple ancillary messages, any of which could be SCM_RIGHTS. Don't forget to check for this: iterate over all of them rather than inspecting only the first.
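
The three warnings above can be handled together on the receiving side. The sketch below walks every ancillary message returned by Python's `socket.recvmsg()`, collects every descriptor found in any SCM_RIGHTS message, and closes whatever exceeds the number the caller expected. The helper name `recv_expected_fds` and its parameters are made up for illustration.

```python
import array
import os
import socket


def recv_expected_fds(sock, bufsize, expected_fds):
    """Receive data plus at most `expected_fds` descriptors; close any extras."""
    fd_size = array.array("i").itemsize
    # CMSG_SPACE() rounds up, so this buffer may have room for more
    # descriptors than were asked for.
    ancbufsize = socket.CMSG_SPACE(expected_fds * fd_size)
    data, ancdata, _flags, _addr = sock.recvmsg(bufsize, ancbufsize)

    received = []
    for level, ctype, cdata in ancdata:  # there may be more than one message
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            fds = array.array("i")
            # Ignore a trailing partial int left behind by truncation.
            usable = len(cdata) - (len(cdata) % fd_size)
            fds.frombytes(cdata[:usable])
            received.extend(fds)

    kept, extras = received[:expected_fds], received[expected_fds:]
    for fd in extras:
        os.close(fd)  # otherwise a peer can fill up our FD table
    return data, kept
```

Passing `expected_fds=0` covers the case where the buffer exists for some other kind of ancillary message: any descriptors that arrive anyway are closed immediately.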

kentonv commented Apr 22, 2019

MORE DANGER: Some operating systems are buggy

  • Some operating systems mishandle the case where the ancillary buffer is too small to fit all received descriptors. Notably, on macOS, the excess descriptors that were never delivered to the process via the ancillary message will nevertheless have been added to the process's file descriptor table, and so will not be closed. Until very recently, FreeBSD had this bug as well. In these cases, in order to avoid DoS attacks, it is necessary to provide a buffer with room for more than the maximum number of FDs that can be transferred at once. Unfortunately, this means you must temporarily accept all these FDs into your FD table, and then go close them. Also, the maximum number of FDs is not documented.
  • Additionally, on macOS, cmsghdr.cmsg_len is allowed to overrun the underlying buffer space indicated by msghdr.msg_controllen. You must check for this and clamp it; otherwise you are likely to overrun the buffer. In the best case, you segfault. In the more likely case, you end up calling close(0), which leads to madness. On Linux, the kernel automatically clamps cmsg_len so that it does not overrun. I have not tested any other kernels.
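
The oversized-buffer workaround from the first bullet might look roughly like the sketch below. It assumes the per-message limit is Linux's SCM_MAX_FD (253); since the real limit on other kernels is undocumented, `ASSUMED_MAX_FDS` is a guess to adjust, and the helper name `recv_fds_oversized` is hypothetical. The cmsg_len clamp from the second bullet concerns C code that walks the control buffer by hand and is not shown here.

```python
import array
import os
import socket

# Assumption: borrowed from Linux's SCM_MAX_FD; other kernels may differ.
ASSUMED_MAX_FDS = 253


def recv_fds_oversized(sock, bufsize, expected_fds, max_fds=ASSUMED_MAX_FDS):
    """Receive with room for more FDs than a peer can send, then close extras.

    Returns (data, kept_fds, truncated).
    """
    fd_size = array.array("i").itemsize
    # One slot more than the assumed maximum, so nothing should be cut off.
    ancbufsize = socket.CMSG_SPACE((max_fds + 1) * fd_size)
    data, ancdata, flags, _addr = sock.recvmsg(bufsize, ancbufsize)

    received = []
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            fds = array.array("i")
            usable = len(cdata) - (len(cdata) % fd_size)
            fds.frombytes(cdata[:usable])
            received.extend(fds)

    kept, extras = received[:expected_fds], received[expected_fds:]
    for fd in extras:
        os.close(fd)  # accepted into our FD table only so we can close them
    # If this flag is set, the assumed maximum was too small for this kernel.
    truncated = bool(flags & socket.MSG_CTRUNC)
    return data, kept, truncated
```

A caller would probably treat `truncated=True` as a reason to drop the connection, since on the affected systems undelivered descriptors may already be sitting in the FD table.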