armon/security.md

## security.md

      
Serf Security Model

Relevant branch: https://github.com/hashicorp/memberlist/compare/f-encrypt
The security model used by Serf is designed to provide confidentiality, integrity and
authentication. The threat model considered in its design is described below. The
security model is built around a symmetric key, or shared secret, system. All members
of the Serf cluster must be provided the shared secret ahead of time. This places the
burden of key distribution on the user.
To support confidentiality, all messages are encrypted using the AES-128 standard.
AES is widely regarded as a secure, modern encryption standard. Additionally, it is
a fast algorithm, and modern CPUs provide hardware instructions that make encryption
and decryption very lightweight. Because AES works on block sizes of 16 bytes, we
make use of the PKCS7 padding algorithm.
AES is used in Galois/Counter Mode (GCM) with a randomly generated nonce. The use of
GCM additionally provides message integrity, as the ciphertext is suffixed with a
'tag' that is used to verify message integrity before decryption.
Message Format

In the overview we describe the various crypto primitives that are used. In this
section we cover how messages are framed on the wire and interpreted to ensure
confidentiality, integrity and authentication are provided.
UDP Message Format

UDP messages do not require any framing since they are packet oriented. This
allows the message to be somewhat simpler and saves some space. The format is
as follows:
-------------------------------------------------------------------
| Version (byte) | Nonce (12 bytes) | CipherText | Tag (16 bytes) |
-------------------------------------------------------------------

The UDP message thus has a minimum overhead of 29 bytes (1 + 12 + 16), plus up to an
additional 15 bytes of padding, for a maximum of 44 bytes. There is no length field,
since the UDP packet is already framed. Tampering with or bit corruption of any part
of the packet will cause the GCM tag verification to fail.
Once we receive a packet, we first verify the GCM tag, and only if verification
succeeds do we continue to decrypt the payload. The version byte is provided to allow
future versions to change the algorithm they use. It is currently always set to 0.
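
The UDP layout above can be illustrated with a short Go sketch. Several details are assumptions rather than facts from this document: the names `encodePacket` and `decodePacket` are hypothetical, the version byte is passed to GCM as additional authenticated data so that changing it fails the tag check, and the PKCS7 padding step is left out to keep the focus on the packet layout.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
)

const (
	versionSize         = 1
	nonceSize           = 12
	tagSize             = 16
	minPacket           = versionSize + nonceSize + tagSize // 29 bytes
	currentVersion byte = 0
)

// gcmFor builds an AES-128-GCM AEAD from the 16-byte shared key.
func gcmFor(key []byte) (cipher.AEAD, error) {
	if len(key) != 16 {
		return nil, errors.New("key must be 16 bytes for AES-128")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encodePacket lays out: Version | Nonce | CipherText | Tag.
func encodePacket(key, payload []byte) ([]byte, error) {
	gcm, err := gcmFor(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, nonceSize)
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	pkt := append([]byte{currentVersion}, nonce...)
	// Assumption: the version byte is authenticated as additional data.
	// Seal appends ciphertext||tag to pkt.
	return gcm.Seal(pkt, nonce, payload, []byte{currentVersion}), nil
}

// decodePacket checks the size and version, then lets GCM verify the tag
// as part of opening the ciphertext.
func decodePacket(key, pkt []byte) ([]byte, error) {
	if len(pkt) < minPacket {
		return nil, errors.New("packet too short")
	}
	if pkt[0] != currentVersion {
		return nil, errors.New("unsupported version")
	}
	gcm, err := gcmFor(key)
	if err != nil {
		return nil, err
	}
	nonce := pkt[versionSize : versionSize+nonceSize]
	return gcm.Open(nil, nonce, pkt[versionSize+nonceSize:], pkt[:versionSize])
}
```

Because the datagram boundary already delimits the message, `decodePacket` needs no length field: everything after the nonce is treated as ciphertext followed by the 16-byte tag.
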
TCP Message Format

TCP provides a stream abstraction and therefore we must provide our own framing.
This is tricky as it is a potential attack vector. We cannot verify the tag
until the entire message is received, and we must be provided the length in
plaintext. Our current approach is to limit the maximum size of a framed message
to 10MB, so that a peer cannot cause a Denial of Service by sending an enormous
amount of data.
The wire format is as follows:
-------------------------------------------------------------------------------------------------------
| MsgType (byte) | Length (4 bytes) | Version (byte) | Nonce (12 bytes) | CipherText | Tag (16 bytes) |
-------------------------------------------------------------------------------------------------------

The TCP format is very similar to the UDP format, but it prepends the message with
a message type byte (similar to other Serf messages). It also adds a 4 byte length
field, encoded in Big Endian format. This increases its maximum overhead to 49 bytes.
When we first receive a TCP encrypted message, we check the message type. If any
party has encryption enabled, the other party must as well. Otherwise we are vulnerable
to a downgrade attack where one side can force the other into a non-encrypted mode of
operation.
Once this is verified, we determine the message length and if it is less than our
10MB limit, read in the rest of the message. The tag that is provided verifies
the entire payload, including the message type and length, ensuring that nothing has
been tampered with.
Threat Model

The following are the various aspects of our threat model:

- Non-members getting access to events or membership information
- Cluster state corruption due to malicious messages being processed
- Fake event generation due to malicious messages
- Tampering of messages causing state corruption
- Denial of Service against a node

No security system is unbreakable.
Our goal is not to protect top secret data but to provide a "reasonable"
level of security that would require an attacker to commit a considerable
amount of resources to defeat.
It is worth mentioning that we are specifically not concerned about
replay attacks, as the gossip protocol is designed to handle that
due to the nature of its broadcast mechanism.
Future Consideration

Some considerations for the future are:

- Using the Version field to change the algorithm we use
- Supporting different algorithms via configuration
- Supporting key rotation
- Cluster membership can be inferred by observing random probing

Appendix


Node communication is modeled after the SWIM protocol

- Periodic health probing over UDP of random nodes
- Frequent gossip over UDP to random nodes
- Infrequent state push/pull over TCP to random nodes