@tendstofortytwo
Last active August 14, 2023 19:06

pf Endpoint Independent NAT Support

Background

FreeBSD pf provides network address translation (NAT) functionality to allow multiple devices on an internal network to share the same public IP address. IETF RFC 4787 [1] attempts to categorize NATs based on how they map internal IP:ports to external IP:ports. The two categories relevant for this document are:

  • Address and Port-Dependent Mapping (current pf behavior)

    The NAT reuses the port mapping for subsequent packets sent from the same internal IP address and port (X:x) to the same external IP address and port while the mapping is still active. [1]

    In this case, when a packet is received from an external IP:port X1:x1 to NAT address Y:y, it is forwarded to an internal client Z:z only if there is an active mapping from Z:z to Y:y for X1:x1. So another external IP:port X2:x2 will not be able to use Y:y to talk to the internal client.

     (External device 1 - X1:x1)             (External device 2 - X2:x2)
             ^
             +-------------------+
                                 |
                               (NAT - Y:y)
                                 ^
                                 |
                               (client - Z:z)
    

    Fig. 1 -- client at internal address Z:z sends packet to external device 1 at X1:x1, NAT creates mapping Y:y for external device 1 to reply to client.

     (External device 1 - X1:x1)             (External device 2 - X2:x2)
             |                                       |
             +-------------------+     +-------------+
                                 v     x
                               (NAT - Y:y)
                                 |
                                 v
                               (client - Z:z)
    

    Fig. 2 -- X1:x1 is able to use the Y:y mapping to send responses to Z:z, but a different external device at X2:x2 is not allowed to do that, since the Y:y mapping is particular to the tuple (X1:x1, Z:z).

  • Endpoint-Independent Mapping (behavior recommended by the RFC and this document)

    The NAT reuses the port mapping for subsequent packets sent from the same internal IP address and port (X:x) to any external IP address and port. [1]

    In this case, when a packet is received from an external IP:port X1:x1 to NAT address Y:y, it is forwarded to an internal client Z:z if there is any active mapping from Z:z to Y:y -- the mapping does not depend on the external device. So another external IP:port X2:x2 can reuse the same mapping Y:y to talk to the internal client.

     (External device 1 - X1:x1)             (External device 2 - X2:x2)
             ^
             +-------------------+
                                 |
                               (NAT - Y:y)
                                 ^
                                 |
                               (client - Z:z)
    

    Fig. 1 -- client at internal address Z:z sends packet to external device 1 at X1:x1, NAT creates mapping Y:y for anyone to be able to talk to the client.

     (External device 1 - X1:x1)             (External device 2 - X2:x2)
             |                                       |
             +-------------------+     +-------------+
                                 v     v
                               (NAT - Y:y)
                                 |     |
                                 v     v
                               (client - Z:z)
    

    Fig. 2 -- Both X1:x1 and X2:x2 are able to use the Y:y mapping to send responses to Z:z, since the Y:y mapping is only particular to Z:z and independent of any external X:x.
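
The difference between the two behaviors in the figures above can be sketched as a toy state table. This is only an illustrative model, not pf's actual data structures: the class `ToyNat`, its method names, and the example addresses (taken from documentation ranges) are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field

Endpoint = tuple[str, int]  # (IP address, port)


@dataclass
class ToyNat:
    """Toy model of a NAT state table (hypothetical, not pf's implementation)."""
    endpoint_independent: bool
    # Maps the NAT-side endpoint Y:y to the internal client endpoint Z:z.
    mappings: dict[Endpoint, Endpoint] = field(default_factory=dict)
    # Records which external endpoints the client has contacted through Y:y.
    contacted: dict[Endpoint, set[Endpoint]] = field(default_factory=dict)

    def outbound(self, client: Endpoint, nat_side: Endpoint, remote: Endpoint) -> None:
        """Client Z:z sends a packet to remote X:x through mapping Y:y."""
        self.mappings[nat_side] = client
        self.contacted.setdefault(nat_side, set()).add(remote)

    def inbound(self, remote: Endpoint, nat_side: Endpoint):
        """Return the internal client the packet is forwarded to, or None if dropped."""
        client = self.mappings.get(nat_side)
        if client is None:
            return None  # no active mapping for Y:y at all
        if self.endpoint_independent or remote in self.contacted[nat_side]:
            return client
        return None  # mapping exists, but not for this external endpoint


client, nat_side = ("10.0.0.2", 41641), ("203.0.113.1", 50000)
dev1, dev2 = ("198.51.100.1", 3478), ("198.51.100.2", 41641)

results = {}
for eim in (False, True):
    nat = ToyNat(endpoint_independent=eim)
    nat.outbound(client, nat_side, dev1)  # client only ever contacts device 1
    results[eim] = (nat.inbound(dev1, nat_side), nat.inbound(dev2, nat_side))
    print("endpoint-independent" if eim else "address and port-dependent", results[eim])
```

In this model, device 1 reaches the client in both cases, while device 2 is forwarded only when `endpoint_independent` is set.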

The latter allows for "NAT hole punching", i.e., a client behind the NAT can coordinate with a STUN server to find its external IP:port and share this with another party to establish a peer-to-peer connection. With the former, the other party is not guaranteed to be able to reach the client, since the mapping from client IP:port to external IP:port might differ for each destination.
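
To illustrate why the STUN-learned address may be useless behind an address and port-dependent NAT, here is a minimal sketch of outbound port allocation. It assumes a simplified allocator that hands out a fresh port for every new mapping key; a real NAT may happen to reuse the same port, which is why the text says "not guaranteed". `MappingTable`, `stun_address_valid_for_peer`, and the addresses are hypothetical names chosen for this example.

```python
import itertools


class MappingTable:
    """Toy outbound port allocator (hypothetical, not pf's implementation)."""

    def __init__(self, endpoint_independent, public_ip="203.0.113.1", first_port=50000):
        self.endpoint_independent = endpoint_independent
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)
        self._table = {}

    def external_for(self, client, remote):
        """Return the public IP:port used when `client` sends to `remote`."""
        # Endpoint-independent: the mapping is keyed on the client alone.
        # Address and port-dependent: it is keyed on the (client, remote) pair.
        key = client if self.endpoint_independent else (client, remote)
        if key not in self._table:
            self._table[key] = (self.public_ip, next(self._ports))
        return self._table[key]


def stun_address_valid_for_peer(nat, client, stun_server, peer):
    """True if the address learned via STUN is the one the peer will see."""
    learned = nat.external_for(client, stun_server)
    seen_by_peer = nat.external_for(client, peer)
    return learned == seen_by_peer


client = ("10.0.0.2", 41641)
stun_server = ("198.51.100.1", 3478)
peer = ("198.51.100.2", 41641)

for eim in (True, False):
    nat = MappingTable(endpoint_independent=eim)
    print(eim, stun_address_valid_for_peer(nat, client, stun_server, peer))
```

With the endpoint-independent table, the peer sees the same external IP:port that the STUN server reported, so the address the client shares is usable. With the dependent table in this sketch, the peer would see a different port.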

One of the applications of NAT hole punching is Tailscale, VPN software that allows multiple clients on the same virtual network to communicate with each other over the internet. If possible, Tailscale will try to use NAT hole punching to establish a direct connection between peers. If this is not possible (both clients are behind NATs that do not allow for hole punching), Tailscale establishes the connection in a TCP tunnel through relay servers that Tailscale hosts. [2]

Passing traffic through these relay servers incurs a throughput penalty, as demonstrated below.

Testing

A FreeBSD 14.0-CURRENT system (manganese) was set up as an internet gateway, using pf to NAT traffic from a virtual machine to the internet. Tailscale was installed on the virtual machine (manganese-vm1) and on another internet-connected machine (neon) behind a different address and port-dependent NAT.

+----------------------------+
|      (neon - X:41641)      |
|             |              |
| address-port-dependent NAT |
+----------------------------+
              | W:w
              |
           Internet
              |
              | Y:y
+---------------------------+
|         manganese         |
|             |             |
| (manganese-vm1 - Z:41641) |
+---------------------------+

Network performance was benchmarked by using rsync to transfer a 50 MB file from neon to manganese-vm1. The tailscale ping utility was then used to check whether a direct connection had been established between the machines or whether a Tailscale relay server (Designated Encrypted Relay for Packets, aka "DERP") was being used.
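
The relay-versus-direct check can also be scripted. The following sketch classifies `tailscale ping` output lines, assuming they follow the format seen in the transcripts in this document; other Tailscale versions may print something different, and `classify_pongs` is a hypothetical helper rather than part of any Tailscale tooling.

```python
import re

# Matches lines such as "pong from neon (100.81.122.130) via DERP(nyc) in 62ms".
PONG_RE = re.compile(r"^pong from (\S+) \(([^)]+)\) via (\S+) in (\d+)ms$")


def classify_pongs(output):
    """Return a list of (path_kind, via, latency_ms) tuples, one per pong line."""
    results = []
    for line in output.splitlines():
        match = PONG_RE.match(line.strip())
        if not match:
            continue  # skip lines like "direct connection not established"
        via = match.group(3)
        kind = "relay" if via.startswith("DERP(") else "direct"
        results.append((kind, via, int(match.group(4))))
    return results


sample = """\
pong from neon (100.81.122.130) via DERP(nyc) in 62ms
direct connection not established
pong from neon (100.81.122.130) via 129.97.125.2:41641 in 18ms
"""

for kind, via, latency_ms in classify_pongs(sample):
    print(kind, via, latency_ms)
```

A result of kind `direct` carries the peer's IP:port in `via`, whereas `relay` carries the DERP region name.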

When manganese was providing an address and port-dependent NAT:

root@manganese-vm1:~ # time rsync --info=progress2 nsood@neon:testfile.dat .
(nsood@neon.marmoset-monitor.ts.net) Password for nsood@neon:
     52,428,800 100%  606.48kB/s    0:01:24 (xfr#1, to-chk=0/1)
       88.02 real         0.73 user         6.56 sys
root@manganese-vm1:~ # tailscale ping neon
pong from neon (100.81.122.130) via DERP(nyc) in 62ms
pong from neon (100.81.122.130) via DERP(nyc) in 73ms
pong from neon (100.81.122.130) via DERP(nyc) in 50ms
pong from neon (100.81.122.130) via DERP(nyc) in 46ms
pong from neon (100.81.122.130) via DERP(nyc) in 54ms
pong from neon (100.81.122.130) via DERP(nyc) in 103ms
pong from neon (100.81.122.130) via DERP(nyc) in 59ms
pong from neon (100.81.122.130) via DERP(nyc) in 53ms
pong from neon (100.81.122.130) via DERP(nyc) in 48ms
pong from neon (100.81.122.130) via DERP(nyc) in 149ms
direct connection not established

This corresponds to the following connection graph:

+----------------------------+
|      (neon - X:41641)      |
|             |              |
| address-port-dependent NAT |
+----------------------------+
              | W:w
              v
   (Tailscale DERP server)
              ^
              | Y:y
+---------------------------+
|         manganese         |
|             |             |
| (manganese-vm1 - Z:41641) |
+---------------------------+

When manganese was providing an endpoint-independent NAT (using patch [3]):

root@manganese-vm1:~ # time rsync --info=progress2 nsood@neon:testfile.dat .
(nsood@neon.marmoset-monitor.ts.net) Password for nsood@neon:
     52,428,800 100%    2.12MB/s    0:00:23 (xfr#1, to-chk=0/1)
       27.07 real         0.85 user         1.08 sys
root@manganese-vm1:~ # tailscale ping neon
pong from neon (100.81.122.130) via 129.97.125.2:41641 in 18ms

This corresponds to the following connection graph:

+----------------------------+
|      (neon - X:41641)      |
|             |              |
| address-port-dependent NAT |
+----------------------------+
              | W:w
              |
              v Y:41641
+---------------------------+
|         manganese         |
|             |             |
| (manganese-vm1 - Z:41641) |
+---------------------------+

Using the endpoint-independent NAT allowed a direct connection to be established between manganese-vm1 and neon, which resulted in the file transfer finishing roughly three times faster than when using the relay server (27 seconds vs 88 seconds).
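
The comparison can be reproduced from the numbers in the transcripts. This small sketch uses the file size reported by rsync and the wall-clock ("real") times reported by `time`. Since those times include connection setup and the password prompt, the resulting rates are slightly lower than the ones rsync itself printed. The function name is an assumption for illustration.

```python
FILE_BYTES = 52_428_800  # size reported by rsync in both runs


def throughput_mb_per_s(num_bytes, seconds):
    """Average throughput in MB/s (1 MB = 10**6 bytes) over the whole run."""
    return num_bytes / seconds / 1e6


relay_s, direct_s = 88.02, 27.07  # "real" times from the two runs

relay = throughput_mb_per_s(FILE_BYTES, relay_s)
direct = throughput_mb_per_s(FILE_BYTES, direct_s)
speedup = relay_s / direct_s

print(f"relay:   {relay:.2f} MB/s")
print(f"direct:  {direct:.2f} MB/s")
print(f"speedup: {speedup:.2f}x")
```

This works out to roughly 0.6 MB/s through the relay and 1.9 MB/s over the direct connection, a speedup of about 3.25x.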

Conclusion

Having pf support endpoint-independent NAT allows clients to establish peer-to-peer connections over the internet, which brings significant performance improvements over using relay servers, as demonstrated.

Footnotes

  1. https://datatracker.ietf.org/doc/html/rfc4787

  2. https://tailscale.com/blog/how-tailscale-works/

  3. https://reviews.freebsd.org/D11137
