Network namespaces are like containers, but only for the network stack: they isolate network interfaces, routing tables, and firewall rules.
Here we will use one to contain a networked application, preventing it from accessing local and private networks while still allowing arbitrary outbound internet access. This demonstration targets IPv4 only; with some changes, it can be made IPv6 capable.
There is a small amount of overhead for network requests traversing the namespace boundary (a few to tens of milliseconds of latency).
# assume the network namespace is called "container"
namespace="container"
# we also need a subnet to connect the host and the container
namespace_subnet="10.0.0.0"
# our subnet mask in CIDR notation
namespace_cidr="/24"
# we need a virtual ethernet pair connecting the host to the namespace
# the host will hold veth0 (10.0.0.1), and the namespace will hold veth1 (10.0.0.2)
veth_host="veth0"
veth_host_ip="10.0.0.1"
veth_namespace="veth1"
veth_namespace_ip="10.0.0.2"
# internet connected interface on the host
internet_host="eth0"
# DNS addresses
dns_host="8.8.8.8"
# allow IPv4 packet forwarding by the kernel
sysctl net.ipv4.conf.all.forwarding=1
# create a new namespace
ip netns add $namespace
# create a namespace specific resolv.conf (the host resolv.conf might be using IPv6)
mkdir -p /etc/netns/$namespace
echo "nameserver $dns_host" > /etc/netns/$namespace/resolv.conf
# create the veth pair, it's like a virtual ethernet cable!
ip link add $veth_host type veth peer name $veth_namespace
# push the veth1 interface into the namespace
ip link set dev $veth_namespace netns $namespace
# bring up the host veth interface
ip addr add "${veth_host_ip}${namespace_cidr}" dev $veth_host
ip link set dev $veth_host up
# bring up the namespace veth interface
ip netns exec $namespace ip addr add "${veth_namespace_ip}${namespace_cidr}" dev $veth_namespace
ip netns exec $namespace ip link set dev $veth_namespace up
# bring up the namespace localhost interface
ip netns exec $namespace ip link set dev lo up
# route packets going to unknown IP addresses inside the namespace to the host
# the gateway in this case is the veth0 that the host currently possesses
ip netns exec $namespace ip route add default via "$veth_host_ip"
# use firewall rules to route packets between the internet interface and veth0 on the host
# masquerade makes it so that packets going from namespace subnet will be made to seem as
# if the packet came from the internet_host
iptables -t nat -A POSTROUTING -s "${namespace_subnet}${namespace_cidr}" -o $internet_host -j MASQUERADE
iptables -A FORWARD -i $internet_host -o $veth_host -j ACCEPT
iptables -A FORWARD -o $internet_host -i $veth_host -j ACCEPT
# use firewall rules inside the namespace to allow packets going to the gateway and DNS
# but disallow packets heading to local and private networks
ip netns exec $namespace iptables -A OUTPUT -d "$veth_host_ip" -j ACCEPT
ip netns exec $namespace iptables -A OUTPUT -d "$dns_host" -j ACCEPT
ip netns exec $namespace iptables -A OUTPUT -d 127.0.0.0/8 -j DROP
ip netns exec $namespace iptables -A OUTPUT -d 10.0.0.0/8 -j DROP
ip netns exec $namespace iptables -A OUTPUT -d 172.16.0.0/12 -j DROP
ip netns exec $namespace iptables -A OUTPUT -d 192.168.0.0/16 -j DROP
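To undo everything above, a teardown sketch (assuming the same variable names as the setup; deleting the namespace destroys veth1, which also destroys its host-side peer veth0):

```shell
# teardown: reverse of the setup above (run as root)
namespace="container"
namespace_subnet="10.0.0.0"
namespace_cidr="/24"
veth_host="veth0"
internet_host="eth0"

# deleting the namespace destroys veth1 inside it, which
# also removes its peer veth0 on the host
ip netns del "$namespace"
# remove the namespace-specific resolv.conf
rm -rf "/etc/netns/$namespace"
# delete the NAT and forwarding rules (-D takes the same rule spec as -A)
iptables -t nat -D POSTROUTING -s "${namespace_subnet}${namespace_cidr}" -o "$internet_host" -j MASQUERADE
iptables -D FORWARD -i "$internet_host" -o "$veth_host" -j ACCEPT
iptables -D FORWARD -o "$internet_host" -i "$veth_host" -j ACCEPT
```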
Here are the relevant commands to check that everything is working:
# check that veth0 is set up with the right IP address at root
ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:d3:07:6c brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fed3:76c/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:55:bf:6b brd ff:ff:ff:ff:ff:ff
inet 172.28.128.3/24 brd 172.28.128.255 scope global eth1
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fe55:bf6b/64 scope link
valid_lft forever preferred_lft forever
7: veth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether c2:1e:24:7a:24:e3 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.1/24 brd 10.0.0.255 scope global veth0
valid_lft forever preferred_lft forever
inet6 fe80::c01e:24ff:fe7a:24e3/64 scope link
valid_lft forever preferred_lft forever
# check that loopback and veth1 are set up with the right IP addresses at namespace
ip netns exec $namespace ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
6: veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 6e:87:58:7a:a0:fb brd ff:ff:ff:ff:ff:ff
inet 10.0.0.2/24 brd 10.0.0.255 scope global veth1
valid_lft forever preferred_lft forever
inet6 fe80::6c87:58ff:fe7a:a0fb/64 scope link
valid_lft forever preferred_lft forever
# check that the veth subnet route is set up at root
ip route
default via 10.0.2.2 dev eth0
10.0.0.0/24 dev veth0 proto kernel scope link src 10.0.0.1
10.0.2.0/24 dev eth0 proto kernel scope link src 10.0.2.15
172.28.128.0/24 dev eth1 proto kernel scope link src 172.28.128.3
# check that the gateway and veth subnet have been set up at namespace
ip netns exec $namespace ip route
default via 10.0.0.1 dev veth1
10.0.0.0/24 dev veth1 proto kernel scope link src 10.0.0.2
# ping root -> namespace
ping -c 3 $veth_namespace_ip
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=0.042 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=0.052 ms
64 bytes from 10.0.0.2: icmp_seq=3 ttl=64 time=0.308 ms
--- 10.0.0.2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.042/0.134/0.308/0.123 ms
# ping root <- namespace
ip netns exec $namespace ping -c 3 $veth_host_ip
PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.065 ms
64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.053 ms
64 bytes from 10.0.0.1: icmp_seq=3 ttl=64 time=0.056 ms
--- 10.0.0.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.053/0.058/0.065/0.005 ms
# ping external ip, test gateway and NAT
ip netns exec $namespace ping -c 3 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=53 time=28.1 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=53 time=28.4 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=53 time=27.8 ms
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2004ms
rtt min/avg/max/mdev = 27.824/28.138/28.427/0.313 ms
# ping external domain, test DNS, gateway and NAT
ip netns exec $namespace ping -c 3 google.com
PING google.com (150.101.161.160) 56(84) bytes of data.
64 bytes from 150.101.161.160: icmp_seq=1 ttl=57 time=30.7 ms
64 bytes from 150.101.161.160: icmp_seq=2 ttl=57 time=26.6 ms
64 bytes from 150.101.161.160: icmp_seq=3 ttl=57 time=32.7 ms
--- google.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2004ms
rtt min/avg/max/mdev = 26.677/30.052/32.738/2.521 ms
# ping external domain, test DNS, gateway, NAT and the firewall
# these pings should fail
ip netns exec $namespace ping -c 3 10.1.1.8.xip.io
ip netns exec $namespace ping -c 3 127.0.0.1.xip.io
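The success and failure cases can also be checked mechanically via ping's exit status, rather than by reading the output. A sketch (the target addresses are arbitrary examples inside the blocked ranges; -W bounds the wait so blocked pings fail fast):

```shell
namespace="container"
# external ping should succeed (exit status 0)
if ip netns exec "$namespace" ping -c 1 -W 2 8.8.8.8 > /dev/null; then
  echo "internet: reachable"
fi
# pings into the dropped private ranges should fail (non-zero exit status)
for target in 192.168.1.1 172.16.0.1 10.1.1.1; do
  if ! ip netns exec "$namespace" ping -c 1 -W 2 "$target" > /dev/null 2>&1; then
    echo "$target: blocked"
  fi
done
```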
# show that the root filter table has the forwarding rules set up
iptables -nvL FORWARD
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
29 2621 ACCEPT all -- eth0 veth0 0.0.0.0/0 0.0.0.0/0
29 2353 ACCEPT all -- veth0 eth0 0.0.0.0/0 0.0.0.0/0
# show that the root nat table has the postrouting rule set up
iptables -t nat -nvL POSTROUTING
Chain POSTROUTING (policy ACCEPT 275 packets, 21058 bytes)
pkts bytes target prot opt in out source destination
18 1429 MASQUERADE all -- * eth0 10.0.0.0/24 0.0.0.0/0
# show that the namespace filter table has the output rules set up
ip netns exec $namespace iptables -nvL OUTPUT
Chain OUTPUT (policy ACCEPT 16 packets, 1344 bytes)
pkts bytes target prot opt in out source destination
11 924 ACCEPT all -- * * 0.0.0.0/0 10.0.0.1
13 1009 ACCEPT all -- * * 0.0.0.0/0 10.0.2.3
0 0 DROP all -- * * 0.0.0.0/0 127.0.0.0/8
0 0 DROP all -- * * 0.0.0.0/0 10.0.0.0/8
0 0 DROP all -- * * 0.0.0.0/0 172.16.0.0/12
0 0 DROP all -- * * 0.0.0.0/0 192.168.0.0/16
# show processes currently listening on TCP at namespace
ip netns exec $namespace ss --listening --numeric --tcp --processes
# create a simple webserver inside the namespace
ip netns exec $namespace socat \
-v -d -d \
TCP-LISTEN:1234,crlf,reuseaddr,fork \
SYSTEM:"
echo HTTP/1.1 200 OK;
echo Content-Type\: text/plain;
echo;
echo \"Server: \$SOCAT_SOCKADDR:\$SOCAT_SOCKPORT\";
echo \"Client: \$SOCAT_PEERADDR:\$SOCAT_PEERPORT\";
"
curl 10.0.0.2:1234
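Any program can be confined the same way. Since `ip netns exec` must run as root, the confined application will typically drop privileges after entering the namespace; a sketch (the account name `appuser` is a placeholder, not part of the setup above):

```shell
namespace="container"
# run the application inside the namespace as an unprivileged user;
# "appuser" is a hypothetical account name
ip netns exec "$namespace" sudo -u appuser curl -s http://example.com/
```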
For IPv6, we cannot use iptables; we would need to use ip6tables or nftables instead.
Even though loading iptables rules can be atomic (via a file load and save of the iptables exported file), we cannot make that atomic with the namespace configuration, which uses ip netns commands; these are two entirely different programs. You can approximate a transaction with a lock: acquire the lock, then create the namespace and veth pair, and clean up if any error occurs.
Then, and only then, load the iptables configuration in one atomic step; if any errors occur, clean up. But there doesn't seem to be a good way to roll back iptables changes. The only way is to save the previous iptables state first, then load the new rules. This means we would need to add rules directly into the iptables dump file, so we need to understand its format and specs. Alternatively, we should investigate what atomicity guarantees nftables has in comparison to iptables. In the end, we need to handle cross-command atomicity and proper cleanup procedures.