Network namespaces are like containers, but only for the network stack: they isolate network interfaces, routing tables, and firewall rules.
Here we will use one to contain a networked application, preventing it from accessing local and private networks while still allowing arbitrary outbound internet access. This demonstration targets IPv4 only; with some changes, it can be made IPv6 capable.
There is a small amount of overhead for network requests traversing the namespace boundary (a few to tens of milliseconds of latency).
# assume the network namespace is called "container"
namespace="container"
# we also need a subnet to connect the host and the container
namespace_subnet="10.0.0.0"
# our subnet mask in CIDR notation
namespace_cidr="/24"
# we need a virtual ethernet pair connecting the host to the namespace
# the host will hold veth0 (10.0.0.1), and the namespace will hold veth1 (10.0.0.2)
veth_host="veth0"
veth_host_ip="10.0.0.1"
veth_namespace="veth1"
veth_namespace_ip="10.0.0.2"
# internet connected interface on the host
internet_host="eth0"
# DNS addresses
dns_host="8.8.8.8"
# allow IPv4 packet forwarding by the kernel
sysctl net.ipv4.conf.all.forwarding=1
# create a new namespace
ip netns add $namespace
# create a namespace specific resolv.conf (the host resolv.conf might be using IPv6)
mkdir -p /etc/netns/$namespace
echo "nameserver $dns_host" > /etc/netns/$namespace/resolv.conf
# create the veth pair, it's like a virtual ethernet cable!
ip link add $veth_host type veth peer name $veth_namespace
# push the veth1 interface into the namespace
ip link set dev $veth_namespace netns $namespace
# bring up the host veth interface
ip addr add "${veth_host_ip}${namespace_cidr}" dev $veth_host
ip link set dev $veth_host up
# bring up the namespace veth interface
ip netns exec $namespace ip addr add "${veth_namespace_ip}${namespace_cidr}" dev $veth_namespace
ip netns exec $namespace ip link set dev $veth_namespace up
# bring up the namespace localhost interface
ip netns exec $namespace ip link set dev lo up
# route packets going to unknown IP addresses inside the namespace to the host
# the gateway in this case is the veth0 that the host currently possesses
ip netns exec $namespace ip route add default via "$veth_host_ip"
# use firewall rules to route packets between the internet interface and veth0 on the host
# masquerade makes it so that packets going from namespace subnet will be made to seem as
# if the packet came from the internet_host
iptables -t nat -A POSTROUTING -s "${namespace_subnet}${namespace_cidr}" -o $internet_host -j MASQUERADE
iptables -A FORWARD -i $internet_host -o $veth_host -j ACCEPT
iptables -A FORWARD -o $internet_host -i $veth_host -j ACCEPT
# use firewall rules inside the namespace to allow packets going to the gateway and DNS
# but disallow packets heading to local and private networks
ip netns exec $namespace iptables -A OUTPUT -d "$veth_host_ip" -j ACCEPT
ip netns exec $namespace iptables -A OUTPUT -d "$dns_host" -j ACCEPT
ip netns exec $namespace iptables -A OUTPUT -d 127.0.0.0/8 -j DROP
ip netns exec $namespace iptables -A OUTPUT -d 10.0.0.0/8 -j DROP
ip netns exec $namespace iptables -A OUTPUT -d 172.16.0.0/12 -j DROP
ip netns exec $namespace iptables -A OUTPUT -d 192.168.0.0/16 -j DROP
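To undo everything above, a teardown sketch (assuming the same variable names as the setup; deleting the namespace destroys veth1, which also destroys its host-side peer veth0):

```shell
# teardown: reverse of the setup above (run as root)
namespace="container"
namespace_subnet="10.0.0.0"
namespace_cidr="/24"
veth_host="veth0"
internet_host="eth0"

# deleting the namespace destroys veth1 inside it, which
# also removes its peer veth0 on the host
ip netns del "$namespace"
# remove the namespace-specific resolv.conf
rm -rf "/etc/netns/$namespace"
# delete the NAT and forwarding rules (-D takes the same rule spec as -A)
iptables -t nat -D POSTROUTING -s "${namespace_subnet}${namespace_cidr}" -o "$internet_host" -j MASQUERADE
iptables -D FORWARD -i "$internet_host" -o "$veth_host" -j ACCEPT
iptables -D FORWARD -o "$internet_host" -i "$veth_host" -j ACCEPT
```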
Here are the relevant commands to check that everything is working:
# check that veth0 is set up with the right IP address at root
ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:d3:07:6c brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fed3:76c/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:55:bf:6b brd ff:ff:ff:ff:ff:ff
inet 172.28.128.3/24 brd 172.28.128.255 scope global eth1
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fe55:bf6b/64 scope link
valid_lft forever preferred_lft forever
7: veth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether c2:1e:24:7a:24:e3 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.1/24 brd 10.0.0.255 scope global veth0
valid_lft forever preferred_lft forever
inet6 fe80::c01e:24ff:fe7a:24e3/64 scope link
valid_lft forever preferred_lft forever
# check that loopback and veth1 are set up with the right IP addresses at namespace
ip netns exec $namespace ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
6: veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 6e:87:58:7a:a0:fb brd ff:ff:ff:ff:ff:ff
inet 10.0.0.2/24 brd 10.0.0.255 scope global veth1
valid_lft forever preferred_lft forever
inet6 fe80::6c87:58ff:fe7a:a0fb/64 scope link
valid_lft forever preferred_lft forever
# check that the veth subnet route is set up at root
ip route
default via 10.0.2.2 dev eth0
10.0.0.0/24 dev veth0 proto kernel scope link src 10.0.0.1
10.0.2.0/24 dev eth0 proto kernel scope link src 10.0.2.15
172.28.128.0/24 dev eth1 proto kernel scope link src 172.28.128.3
# check that the gateway and veth subnet have been set up at namespace
ip netns exec $namespace ip route
default via 10.0.0.1 dev veth1
10.0.0.0/24 dev veth1 proto kernel scope link src 10.0.0.2
# ping root -> namespace
ping -c 3 $veth_namespace_ip
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=0.042 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=0.052 ms
64 bytes from 10.0.0.2: icmp_seq=3 ttl=64 time=0.308 ms
--- 10.0.0.2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.042/0.134/0.308/0.123 ms
# ping root <- namespace
ip netns exec $namespace ping -c 3 $veth_host_ip
PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.065 ms
64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.053 ms
64 bytes from 10.0.0.1: icmp_seq=3 ttl=64 time=0.056 ms
--- 10.0.0.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.053/0.058/0.065/0.005 ms
# ping external ip, test gateway and NAT
ip netns exec $namespace ping -c 3 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=53 time=28.1 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=53 time=28.4 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=53 time=27.8 ms
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2004ms
rtt min/avg/max/mdev = 27.824/28.138/28.427/0.313 ms
# ping external domain, test DNS, gateway and NAT
ip netns exec $namespace ping -c 3 google.com
PING google.com (150.101.161.160) 56(84) bytes of data.
64 bytes from 150.101.161.160: icmp_seq=1 ttl=57 time=30.7 ms
64 bytes from 150.101.161.160: icmp_seq=2 ttl=57 time=26.6 ms
64 bytes from 150.101.161.160: icmp_seq=3 ttl=57 time=32.7 ms
--- google.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2004ms
rtt min/avg/max/mdev = 26.677/30.052/32.738/2.521 ms
# ping external domain, test DNS, gateway, NAT and the firewall
# these pings should fail
ip netns exec $namespace ping -c 3 10.1.1.8.xip.io
ip netns exec $namespace ping -c 3 127.0.0.1.xip.io
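The success and failure cases can also be checked mechanically via ping's exit status, rather than by reading the output. A sketch (the target addresses are arbitrary examples inside the blocked ranges; -W bounds the wait so blocked pings fail fast):

```shell
namespace="container"
# external ping should succeed (exit status 0)
if ip netns exec "$namespace" ping -c 1 -W 2 8.8.8.8 > /dev/null; then
  echo "internet: reachable"
fi
# pings into the dropped private ranges should fail (non-zero exit status)
for target in 192.168.1.1 172.16.0.1 10.1.1.1; do
  if ! ip netns exec "$namespace" ping -c 1 -W 2 "$target" > /dev/null 2>&1; then
    echo "$target: blocked"
  fi
done
```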
# show that the root filter table has the forwarding rules set up
iptables -nvL FORWARD
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
29 2621 ACCEPT all -- eth0 veth0 0.0.0.0/0 0.0.0.0/0
29 2353 ACCEPT all -- veth0 eth0 0.0.0.0/0 0.0.0.0/0
# show that the root nat table has the postrouting rule set up
iptables -t nat -nvL POSTROUTING
Chain POSTROUTING (policy ACCEPT 275 packets, 21058 bytes)
pkts bytes target prot opt in out source destination
18 1429 MASQUERADE all -- * eth0 10.0.0.0/24 0.0.0.0/0
# show that the namespace filter table has the output rules set up
ip netns exec $namespace iptables -nvL OUTPUT
Chain OUTPUT (policy ACCEPT 16 packets, 1344 bytes)
pkts bytes target prot opt in out source destination
11 924 ACCEPT all -- * * 0.0.0.0/0 10.0.0.1
13 1009 ACCEPT all -- * * 0.0.0.0/0 10.0.2.3
0 0 DROP all -- * * 0.0.0.0/0 127.0.0.0/8
0 0 DROP all -- * * 0.0.0.0/0 10.0.0.0/8
0 0 DROP all -- * * 0.0.0.0/0 172.16.0.0/12
0 0 DROP all -- * * 0.0.0.0/0 192.168.0.0/16
# show processes currently listening on TCP at namespace
ip netns exec $namespace ss --listening --numeric --tcp --processes
# create a simple webserver inside the namespace
ip netns exec $namespace socat \
-v -d -d \
TCP-LISTEN:1234,crlf,reuseaddr,fork \
SYSTEM:"
echo HTTP/1.1 200 OK;
echo Content-Type\: text/plain;
echo;
echo \"Server: \$SOCAT_SOCKADDR:\$SOCAT_SOCKPORT\";
echo \"Client: \$SOCAT_PEERADDR:\$SOCAT_PEERPORT\";
"
curl 10.0.0.2:1234
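Any program can be confined the same way. Since `ip netns exec` must run as root, the confined application will typically drop privileges after entering the namespace; a sketch (the account name `appuser` is a placeholder, not part of the setup above):

```shell
namespace="container"
# run the application inside the namespace as an unprivileged user;
# "appuser" is a hypothetical account name
ip netns exec "$namespace" sudo -u appuser curl -s http://example.com/
```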
For IPv6, we cannot use iptables; we would need to use ip6tables or nftables instead.
Even though loading iptables rules can be atomic (via a file load and save of the iptables exported file), we cannot make that atomic with the namespace configuration, which uses ip netns commands; these are two entirely different programs. You can approximate a transaction with a lock: acquire the lock, then create the namespace and veth pair, and clean up if any error occurs.
Then, and only then, load the iptables configuration in one atomic step; if any errors occur, clean up. But there doesn't seem to be a good way to roll back iptables changes. The only way is to save the previous iptables state first, then load the new rules. This means we would need to add rules directly into the iptables dump file, so we need to understand its format and specs. Alternatively, we should investigate what atomicity guarantees nftables has in comparison to iptables. In the end, we need to handle cross-command atomicity and proper cleanup procedures.