@adamhooper
Created December 7, 2019 19:12
Network-namespace a process so it can't access our internal network
import errno
import sys
import textwrap  # used by the CAP_NET_ADMIN hint printed on EPERM
from dataclasses import dataclass

import pyroute2


@dataclass(frozen=True)
class NetworkConfig:
    """
    Network configuration that lets children access the Internet.

    Pyspawner will create a veth interface that may be used to route traffic
    from the child to the Internet via network address translation (NAT).

    *You must write the iptables rules yourself! pyspawner does not invoke
    iptables!* The intent is for you to set up iptables rules once, and then
    reuse the same rules for every clone.

    One iptables rule to route network traffic from a child process to the
    Internet::

        iptables -t nat -A POSTROUTING -s [child_ipv4_address] -j SNAT --to-source=[our IP address]

    You should also firewall the traffic to secure the rest of your network
    from sandboxed processes. See ``tests/setup-sandbox.sh`` for a minimal
    set of iptables rules.

    We do not yet support IPv6, because Kubernetes support is shaky. Follow
    https://github.com/kubernetes/kubernetes/issues/62822.

    Here's how networking works. When cloning, the child process gets a new,
    anonymous network namespace. pyspawner creates a veth pair, and it passes
    the "child" veth interface to the child process. The child process brings
    up its network interface and, given the firewall rules, can only reach
    the public Internet.

    After the child dies, the Linux kernel will delete the network interface.
    (There's a bit of a race here: the interface may exist a few milliseconds
    after the child dies. Pyspawner explicitly deletes any leftover interface
    before creating a new one.)

    Beware if running multiple children at once that all access the Internet.
    Each must have unique interface names and IP addresses.

    The default values match those in ``tests/setup-sandbox.sh``. Don't
    edit one without editing the other.
    """

    parent_veth_name: str = "veth-pyspawn"
    """
    Name of veth interface run by the parent.

    Maximum length is 15 characters. Any longer gives NetlinkError 34
    (ERANGE).

    This name must not conflict with any other network device in the parent's
    container.
    """

    child_veth_name: str = "veth-pyspawn-c"
    """
    Name of veth interface run by the child.

    Maximum length is 15 characters. Any longer gives NetlinkError 34
    (ERANGE).

    This name must not conflict with any other network device in the parent's
    container. (The parent creates this device before sending it into the
    child's network namespace.)
    """

    parent_ipv4_address: str = "192.168.123.1"
    """
    IPv4 address of the parent.

    This must not conflict with any other IP address in the parent's container.

    This should be a private address. Be sure it doesn't conflict with your
    network's addresses. Kubernetes uses 10.0.0.0/8; Docker uses 172.16.0.0/12.
    The hard-coded "192.168.123.0/24" should be safe for Docker and Kubernetes.

    The child will use this address as its default gateway.
    """

    child_ipv4_address: str = "192.168.123.2"
    """
    IPv4 address of the child.

    The parent's caller will maintain iptables rules to route from this IP to
    the public Internet.

    This must be in the same `/24` network block as `parent_ipv4_address`.
    """


def _setup_network_namespace_from_pyspawner(config: NetworkConfig, child_pid: int) -> None:
    """
    Send new veth device to `child_pid`'s network namespace.

    See `_install_network_in_child()` for the child's logic. Read the
    `NetworkConfig` docstring to understand how the network namespace works.

    The child must wait for this to complete before it embarks upon its own
    sandboxing adventure.
    """
    with pyroute2.IPRoute() as ipr:
        # Avoid a race: what if another forked process already created this
        # interface?
        #
        # If that's the case, assume the other process has already exited
        # (because [2019-11-11] we only run one networking-enabled child at a
        # time). So the veth device is about to be deleted anyway.
        try:
            ipr.link("del", ifname=config.parent_veth_name)
        except pyroute2.NetlinkError as err:
            if err.code == errno.ENODEV:
                pass  # common case -- the device doesn't exist
            else:
                if err.code == errno.EPERM:
                    sys.stderr.write(
                        textwrap.dedent(
                            r"""
                            *** pyspawner failed to use netlink. ***

                            Are you using pyspawner in Docker? Docker
                            containers don't have CAP_NET_ADMIN by default. To
                            use pyspawner you'll need to relax this
                            restriction:

                                docker run \
                                    --cap-add NET_ADMIN \
                                    ...
                            """
                        )
                    )
                raise

        # Create parent_veth + child_veth veth pair
        ipr.link(
            "add",
            ifname=config.parent_veth_name,
            peer=config.child_veth_name,
            kind="veth",
        )

        # Bring up parent_veth
        parent_veth_index = ipr.link_lookup(ifname=config.parent_veth_name)[0]
        ipr.addr(
            "add",
            index=parent_veth_index,
            address=config.parent_ipv4_address,
            prefixlen=24,
        )
        ipr.link("set", index=parent_veth_index, state="up")

        # Send child_veth to child namespace
        child_veth_index = ipr.link_lookup(ifname=config.child_veth_name)[0]
        ipr.link("set", index=child_veth_index, net_ns_pid=child_pid)


def _install_network_in_child(config: NetworkConfig) -> None:
    """
    Set up networking, assuming pyspawner passed us a network interface.

    Set IP address of veth interface, then bring it up.

    Also bring up the "lo" interface.

    This requires CAP_NET_ADMIN. Use the "drop_capabilities" sandboxing step
    afterwards to prevent further fiddling.
    """
    with pyroute2.IPRoute() as ipr:
        lo_index = ipr.link_lookup(ifname="lo")[0]
        ipr.link("set", index=lo_index, state="up")

        veth_index = ipr.link_lookup(ifname=config.child_veth_name)[0]
        ipr.addr(
            "add", index=veth_index, address=config.child_ipv4_address, prefixlen=24
        )
        ipr.link("set", index=veth_index, state="up")
        # No destination given, so this becomes the child's default route.
        ipr.route("add", gateway=config.parent_ipv4_address)
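
The `NetworkConfig` docstrings list a few constraints (interface names of at most 15 characters, distinct names, and both addresses in the same `/24`) that the code above does not check before calling netlink. A minimal sketch of such a check follows. It is not part of pyspawner: `SketchConfig`, `find_config_problems` and `MAX_IFNAME_LENGTH` are hypothetical names, and `SketchConfig` is only a stand-in that copies the fields and defaults of `NetworkConfig` so the example runs without pyroute2.

```python
import ipaddress
from dataclasses import dataclass

# Linux interface names hold at most 15 characters
# (IFNAMSIZ is 16, including the trailing NUL).
MAX_IFNAME_LENGTH = 15


@dataclass(frozen=True)
class SketchConfig:
    """Hypothetical stand-in with the same fields and defaults as NetworkConfig."""

    parent_veth_name: str = "veth-pyspawn"
    child_veth_name: str = "veth-pyspawn-c"
    parent_ipv4_address: str = "192.168.123.1"
    child_ipv4_address: str = "192.168.123.2"


def find_config_problems(config) -> list:
    """Return human-readable problems; an empty list means none were found."""
    problems = []

    for name in (config.parent_veth_name, config.child_veth_name):
        if len(name) > MAX_IFNAME_LENGTH:
            problems.append(f"interface name {name!r} is longer than 15 characters")
    if config.parent_veth_name == config.child_veth_name:
        problems.append("parent and child interface names must differ")

    parent_ip = ipaddress.IPv4Address(config.parent_ipv4_address)
    child_ip = ipaddress.IPv4Address(config.child_ipv4_address)
    if parent_ip == child_ip:
        problems.append("parent and child addresses must differ")

    # Both addresses are added with prefixlen=24, so they must share one /24.
    network = ipaddress.IPv4Network(f"{parent_ip}/24", strict=False)
    if child_ip not in network:
        problems.append(f"{child_ip} is not in {network}")

    return problems
```

Calling `find_config_problems(SketchConfig())` returns an empty list for the defaults. A caller that runs several children at once could run the same check on each per-child config before spawning.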
#!/bin/bash
# Run this once: the parent process sets up a firewall so it can forward
# traffic from children when they are created.
#
# The firewall cites a network interface (veth-pyspawn) and child's IP
# address that carry no meaning until after we spawn a child.
set -e
# These values mirror the NetworkConfig defaults in
# pyspawner/pyspawner/sandbox.py. The rules below refer to $PARENT_VETH.
PARENT_VETH=veth-pyspawn
CHILD_VETH_IP4="192.168.123.2"
# iptables
# "ip route get 1.1.1.1" will display the default route. It looks like:
# 1.1.1.1 via 192.168.86.1 dev wlp2s0 src 192.168.86.70 uid 1000
# Grep for the "src x.x.x.x" part and store the "x.x.x.x"
ipv4_snat_source=$(ip route get 1.1.1.1 | grep -oe "src [^ ]\+" | cut -d' ' -f2)
cat << EOF | iptables-legacy-restore --noflush
*filter
:INPUT ACCEPT
:FORWARD DROP
# Block access to the host itself from a child.
-A INPUT -i $PARENT_VETH -j REJECT
# Allow forwarding response packets back to the child (even
# though the child's IP is in one of the blocked ranges below).
-A FORWARD -o $PARENT_VETH -j ACCEPT
# Block unsafe destination addresses. Children should not be
# able to access internal services. (Not even our DNS server.)
-A FORWARD -d 0.0.0.0/8 -i $PARENT_VETH -j REJECT
-A FORWARD -d 10.0.0.0/8 -i $PARENT_VETH -j REJECT
-A FORWARD -d 100.64.0.0/10 -i $PARENT_VETH -j REJECT
-A FORWARD -d 127.0.0.0/8 -i $PARENT_VETH -j REJECT
-A FORWARD -d 169.254.0.0/16 -i $PARENT_VETH -j REJECT
-A FORWARD -d 172.16.0.0/12 -i $PARENT_VETH -j REJECT
-A FORWARD -d 192.0.0.0/24 -i $PARENT_VETH -j REJECT
-A FORWARD -d 192.0.2.0/24 -i $PARENT_VETH -j REJECT
-A FORWARD -d 192.88.99.0/24 -i $PARENT_VETH -j REJECT
-A FORWARD -d 192.168.0.0/16 -i $PARENT_VETH -j REJECT
-A FORWARD -d 198.18.0.0/15 -i $PARENT_VETH -j REJECT
-A FORWARD -d 198.51.100.0/24 -i $PARENT_VETH -j REJECT
-A FORWARD -d 203.0.113.0/24 -i $PARENT_VETH -j REJECT
-A FORWARD -d 224.0.0.0/4 -i $PARENT_VETH -j REJECT
-A FORWARD -d 240.0.0.0/4 -i $PARENT_VETH -j REJECT
-A FORWARD -d 255.255.255.255/32 -i $PARENT_VETH -j REJECT
# Allow forwarding exactly the source address of the child.
# Don't forward just any address (i.e. don't set policy
# ACCEPT): if a child somehow gains CAP_NET_ADMIN (which
# shouldn't happen) it should not be able to spoof source
# addresses.
-A FORWARD -i $PARENT_VETH -s $CHILD_VETH_IP4 -j ACCEPT
COMMIT
*nat
:POSTROUTING ACCEPT
-A POSTROUTING -s $CHILD_VETH_IP4 -j SNAT --to-source $ipv4_snat_source
COMMIT
EOF
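
To sanity-check which destinations the `REJECT` rules above cover, the same address blocks can be evaluated with Python's standard `ipaddress` module. This is only a sketch: `BLOCKED_DESTINATIONS` and `is_blocked_destination` are hypothetical names, and the function models just the destination match, not rule ordering, the source-address check, or the `FORWARD DROP` policy.

```python
import ipaddress

# Destination blocks copied from the "-A FORWARD -d ... -j REJECT" rules above.
BLOCKED_DESTINATIONS = [
    ipaddress.IPv4Network(cidr)
    for cidr in (
        "0.0.0.0/8",
        "10.0.0.0/8",
        "100.64.0.0/10",
        "127.0.0.0/8",
        "169.254.0.0/16",
        "172.16.0.0/12",
        "192.0.0.0/24",
        "192.0.2.0/24",
        "192.88.99.0/24",
        "192.168.0.0/16",
        "198.18.0.0/15",
        "198.51.100.0/24",
        "203.0.113.0/24",
        "224.0.0.0/4",
        "240.0.0.0/4",
        "255.255.255.255/32",
    )
]


def is_blocked_destination(address: str) -> bool:
    """Return True if `address` falls inside any rejected destination block."""
    ip = ipaddress.IPv4Address(address)
    return any(ip in network for network in BLOCKED_DESTINATIONS)
```

For example, `is_blocked_destination("10.1.2.3")` is true and `is_blocked_destination("1.1.1.1")` is false. The parent's own veth address, 192.168.123.1, also falls in a blocked range (192.168.0.0/16).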