@artizirk
Last active June 11, 2023 22:58
systemd-nspawn container architecture https://wiki.wut.ee/en/sysadmin/systemd-nspawn_containers

systemd-nspawn container architecture

This short document shows how to turn systemd-nspawn into a usable containerization system.

These instructions should work under Arch Linux and Debian 10 (Buster).

Host requirements

  • systemd-nspawn and machinectl (systemd-container package under Debian)
  • dnsmasq
  • debootstrap

Setup

  1. We need to create a network bridge device to which systemd-nspawn can connect all the containers. With systemd-networkd, this takes two files under /etc/systemd/network/.

    /etc/systemd/network/br0.netdev

     # Tell systemd-networkd to create a bridge device
     [NetDev]
     Name=br0
     Kind=bridge
    

    /etc/systemd/network/br0.network

     # Configure ip address for the bridge
     [Match]
     Name=br0
     
     [Network]
     Address=172.23.0.1/24
     LinkLocalAddressing=yes
     IPMasquerade=yes
     LLDP=yes
     EmitLLDP=customer-bridge
    

    After creating these configuration files, run the command below to enable and start systemd-networkd. The bridge should then be up.

     systemctl enable --now systemd-networkd
    

    If you later change the network configuration, restart systemd-networkd to reload it.

     systemctl restart systemd-networkd
    
  2. To make sure every container always gets the same static IP address, we are going to use dnsmasq.

Next we tell dnsmasq to hand out static leases based on the container hostname. This config file uses 172.23.0.0/24 as the container network (the bridge itself is 172.23.0.1).

/etc/dnsmasq.conf

    domain=.local
    #no-poll # don't constantly poll /etc/resolv.conf
    #resolv-file=/etc/resolv.conf
    no-resolv
    server=8.8.8.8
    server=8.8.4.4
    domain-needed
    bogus-priv
    listen-address=127.0.0.1,172.23.0.1
    # Bind only to the interfaces listed in listen-address; this lets systemd-resolved run at the same time
    bind-dynamic
    dhcp-range=172.23.0.100,172.23.0.200,255.255.255.0,12h
    
    # When systemd-networkd or some other DHCP client requests an IP address,
    # it is given an address based on its hostname.
    # This section can be moved to a separate file using the dhcp-hostsfile option
    dhcp-host=http,172.23.0.2
    dhcp-host=mail,172.23.0.3
    dhcp-host=wordpress,172.23.0.4
    dhcp-host=sql,172.23.0.5
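
The comment about dhcp-hostsfile can be sketched as a small shell helper. This is only an illustration: the function name gen_dhcp_hosts and the file path in the usage note are made up for this example. Lines in a dhcp-hostsfile use the same format as the text to the right of dhcp-host=, so each line is just hostname,address.

```shell
#!/bin/sh
# Hypothetical helper (not part of dnsmasq): print one "hostname,address"
# line per container name, numbering addresses upward from 172.23.0.2.
gen_dhcp_hosts() {
    prefix="172.23.0"   # container network from the config above
    octet=2             # 172.23.0.1 is the bridge itself
    for name in "$@"; do
        printf '%s,%s.%s\n' "$name" "$prefix" "$octet"
        octet=$((octet + 1))
    done
}
```

Assuming a file such as /etc/dnsmasq-hosts.conf (a hypothetical path), you could run `gen_dhcp_hosts http mail > /etc/dnsmasq-hosts.conf`, add `dhcp-hostsfile=/etc/dnsmasq-hosts.conf` to /etc/dnsmasq.conf in place of the dhcp-host= lines, and restart dnsmasq. With fewer than 98 names the generated addresses stay below the dynamic range that starts at 172.23.0.100.
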
  3. systemd-nspawn@.service must be told to use the br0 bridge. To do that, edit the ExecStart line in the service file and append --network-bridge=br0 to it.

    First get the currently used ExecStart value by running

     systemctl cat systemd-nspawn@.service
    

    After that, create an override.conf file that replaces that line. You can do that either with the command systemctl edit systemd-nspawn@.service or by manually creating the file /etc/systemd/system/systemd-nspawn@.service.d/override.conf.

    Contents of the override file should be something like this:

     [Service]
     ExecStart=
     ExecStart=<current execstart line> --network-bridge=br0
    

    For Arch Linux the contents would be

     [Service]
     ExecStart=
     ExecStart=/usr/bin/systemd-nspawn --quiet --keep-unit --boot --link-journal=try-guest --network-veth -U --settings=override --machine=%i --network-bridge=br0
    

    NB: by default the nspawn service adds the -U argument, which turns on private users support and shifts all UIDs/GIDs up by some random amount. If you plan on sharing files between containers, this will mess up your file owners.

    You can enable the mymachines nsswitch module, which translates user and group IDs between the host and the container's private users. https://www.freedesktop.org/software/systemd/man/nss-mymachines.html
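
The override file described above can also be generated instead of copied by hand. The following is a rough sketch, not something shipped with systemd: build_override is a hypothetical helper, and it assumes the packaged unit has a single-line ExecStart= with no backslash continuations and no --network-bridge option yet.

```shell
#!/bin/sh
# Hypothetical helper: read unit text on stdin and print override.conf
# contents that append --network-bridge=<bridge> to the first ExecStart= line.
# Exits non-zero if no ExecStart= line is found.
build_override() {
    bridge="${1:-br0}"
    awk -v bridge="$bridge" '
        /^ExecStart=/ && !found {
            found = 1
            print "[Service]"
            print "ExecStart="    # empty assignment clears the packaged value
            print $0 " --network-bridge=" bridge
        }
        END { if (!found) exit 1 }
    '
}
```

A possible use is `systemctl cat systemd-nspawn@.service | build_override br0`, with the output saved as /etc/systemd/system/systemd-nspawn@.service.d/override.conf and followed by `systemctl daemon-reload`. Check the result against the original ExecStart line before relying on it.
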

Creating containers

This part is quite simple: create a directory /var/lib/machines/<name>, where <name> is your container hostname, and debootstrap into it. machinectl start <name> should then boot your container, and after starting systemd-networkd inside it you should have a working network.

I have created some scripts to automate the container creation part:

  • create_arch_container.sh - Creates an Arch Linux container

  • create_deb_container.sh - Creates an Ubuntu 16.04 container, but you can easily change that by changing the SUITE argument from xenial to something else

  • create_container.sh - Creates a Debian 10 container on a ZFS pool, also configures locale and timezone inside the container.

    NB: debootstrap copies your host /etc/hostname file into the container. Remove it or replace it with your container hostname.
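
For orientation, the manual steps described above (debootstrap, then fix the hostname) might look roughly like this. It is a minimal sketch and not the contents of the scripts listed above: the function names, the SUITE and MIRROR defaults, and the MACHINES_DIR variable are all assumptions made for this example.

```shell
#!/bin/sh
# Assumed defaults; adjust suite and mirror for your release.
MACHINES_DIR="${MACHINES_DIR:-/var/lib/machines}"
SUITE="${SUITE:-buster}"
MIRROR="${MIRROR:-http://deb.debian.org/debian}"

# Replace the hostname copied from the host with the container's own name,
# so that dnsmasq can match it against its dhcp-host entries.
fix_hostname() {
    name="$1"
    printf '%s\n' "$name" > "$MACHINES_DIR/$name/etc/hostname"
}

# Bootstrap a Debian root tree into /var/lib/machines/<name> (needs root),
# then fix its hostname.
bootstrap_container() {
    name="$1"
    debootstrap "$SUITE" "$MACHINES_DIR/$name" "$MIRROR" &&
        fix_hostname "$name"
}
```

After `bootstrap_container http`, `machinectl start http` should boot the container. Networking inside it still depends on a DHCP client such as systemd-networkd being enabled there.
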

Administrating nspawn containers

  • machinectl start <name> - Boot the container named <name>

  • machinectl poweroff <name> - Power off the container

  • machinectl shell <name> [/bin/sh] - Run a command in the container, by default the command is /bin/sh. For Debian/Ubuntu containers you probably want to tell it to use /bin/bash

  • machinectl reboot <name> - this will probably break nspawn: on startup it tries to register with machined and fails, because machined takes some time to release the container name. (There is a bug report about it; the comments below point to systemd/systemd#2809.)

  • Port forwarding: if the file /etc/systemd/nspawn/<name>.nspawn exists, systemd-nspawn loads additional configuration options from it.

    To do port forwarding you can do something like this:

      # /etc/systemd/nspawn/http.nspawn 
      [Network]
      Port=tcp:80:80
      Port=tcp:443:443
    

    You can add many more options from the systemd.nspawn(5) man page.
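
If you forward ports for several containers, writing the .nspawn file can be scripted. This is only a sketch: write_port_forwards and the NSPAWN_DIR variable are hypothetical, the helper only handles TCP with identical host and container ports, and it overwrites any existing file for that container.

```shell
#!/bin/sh
NSPAWN_DIR="${NSPAWN_DIR:-/etc/systemd/nspawn}"

# Hypothetical helper: write_port_forwards <name> <port>...
# Writes <name>.nspawn with one Port=tcp:<port>:<port> line per argument.
# Overwrites the file, so other settings in it would be lost.
write_port_forwards() {
    name="$1"
    shift
    {
        printf '[Network]\n'
        for port in "$@"; do
            printf 'Port=tcp:%s:%s\n' "$port" "$port"
        done
    } > "$NSPAWN_DIR/$name.nspawn"
}
```

For example, `write_port_forwards http 80 443` produces the same file as shown above. The container has to be powered off and started again for the new settings to be read.
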

Things that are broken under Debian

Debian 10 (Buster) defaults to the modern nftables, but systemd still uses the legacy iptables framework for firewall configuration. Don't mix the two.
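
To see which backend the iptables command on your host is using, you can look at the suffix that iptables 1.8 prints in its version string. The helper below is a small sketch with a made-up name (iptables_backend). It only classifies a string and does not change anything on the system.

```shell
#!/bin/sh
# Hypothetical helper: classify a version string as printed by
# `iptables --version`, e.g. "iptables v1.8.2 (nf_tables)".
iptables_backend() {
    case "$1" in
        *"(nf_tables)"*) echo nftables ;;
        *"(legacy)"*)    echo legacy ;;
        *)               echo unknown ;;   # older releases print no suffix
    esac
}
```

Typical use would be `iptables_backend "$(iptables --version)"`. If it reports nftables while other rules on the host were created with the legacy tools, that is the mix to avoid.
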

@plaes

plaes commented Jan 24, 2018

Tried out your approach with Debian Stretch (current stable):

  • Debian bug #787480 has been fixed
  • Standard systemd unit for dnsmasq seems to work ok (or just use override?)
  • machinectl reboot <container> issue is systemd/systemd#2809

Also, one thing worth mentioning is that /etc/hostname needs to be cleaned from the containers (or updated with the proper container hostname), because debootstrap at least writes the same hostname as the host has, and dnsmasq then does not give out the static IP for the machine.

@artizirk
Author

Things work mostly the same with Debian 10 Buster. I have added bind-dynamic to dnsmasq, which allows systemd-resolved to coexist with it. Debian 10 defaults to nftables, but systemd uses legacy iptables. The standard dnsmasq service file works.
