@Jip-Hop
Last active August 27, 2024 16:40
Persistent Debian 'jail' on TrueNAS SCALE to install software (docker-compose, portainer, podman, etc.) with full access to all files via bind mounts, without modifying the host OS at all, thanks to systemd-nspawn!
@Jip-Hop
Author

Jip-Hop commented Jan 14, 2023

Thanks for reporting this @abe520! I tested in a clean VM with TrueNAS-SCALE-23.10-MASTER-20230113-020709 and encountered the same issue.

I was able to find the culprit in these logs: journalctl | grep docker.

failed to start daemon: Error initializing network controller: error obtaining controller instance: Enabling IP forwarding failed: open /proc/sys/net/ipv4/ip_forward: read-only file system

Docker needs IP forwarding to be enabled, but can't enable it in the jail because /proc/sys/net/ipv4/ip_forward is read-only there. A quick fix is to run echo 1 > /proc/sys/net/ipv4/ip_forward on the host before starting the jail (this has to be repeated on each boot).

When you've done this you should be able to use docker on cobia nightly. Let me know how it goes. 🙂

Edit: in order to permanently enable IP Forwarding you could follow these steps (source):

System Settings -- Advanced -- Sysctl -- Add
Variable = net.ipv4.ip_forward
Value = 1
Description = Enable IP forwarding
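
For anyone who wants to script the quick fix, here is a minimal shell sketch of the check described above. The helper name ip_forward_enabled and its optional path argument are made up for illustration; only the /proc path comes from this thread.

```shell
# Hypothetical helper (not part of the jail script): succeeds when IP
# forwarding is switched on. The optional argument points the check at
# another file, which is only useful for testing.
ip_forward_enabled() {
    file="${1:-/proc/sys/net/ipv4/ip_forward}"
    [ -r "$file" ] && [ "$(cat "$file")" = "1" ]
}

# Intended use on the host (as root), before starting the jail:
#   ip_forward_enabled || echo 1 > /proc/sys/net/ipv4/ip_forward
```

This only covers the per-boot workaround; the Sysctl entry in the steps above is what makes the setting stick.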

@Jip-Hop
Author

Jip-Hop commented Jan 14, 2023

The script has been updated to allow creating jails with their own IP address by using the macvlan option. A macvlan interface is a virtual interface that adds a second MAC address to an existing physical Ethernet link. The IP address is obtained automatically via DHCP. The wizard will ask you to specify an ethernet network interface to create a macvlan interface from, which you can leave blank if you wish to use host networking instead.
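
As a rough illustration of what that choice amounts to, the sketch below maps the wizard's answer to a systemd-nspawn flag. The helper name nspawn_network_flag and the interface name eno1 are assumptions, and this is not necessarily how the script does it internally; --network-macvlan= itself is a regular systemd-nspawn option.

```shell
# Hypothetical helper: turn the interface name entered in the wizard into a
# systemd-nspawn networking flag. An empty answer prints nothing, so the
# jail keeps using host networking (the systemd-nspawn default).
nspawn_network_flag() {
    iface="$1"
    if [ -n "$iface" ]; then
        printf '%s\n' "--network-macvlan=$iface"
    fi
}

# Example (eno1 and the rootfs path are placeholders):
#   systemd-nspawn $(nspawn_network_flag eno1) --boot --directory=/path/to/rootfs
```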

@Jip-Hop
Author

Jip-Hop commented Jan 14, 2023

@Ixian

This looks interesting regarding Nvidia GPU support: https://github.com/NVIDIA/libnvidia-container#command-line-example. nvidia-container-cli is preinstalled on TrueNAS SCALE. Looks like this can be used to 'prepare' the rootfs of a container so nvidia GPU can be used (ensures host version of the driver is matched). I think this is also what LXD does. But I can't try it since I don't have a GPU...

What do nvidia-container-cli info and nvidia-container-cli list output?

By the way, now that the script has the macvlan option, there's no longer any need to add a second IP address (alias) to the network interface through the TrueNAS SCALE interface. I prefer this, since I can now leave DHCP enabled for both the server and the jails. 😄

@Jip-Hop
Author

Jip-Hop commented Jan 14, 2023

nvidia-container-cli list --help
Usage: nvidia-container-cli list [OPTION...]
Query the driver and list the components required in order to configure a
container with GPU support.

 Options:
  -?, --help                 Give this help list
  -b, --binaries             List driver binaries
      --compat32             Enable 32bits compatibility
  -d, --device=ID            Device UUID(s) or index(es) to list
  -f, --firmwares            List driver firmwares
  -i, --ipcs                 List driver ipcs
  -l, --libraries            List driver libraries
      --mig-config=ID        MIG devices to list config capabilities files for
      --mig-monitor=ID       MIG devices to list monitor capabilities files for
      --usage                Give a short usage message
  -V, --version              Print program version

This is promising!
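
A rough, untested sketch of how that list could feed the jail: it assumes nvidia-container-cli list prints one host path per line, which has not been confirmed in this thread. The helper name paths_to_bind_ro_flags is hypothetical; --bind-ro= is a regular systemd-nspawn option, and --libraries and --binaries come from the help text above.

```shell
# Hypothetical helper: read file paths on stdin (one per line) and print a
# read-only systemd-nspawn bind flag for every path that exists on the host.
paths_to_bind_ro_flags() {
    while IFS= read -r path; do
        [ -e "$path" ] && printf '%s\n' "--bind-ro=$path"
    done
    return 0
}

# Assumed usage on a host with an NVIDIA GPU (untested):
#   nvidia-container-cli list --libraries --binaries | paths_to_bind_ro_flags
```

Device nodes would probably need a writable bind mount instead, so this sketch deliberately sticks to libraries and binaries.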

@Jip-Hop
Author

Jip-Hop commented Jan 14, 2023

It would be great if anyone with an nvidia GPU could try the following:

git clone -b feature-mount-nvidia --single-branch --depth 1 https://gist.github.com/Jip-Hop/4704ba4aa87c99f342b2846ed7885a5d jailmaker

Then run jailmaker.sh and choose to install docker and mount nvidia drivers.

Please check if hardware acceleration works inside a docker container in the jail. 😃

@abe520

abe520 commented Jan 14, 2023

Thanks for reporting this @abe520! I tested in a clean VM with TrueNAS-SCALE-23.10-MASTER-20230113-020709 and encountered the same issue.

I was able to find the culprit in these logs: journalctl | grep docker.

failed to start daemon: Error initializing network controller: error obtaining controller instance: Enabling IP forwarding failed: open /proc/sys/net/ipv4/ip_forward: read-only file system

Docker needs IP forwarding to be enabled, but can't enable it in the jail because /proc/sys/net/ipv4/ip_forward is read-only there. A quick fix is to run echo 1 > /proc/sys/net/ipv4/ip_forward on the host before starting the jail (this has to be repeated on each boot).

When you've done this you should be able to use docker on cobia nightly. Let me know how it goes. 🙂

Edit: in order to permanently enable IP Forwarding you could follow these steps (source):

System Settings -- Advanced -- Sysctl -- Add
Variable = net.ipv4.ip_forward
Value = 1
Description = Enable IP forwarding

I'm sorry, I have broken my SSD, so I can't continue testing for a few days.
Can I use a passthrough NIC or SR-IOV to run the jail?

@Jip-Hop
Author

Jip-Hop commented Jan 14, 2023

I don't know about SR-IOV but you can assign a physical NIC to the jail. Look through the networking options. You can pass these as additional flags to jailmaker.sh. Good luck with the SSD!
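
A minimal sketch of the physical NIC case, assuming the systemd-nspawn option --network-interface= is the one you pick from the networking options. The helper name nspawn_nic_flag is hypothetical, and its second argument exists only so the check can be tested against a temporary directory.

```shell
# Hypothetical helper: print the flag that hands a NIC to the jail, but only
# if the interface is known to the host. The second argument overrides the
# sysfs directory and is meant for testing.
nspawn_nic_flag() {
    iface="$1"
    sysdir="${2:-/sys/class/net}"
    if [ -n "$iface" ] && [ -e "$sysdir/$iface" ]; then
        printf '%s\n' "--network-interface=$iface"
    else
        echo "no such interface: $iface" >&2
        return 1
    fi
}
```

Keep in mind that systemd-nspawn moves an interface assigned this way into the jail, so the host cannot use it while the jail is running.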

@injeolmibingsoo

Glad to hear you want to vote. 😄 Do you already have an account for Jira? You can log in or sign up at the top right. Once you've logged in you'll be able to vote (look for the thumbs up button, also in the top right).

Please spread the word haha!

Voted! I was logged in but it turned out I needed to be accepted into iXsystems first.

@qudiqudi

This looks very cool. Can anyone share some experience regarding stability? Do these jails survive unscheduled reboots (power outage)? And what about the mounted data: in light of the recent community outcry about host path verification, does systemd-nspawn require similarly structured paths when NFS/SMB sharing comes into play?

@sprint1849

sprint1849 commented Jan 25, 2023

Hi, thank you for making this. I managed to test it and it is working fine. However, I just can't wrap my head around how to do VLANs here.
I'm trying to attach the jail to a VLAN I created on TrueNAS SCALE by entering that VLAN's interface name as the macvlan interface in your script, but I don't think it's working.

The other thing I tried is entering my host interface in the macvlan setting. That works, and I get an IP address from DHCP. But with this setup I can't put my Docker apps on VLANs. It's too complicated for me.

What is the most efficient way to do this? I just want to segregate my apps by VLAN. This is the only thing keeping me away from TrueNAS SCALE's Kubernetes: it doesn't seem possible to segregate apps by VLAN there.

@Jip-Hop
Author

Jip-Hop commented Jan 25, 2023

@qudiqudi Sorry I don't know what you mean with regards to host path verification and nfs/smb. Can you elaborate?

@sprint1849 Why do you want to do VLAN in there? Can't you just use host networking and use docker to make separate networks for your docker apps? That's what I do, with docker compose (Portainer). Are you saying this doesn't work when you're using the macvlan option of systemd-nspawn?

Thanks for testing!
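
One possible reading of this suggestion for the VLAN case is a Docker macvlan network whose parent is a VLAN interface that the jail can see through host networking. The sketch below is only that, a sketch: the helper name create_vlan_network and every name and address in the example are placeholders, not values from this thread.

```shell
# Hypothetical helper: create a Docker macvlan network on top of a VLAN
# interface. All four arguments are your own values.
create_vlan_network() {
    name="$1"      # Docker network name, e.g. vlan50
    parent="$2"    # VLAN interface visible inside the jail, e.g. vlan50
    subnet="$3"    # e.g. 192.168.50.0/24
    gateway="$4"   # e.g. 192.168.50.1
    docker network create -d macvlan \
        --subnet="$subnet" --gateway="$gateway" \
        -o parent="$parent" "$name"
}

# Example with placeholder values:
#   create_vlan_network vlan50 vlan50 192.168.50.0/24 192.168.50.1
```

Containers can then be attached to the network by name, for instance from a compose file in Portainer.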

@qudiqudi

Sorry I don't know what you mean with regards to host path verification and nfs/smb. Can you elaborate?

Sure, it's essentially a measure to prevent data loss and permission issues, and it was implemented in TrueNAS SCALE 22.12 Bluefin. Although it was present even back on CORE with iocage jails, it came to light with Bluefin because iX now forces you to have host path verification enabled by default and doesn't give support to users who have it disabled.
It basically ensures that a mount path is only used by one app (Kubernetes) at a time. This means the same path cannot also be used by an SMB or NFS share. For Plex users, for example, this is very unfortunate: you can't access your Plex library via SMB anymore. There are ways to circumvent this, but it's always a compromise.

@Ixian

Ixian commented Jan 25, 2023

Sorry I don't know what you mean with regards to host path verification and nfs/smb. Can you elaborate?

Sure, it's essentially a measure to prevent data loss and permission issues, and it was implemented in TrueNAS SCALE 22.12 Bluefin. Although it was present even back on CORE with iocage jails, it came to light with Bluefin because iX now forces you to have host path verification enabled by default and doesn't give support to users who have it disabled. It basically ensures that a mount path is only used by one app (Kubernetes) at a time. This means the same path cannot also be used by an SMB or NFS share. For Plex users, for example, this is very unfortunate: you can't access your Plex library via SMB anymore. There are ways to circumvent this, but it's always a compromise.

This isn't a problem with either the "native" docker workaround or this solution here, which leverages a systemd-nspawn jail. I just mount my entire data directory into the jail, and my containers running inside the jail can access everything I allow them to just fine.

@qudiqudi

This isn't a problem with either the "native" docker workaround or this solution here, which leverages a systemd-nspawn jail. I just mount my entire data directory into the jail, and my containers running inside the jail can access everything I allow them to just fine.

The question is not whether they are allowed to but rather if it carries any risks to the data due to permission issues. Are ACLs on the host datasets overwritten by jails that access the same data?

@sprint1849

sprint1849 commented Jan 26, 2023

@qudiqudi Sorry I don't know what you mean with regards to host path verification and nfs/smb. Can you elaborate?

@sprint1849 Why do you want to do VLAN in there? Can't you just use host networking and use docker to make separate networks for your docker apps? That's what I do, with docker compose (Portainer). Are you saying this doesn't work when you're using the macvlan option of systemd-nspawn?

Thanks for testing!

Hi, thanks for the suggestion. I followed it and used host networking, then put my apps on VLANs inside the jail using the host VLAN interface. It's working great. The other thing I mentioned, which doesn't work, is this setup:
host VLAN interface entered as the macvlan interface in the script

Before I use this in production, I have some observations to share, and I need your advice.

In this setup, I noticed that the Docker daemon does not start in the jail if the pool is unselected in the host's Apps settings. So I wonder whether the jail still depends on the host's Docker?

How safe is this jail thing? Sorry, I don't understand the system very well. It seems quite powerful when using the jail shell.

Stability? I guess this will be answered by us testers.

Will this survive future updates, especially when TrueNAS moves to containerd?

I like that this implementation doesn't use a VM and can bind-mount datasets. Kubernetes is too complicated for me. I just want apps segregated by VLAN, and that is easily achievable with your script. Thanks to you, using Docker with Portainer is possible on TrueNAS SCALE without much hacking.

Edit:
I'd like to ask whether it's possible to implement a VLAN per jail in your script, where you create a VLAN in the TrueNAS network settings and attach the jail to that VLAN?

@Jip-Hop
Author

Jip-Hop commented Jan 28, 2023

@Ixian Do you also see these warnings when running docker info inside the jail (when using host networking)?

WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
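
To make it easier to compare results, here is a small sketch for counting these warnings. The helper name count_bridge_nf_warnings is hypothetical. As far as I know, Docker prints these warnings when the matching /proc/sys/net/bridge/ entries are missing or set to 0.

```shell
# Hypothetical helper: count bridge-nf-call warnings in text read from stdin.
# grep exits non-zero when nothing matches, so "|| true" keeps the helper
# usable in scripts that run with "set -e".
count_bridge_nf_warnings() {
    grep -c 'bridge-nf-call-ip6*tables is disabled' || true
}

# Assumed usage inside the jail (warnings may be printed on stderr):
#   docker info 2>&1 | count_bridge_nf_warnings
```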

And were you able to confirm that this method of mounting the nvidia drivers from the host works? 😄

I started over with a new version of the script. I think it's turning out quite nice. 🙂 Will post as a proper git repo with readme soon.

@qudiqudi

Looking forward to that git. This has huge potential. Works pretty well for me so far.

@Jip-Hop
Author

Jip-Hop commented Jan 28, 2023

Here's the new script. Looking forward to your feedback!

@Jip-Hop
Author

Jip-Hop commented Jan 29, 2023

@sprint1849

In this setup, I noticed that the Docker daemon does not start in the jail if the pool is unselected in the host's Apps settings. So I wonder whether the jail still depends on the host's Docker?

It doesn't depend on the host docker. But you may be running into this issue.

Please check out the new script. It works a bit differently than the one in this gist, but has better documentation. Should also fix the above issue.

How safe is this jail thing? Sorry, I don't understand the system very well. It seems quite powerful when using the jail shell.

The jail has its own root filesystem and you can install packages in it without conflicting with packages on the host. But the jail also shares resources (the kernel, bind mounts, networking, etc.) with the host. The jail is not fully isolated. So I can't really answer this question without knowing what you want it to be safe from. 😄
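
Since bind mounts are one of the shared resources, one way to limit what the jail can change is to mount data read-only where possible. A minimal sketch, assuming the systemd-nspawn options --bind= and --bind-ro=; the helper name nspawn_bind_flag and the paths in the example are hypothetical.

```shell
# Hypothetical helper: build a systemd-nspawn bind-mount flag.
# Mode "ro" gives a read-only mount; anything else gives read-write.
# The destination defaults to the same path as the source.
nspawn_bind_flag() {
    mode="$1"
    src="$2"
    dst="${3:-$2}"
    case "$mode" in
        ro) printf '%s\n' "--bind-ro=$src:$dst" ;;
        *)  printf '%s\n' "--bind=$src:$dst" ;;
    esac
}

# Example with placeholder paths:
#   nspawn_bind_flag ro /mnt/tank/media /media
#   nspawn_bind_flag rw /mnt/tank/apps
```

A read-only mount only stops writes to that path from inside the jail; it does not change the fact that the kernel and networking are shared.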

Will this survive future updates, especially when TrueNAS moves to containerd?

Yes! That's kind of the point of jailmaker. 🙂

Edit:
I'd like to ask whether it's possible to implement a VLAN per jail in your script, where you create a VLAN in the TrueNAS network settings and attach the jail to that VLAN?

I don't think there's a need to implement anything as jailmaker allows you to use additional systemd-nspawn options. Have a look at the Networking Options. Perhaps my Advanced Networking notes are helpful to you too.

@Jip-Hop
Author

Jip-Hop commented Jan 29, 2023

@qudiqudi

The question is not whether they are allowed to but rather if it carries any risks to the data due to permission issues. Are ACLs on the host datasets overwritten by jails that access the same data?

Sorry I don't know the answer to that. I don't use ACLs on the files I mount inside the jail. Perhaps you could try it out and report your findings?

@Ixian

Ixian commented Jan 29, 2023

@Ixian Do you also see these warnings when running docker info inside the jail (when using host networking)?

WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

And were you able to confirm that this method of mounting the nvidia drivers from the host works? 😄

I started over with a new version of the script. I think it's turning out quite nice. 🙂 Will post as a proper git repo with readme soon.

No, I was not getting those warnings.

Going to try your new script in a bit :)

@Jip-Hop
Author

Jip-Hop commented Feb 4, 2023

Jailmaker is now written in Python instead of bash.

@qudiqudi

qudiqudi commented Feb 5, 2023

@Jip-Hop I just switched to the Python version. Works very well! Thanks!
Why did you decide against the option to automatically install Docker in the install wizard? Or are you planning on putting it back in?

@Jip-Hop
Author

Jip-Hop commented Feb 5, 2023

Thanks for letting me know :) Have you tried what happens with ACLs?

Jailmaker initially installed just debian bullseye, so it was easy for me to implement the official steps to install docker. But now you may choose from different distros. In order for me to automate installing docker on those distros I'd basically have to write code which does whatever get.docker.com does, or I could run the get.docker.com script from Jailmaker. But docker doesn't recommend installing with the get.docker.com script and they also advise against 'blindly' downloading and running that script (which would happen if Jailmaker runs it).

So I decided to keep Jailmaker small. Less code makes it easier for people to decide if they can trust it. And it's less maintenance for me :) Installing docker is still just a few commands away. But by not automating it with Jailmaker everyone can decide for themselves to follow the official install guide or use the install script.
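
For those who do want to use the install script, a sketch of the non-blind variant: download it to a file, read it, and only then run it. The helper name fetch_for_review is hypothetical, and the sketch assumes curl is installed in the jail.

```shell
# Hypothetical helper: download an install script to a file for review
# instead of piping it straight into a shell.
fetch_for_review() {
    url="$1"
    dest="$2"
    curl -fsSL "$url" -o "$dest" || return 1
    echo "Saved to $dest; read it before running: sh $dest"
}

# Example, inside the jail:
#   fetch_for_review https://get.docker.com get-docker.sh
#   less get-docker.sh
#   sh get-docker.sh
```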

I think Jailmaker should stick to whatever is specific about TrueNAS and systemd-nspawn and leave whatever is generic to the user.

How did you install docker inside the jail?

@Jip-Hop
Author

Jip-Hop commented Feb 24, 2023

@Ixian have you tried to use the latest script which mounts the nvidia driver files from the host? Still curious if you can confirm this approach works well.

@Ixian

Ixian commented Feb 24, 2023

@Ixian have you tried to use the latest script which mounts the nvidia driver files from the host? Still curious if you can confirm this approach works well.

I’m still using the older one but I will take a look this weekend; been meaning to update.

@Talung

Talung commented Feb 28, 2023

Quick question about this approach. Is it possible to run Docker in the jail and still run the iX apps as normal, or, like the last approach, are the two mutually exclusive?

I'm looking to migrate to Jailmaker, but I'd like to keep the option of trying the Kubernetes apps where possible.

@Jip-Hop
Author

Jip-Hop commented Feb 28, 2023

Jailmaker should not interfere with the IX kubernetes Apps :)

@Talung

Talung commented Feb 28, 2023

Jailmaker should not interfere with the IX kubernetes Apps :)

Excellent news. Best of both worlds. Hopefully they will leave Jailmaker alone :) I will try to transition to this soon to add to your testing data. Thanks :)

@qudiqudi

@Talung you can even use both in conjunction. I have every service inside my jails reverse-proxied through the TrueCharts Traefik app via the "external service" app. I do plan to ditch k3s altogether, but it works very well.
