@yorickdowne
Last active May 5, 2024 10:47
Ubuntu Desktop 20.04 with mirrored ZFS boot drive

Overview

Ubuntu Desktop 20.04 supports a single ZFS boot drive out of the box. I wanted a ZFS mirror without going through the entirely manual Ubuntu setup that OpenZFS describes in its instructions for Ubuntu 20.04 and instructions for Ubuntu 22.04.

This adds a mirror to an existing Ubuntu ZFS boot drive after the fact. It's been tested on Ubuntu 20.04 by me and all the way up to Ubuntu 22.10 by users in comments.

ZFS requires native encryption to be set up at pool or dataset creation. Ubuntu 22.04 supports this during installation; whether these instructions are suitable for mirroring such a setup has not been tested. For Ubuntu 20.04, these instructions are not suitable for creating an encrypted ZFS boot disk; please use the full instructions linked above for that. You can, however, add an encrypted dataset after the fact, so you could encrypt just the portion of your file system that holds secrets.
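A minimal sketch of adding an encrypted dataset after the fact. The dataset name rpool/secrets is a hypothetical example, and this assumes the pool has the encryption feature enabled. The helper only prints the command, so it can be reviewed before running it with sudo.

```shell
#!/bin/sh
# Print (not run) the command that creates a passphrase-encrypted dataset.
# "rpool/secrets" below is a hypothetical dataset name, not part of the guide.
encrypted_dataset_cmd() {
    dataset="$1"
    echo "zfs create -o encryption=on -o keyformat=passphrase -o keylocation=prompt $dataset"
}

encrypted_dataset_cmd rpool/secrets
# Review the printed line, then run it with sudo; ZFS prompts for the passphrase.
```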

Alternatives

If your use case is storage with some containers and VMs, and not a full-fledged Ubuntu install, then take a look at TrueNAS SCALE, which will manage the ZFS parts for you.

The ZFS Boot Menu project aims to provide a cleaner, FreeBSD-ish boot experience complete with boot environments and full support for native ZFS encryption. Instructions for Ubuntu 22.04 exist.

You could also boot from a regular ext4 disk, whether single or mirrored, and then use a ZFS mirror pool for just /home and /var.

Why

ZFS has a few advantages that are good to have:

  • It uses checksums, which means that hardware failure and disk corruption will be detected and flagged during regular "scrub" operations
  • It supports mirrors, which means that even with a failed drive, data is not lost
  • It is a Copy-on-Write file system, which means that snapshots are fast to create and fast to roll back to (seconds), and only take as much space as what was written after their creation. They can be created on a per-dataset basis.
  • It has the concept of datasets, making it easy to take snapshots of specific portions of the file system, as desired. Automated ZFS snapshots with a rotation lifetime make a lot of sense.
  • It can expand the size of a vdev by replacing first one, then the other drive with a larger one.
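As a rough illustration of the scrub and snapshot features listed above, the sketch below prints the commands for a scrub, a snapshot, and a rollback. The dataset rpool/data and the "manual-" snapshot prefix are assumptions made for this example; nothing is executed against ZFS.

```shell
#!/bin/sh
# Build a timestamped snapshot name for a dataset.
# The "manual-" prefix is an arbitrary choice for this sketch.
snapshot_name() {
    dataset="$1"
    echo "${dataset}@manual-$(date +%Y%m%d-%H%M%S)"
}

# Print the commands for a scrub, a snapshot, and a rollback to that snapshot.
# Nothing is executed; review the lines and run them with sudo as needed.
snap="$(snapshot_name rpool/data)"   # rpool/data is a hypothetical dataset
echo "zpool scrub rpool"
echo "zfs snapshot $snap"
echo "zfs rollback $snap"
```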

ZFS functions unlike traditional file systems such as ext4. Ars Technica has a good introduction to ZFS.

How

Assumptions and requirements

  • All drives will be formatted. These instructions are not suitable for dual-boot
  • Do not put the ZFS pools on hardware or software RAID; that would keep ZFS from detecting and correcting disk errors. In UEFI settings, set controller mode to AHCI, not RAID
  • These instructions are specific to UEFI systems and GPT. If you have an older BIOS/MBR system, please use the full instructions linked above

Initial Ubuntu installation

  • Install from an Ubuntu Desktop 20.04 or later install USB. Ubuntu Server does not offer ZFS boot disk
  • For the "Erase disk and install Ubuntu" option, click "Advanced Features" and choose "Experimental ZFS"
  • Continue install as normal and boot into Ubuntu

Add second drive

Note: @benitogf created a handy script that automates these steps. Use his gist for feedback on that script.

All work will be done from the CLI. Open a Terminal. In the following, use copy & paste extensively; it'll help avoid typos. Right-click in Terminal pastes.

  • Update Ubuntu: sudo apt update && sudo apt dist-upgrade
  • Find the names of your two disks: ls -l /dev/disk/by-id. The first disk will have four partitions, the second none.
  • Let's set variables for those disk paths so we can refer to them in the following
DISK1=/dev/disk/by-id/scsi-disk1
DISK2=/dev/disk/by-id/scsi-disk2
  • Install tools: sudo apt install -y gdisk mdadm grub-efi-amd64
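Since every later step depends on DISK1 and DISK2, a quick sanity check of the two variables can catch copy & paste slips early. This is a sketch only: check_disk_vars is a hypothetical helper that inspects the strings and never touches the devices.

```shell
#!/bin/sh
# Return 0 if the two paths look like distinct whole-disk by-id paths.
# This only checks the strings; it does not touch the devices.
check_disk_vars() {
    d1="$1"; d2="$2"
    for d in "$d1" "$d2"; do
        case "$d" in
            /dev/disk/by-id/*-part[0-9]*) echo "partition, not a disk: $d" >&2; return 1 ;;
            /dev/disk/by-id/?*) ;;
            *) echo "not a by-id path: $d" >&2; return 1 ;;
        esac
    done
    if [ "$d1" = "$d2" ]; then
        echo "DISK1 and DISK2 are the same" >&2
        return 1
    fi
    return 0
}

# Placeholder names from the guide; substitute your real by-id paths.
DISK1=/dev/disk/by-id/scsi-disk1
DISK2=/dev/disk/by-id/scsi-disk2
check_disk_vars "$DISK1" "$DISK2" && echo "disk variables look sane"
```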

Create partitions on second drive

  • List partitions: sudo sgdisk -p $DISK1, you expect to see four of them
  • Change swap partition type: sudo sgdisk -t2:FD00 $DISK1
  • Copy partition table from disk 1 to disk 2: sudo sgdisk -R$DISK2 $DISK1
  • Change GUID of second disk: sudo sgdisk -G $DISK2

Ubuntu 21.04 required a reboot at this point in my testing, so that /dev/disk/by-partuuid was correct. If you need to do that, recreate the DISK1 and DISK2 variables after the reboot.
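To confirm that the copy produced the same number of partitions on both disks, the sgdisk -p output can be counted rather than eyeballed. A small sketch, assuming the usual sgdisk table layout with a "Number" header row; count_partitions is a hypothetical helper name.

```shell
#!/bin/sh
# Count partition rows in `sgdisk -p` output read from stdin.
# Rows follow the "Number  Start (sector) ..." header and begin with a number.
count_partitions() {
    awk '
        $1 == "Number" { in_table = 1; next }
        in_table && $1 ~ /^[0-9]+$/ { n++ }
        END { print n + 0 }
    '
}

# Intended use on the real system (needs root):
#   [ "$(sudo sgdisk -p "$DISK1" | count_partitions)" = "$(sudo sgdisk -p "$DISK2" | count_partitions)" ]
```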

Mirror boot pool

  • Confirm that disk 1 partition 3 is the device in the bpool by comparing "Partition unique GUID" to the device id shown in zpool status: sudo sgdisk -i3 $DISK1 and zpool status bpool
  • Get GUID of partition 3 on disk 2: sudo sgdisk -i3 $DISK2
  • Add that partition to the pool: sudo zpool attach bpool EXISTING-UID /dev/disk/by-partuuid/DISK2-PART3-GUID, for example sudo zpool attach bpool ac78ee0c-2d8d-3641-97dc-eb8b50abd492 /dev/disk/by-partuuid/8e1830b3-4e59-459c-9c02-a09c80428052
  • Verify with zpool status bpool. You expect to see mirror-0 now, which has been resilvered
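Copying the GUID by hand is where typos tend to creep in. The sketch below turns sgdisk -i output into the by-partuuid path directly; partuuid_path is a hypothetical helper, and it assumes sgdisk prints the GUID in upper case while the by-partuuid symlinks are lower case.

```shell
#!/bin/sh
# Read `sgdisk -i<N> <disk>` output on stdin and print the matching
# /dev/disk/by-partuuid path. sgdisk prints the GUID in upper case,
# while the by-partuuid symlinks are lower case.
partuuid_path() {
    awk -F': ' '/^Partition unique GUID/ { print "/dev/disk/by-partuuid/" tolower($2) }'
}

# Intended use on the real system (needs root):
#   NEW=$(sudo sgdisk -i3 "$DISK2" | partuuid_path)
#   sudo zpool attach bpool EXISTING-UID "$NEW"
```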

Mirror root pool

  • Confirm that disk 1 partition 4 is the device in the rpool by comparing "Partition unique GUID" to the device id shown in zpool status: sudo sgdisk -i4 $DISK1 and zpool status rpool
  • Get GUID of partition 4 on disk 2: sudo sgdisk -i4 $DISK2
  • Add that partition to the pool: sudo zpool attach rpool EXISTING-UID /dev/disk/by-partuuid/DISK2-PART4-GUID, for example sudo zpool attach rpool d9844f27-a1f8-3049-9831-77b51318d9a7 /dev/disk/by-partuuid/DISK2-PART4-GUID. The second GUID is the one sgdisk reported for disk 2 and must differ from the existing device id
  • Verify with zpool status rpool. You expect to see mirror-0 now, which either is resilvering or has been resilvered
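The root pool can take a while to resilver. A simple way to wait for it, sketched under the assumption that zpool status reports "resilver in progress" on its scan line while the copy is running; resilver_running is a hypothetical helper name.

```shell
#!/bin/sh
# Return 0 if the `zpool status` text on stdin reports a running resilver.
resilver_running() {
    grep -q 'resilver in progress'
}

# Intended use on the real system: poll until the resilver has finished.
#   while zpool status rpool | resilver_running; do sleep 30; done
#   zpool status rpool
```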

Mirror swap

  • Remove existing swap: sudo swapoff -a
  • Remove the swap mount line in /etc/fstab: sudo nano /etc/fstab, find the swap line at the end of the file and delete it, then save with Ctrl-x
  • Create software mirror drive for swap: sudo mdadm --create /dev/md0 --metadata=1.2 --level=mirror --raid-devices=2 ${DISK1}-part2 ${DISK2}-part2
  • Configure it for swap: sudo mkswap -f /dev/md0
  • Place it into fstab: sudo sh -c "echo UUID=$(sudo blkid -s UUID -o value /dev/md0) none swap discard 0 0 >> /etc/fstab"
  • Verify that line is in fstab: cat /etc/fstab
  • Use the new swap: sudo swapon -a
  • And verify: swapon -s
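Because the fstab edits here are manual, it is easy to end up with zero or two swap lines. The following sketch counts the active swap entries; count_swap_entries is a hypothetical helper and only reads text, so it is safe to try.

```shell
#!/bin/sh
# Count active (non-comment) swap entries in fstab-formatted text on stdin.
# After the steps above there should be exactly one, pointing at the md device.
count_swap_entries() {
    awk '$1 !~ /^#/ && $3 == "swap" { n++ } END { print n + 0 }'
}

# Intended use on the real system:
#   [ "$(count_swap_entries < /etc/fstab)" = 1 ] || echo "check /etc/fstab: expected one swap line"
```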

Move GRUB boot menu to ZFS

  • Verify grub can "see" the ZFS boot pool: sudo grub-probe /boot
  • Create EFI file system on second disk: sudo mkdosfs -F 32 -s 1 -n EFI ${DISK2}-part1
  • Remove /boot/grub from fstab: sudo nano /etc/fstab, find the line for /boot/grub and remove it. Leave the line for /boot/efi in place. Save with Ctrl-x.
  • Unmount /boot/grub: sudo umount /boot/grub
  • Verify with df -h, /boot should be mounted on bpool/BOOT/ubuntu_UID, /boot/efi on /dev/sda1 or similar depending on device name of your first disk, and no /boot/grub
  • Remove /boot/grub: sudo rm -rf /boot/grub
  • And create a ZFS dataset for it: sudo zfs create -o com.ubuntu.zsys:bootfs=no bpool/grub
  • Refresh initrd files: sudo update-initramfs -c -k all
  • Disable memory zeroing to address a performance regression of ZFS on Linux: sudo nano /etc/default/grub and add init_on_alloc=0 to GRUB_CMDLINE_LINUX_DEFAULT, it'll likely look like this: GRUB_CMDLINE_LINUX_DEFAULT="quiet splash init_on_alloc=0". Save with Ctrl-x
  • Update the boot config: sudo update-grub and ignore any errors you may see from osprober
  • Make sure systemd services are up to date: sudo systemctl daemon-reload. As per @jonkiszp in comments, this is required for Ubuntu 22.10 to still boot after these changes.
  • Install GRUB to the ESP: sudo grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=ubuntu --recheck --no-floppy
  • Disable grub-initrd-fallback.service: sudo systemctl mask grub-initrd-fallback.service. This is the service for /boot/grub/grubenv which does not work on mirrored or raidz topologies. Disabling this keeps it from blocking subsequent mounts of /boot/grub if that mount ever fails.
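If you would rather not edit /etc/default/grub by hand, the kernel parameter can be appended with a small helper. This is a sketch, not part of the original procedure: add_kernel_param is a hypothetical function, it assumes the parameter contains no "/" characters, and it is best tried on a copy of the file first.

```shell
#!/bin/sh
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT in the given file,
# unless that line already contains it. Works on any copy of the file.
# Assumes the parameter contains no "/" (it is used inside a sed expression).
add_kernel_param() {
    file="$1"; param="$2"
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*$param" "$file"; then
        return 0   # already present, nothing to do
    fi
    tmp="$(mktemp)" || return 1
    sed "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 $param\"/" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Try it on a copy first:
#   cp /etc/default/grub /tmp/grub.test && add_kernel_param /tmp/grub.test init_on_alloc=0
```

Running the helper twice leaves the file unchanged the second time, so it is safe to repeat.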

Reboot and install GRUB to second disk

  • Cross fingers and reboot! sudo reboot
  • Once back up, open a Terminal again and install GRUB to second disk: sudo dpkg-reconfigure grub-efi-amd64 , keep defaults and when it comes to system partitions, use space bar to select first partition on both drives, e.g. /dev/sda1 and /dev/sdb1
  • And for good measure signed efi: sudo dpkg-reconfigure grub-efi-amd64-signed , this should finish without prompting you
  • If you like, you can remove the primary drive and reboot. You expect reboot to take a little longer, and to be successful. zpool status should show degraded pools without error

Replacing a failed drive

If a mirrored drive fails, you can replace it by following a similar method as adding a second drive in the first place.

First, find the id of the replacement drive with ls -l /dev/disk/by-id and create a variable for it:

NEWDISK=/dev/disk/by-id/NEWDRIVEID

The new drive may already contain ZFS or mdadm signatures. Check using sudo wipefs $NEWDISK. If that output is not empty, run sudo wipefs -a $NEWDISK.

Create partition table on replacement disk

  • Copy partition table from existing disk to replacement disk: sudo sgdisk -R$NEWDISK /dev/disk/by-id/ID-OF-EXISTING-DRIVE
  • Change GUID of replacement disk: sudo sgdisk -G $NEWDISK

Repair boot pool

  • Get the ID of the "UNAVAIL" disk on bpool with zpool status bpool
  • Get GUID of partition 3 on the replacement disk: sudo sgdisk -i3 $NEWDISK
  • Replace the failed member with that partition: sudo zpool replace bpool EXISTING-UID /dev/disk/by-partuuid/NEWDISK-PART3-GUID, for example sudo zpool replace bpool 6681469899058372901 /dev/disk/by-partuuid/06f5ef6d-cb69-45e8-ad3b-c69cad5c216a
  • Verify with zpool status bpool. You expect to see state "ONLINE" for the pool and both devices in mirror-0.
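The id of the failed member can be pulled out of zpool status instead of copied by hand. A minimal sketch, assuming the usual column layout where the device name is the first field and the state is the second; unavail_devices is a hypothetical helper name.

```shell
#!/bin/sh
# Print the name of every device that `zpool status` (stdin) lists as UNAVAIL.
# For a failed mirror member this is the id to pass to `zpool replace`.
unavail_devices() {
    awk '$2 == "UNAVAIL" { print $1 }'
}

# Intended use on the real system:
#   OLD=$(zpool status bpool | unavail_devices)
#   sudo zpool replace bpool "$OLD" /dev/disk/by-partuuid/NEWDISK-PART3-GUID
```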

Repair root pool

  • Get the ID of the "UNAVAIL" disk on rpool with zpool status rpool
  • Get GUID of partition 4 on the replacement disk: sudo sgdisk -i4 $NEWDISK
  • Replace the failed member with that partition: sudo zpool replace rpool EXISTING-UID /dev/disk/by-partuuid/NEWDISK-PART4-GUID, for example sudo zpool replace rpool 8712274632631823759 /dev/disk/by-partuuid/8c4ec74f-cd4d-4048-bfca-b4a58756563d
  • Verify with zpool status rpool. You expect to see state "ONLINE" for the pool and both devices in mirror-0, or state "DEGRADED" for the pool with "resilver in progress" and a "replacing-0" entry under "mirror-0"

Repair swap

  • Verify that the failed disk shows as "removed": sudo mdadm -D /dev/md0
  • Add partition 2 of the replacement disk: sudo mdadm /dev/md0 --add ${NEWDISK}-part2
  • And verify that you can see "spare rebuilding" or "active sync": sudo mdadm -D /dev/md0
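As an alternative to reading the mdadm -D output, /proc/mdstat gives a compact view of the array. This sketch assumes the usual mdstat status field, where [UU] means both members are present and an underscore marks a missing or rebuilding one; md_degraded is a hypothetical helper name.

```shell
#!/bin/sh
# Return 0 if mdstat-formatted text on stdin shows an array with a missing
# member, i.e. a status field such as [U_] or [_U] instead of [UU].
md_degraded() {
    grep -Eq '\[[U_]*_[U_]*\]'
}

# Intended use on the real system:
#   if md_degraded < /proc/mdstat; then echo "md array is degraded or rebuilding"; fi
```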

Repair EFI

  • Create EFI file system on replacement disk: sudo mkdosfs -F 32 -s 1 -n EFI ${NEWDISK}-part1
  • Install GRUB to replacement disk: sudo dpkg-reconfigure grub-efi-amd64 , keep defaults and when it comes to system partitions, use space bar to select first partition on both drives, e.g. /dev/sda1 and /dev/sdb1
  • And for good measure signed efi: sudo dpkg-reconfigure grub-efi-amd64-signed , this should finish without prompting you

Test

If you like, test by rebooting: sudo reboot, and confirm that pools are healthy after reboot with zpool status

Increasing drive space

This is similar to replacing a failed drive, except that partition 4, the rpool partition, will be bigger. Replace the first drive, wait for the resilver to finish, then replace the second drive. Once both drives have been replaced, rpool has the new capacity.

First, find the id of the replacement drive with ls -l /dev/disk/by-id and create a variable for it:

NEWDISK=/dev/disk/by-id/NEWDRIVEID

The new drive may already contain ZFS or mdadm signatures. Check using sudo wipefs $NEWDISK. If that output is not empty, run sudo wipefs -a $NEWDISK.

Create partition table on replacement disk

  • Copy partition table from existing disk to replacement disk: sudo sgdisk -R$NEWDISK /dev/disk/by-id/ID-OF-EXISTING-DRIVE
  • Change GUID of replacement disk: sudo sgdisk -G $NEWDISK
  • Remove partition 4: sudo sgdisk -d4 $NEWDISK
  • Recreate partition 4 with maximum size: sudo sgdisk -n4:0:0 -t4:BF00 $NEWDISK
  • Tell the kernel to use the new partition table: sudo partprobe
  • Tell ZFS to use expanded space automatically: sudo zpool set autoexpand=on rpool

Repair boot pool, root pool, swap and EFI

Follow the instructions under "Replacing a failed drive", starting from "Repair boot pool". Wait for resilver to complete afterwards. Then, run through these instructions again, replacing the second drive. Once resilver is done a second time, you will have the new capacity on the rpool.
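If the extra capacity does not show up after the second resilver, the pool's expandsize property indicates whether unclaimed space remains, and zpool online -e can claim it manually. This is a sketch under that assumption; has_unclaimed_space is a hypothetical helper, and the device path is a placeholder in the style used above.

```shell
#!/bin/sh
# Given the value of the pool's "expandsize" property on stdin, return 0 if
# there is still unclaimed space. A value of "-" means there is none.
has_unclaimed_space() {
    read -r value || return 1
    [ -n "$value" ] && [ "$value" != "-" ]
}

# Intended use on the real system:
#   if zpool get -H -o value expandsize rpool | has_unclaimed_space; then
#       sudo zpool online -e rpool /dev/disk/by-partuuid/NEWDISK-PART4-GUID
#   fi
```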

YouTube

I did a walkthrough of these instructions.

@SamMousa

SamMousa commented Sep 14, 2021

I can't confirm this at the moment, but as far as I know, fully removing the disk with swap isn't even an issue.

One way to test would be to add a swap entry to fstab that points to a non-existing node. For example by doing this:

Place it into fstab: sudo sh -c "echo UUID=FAKE-UID-HERE none swap discard 0 0 >> /etc/fstab"

Note that it might need to look like a real one, no idea how fstab would handle it having a different format. Alternatively for testing use the path syntax and the path /dev/non-existent-disk or something like that.

@madu41

madu41 commented Sep 14, 2021

I do not have a computer that supports hot-swap of disks, else pulling the disk with swap on could be a test. Comparing with and without swap on mdraid.

@salbright2192

Doesn't zfs have a hot spare option? Is it possible to have a third disk in the system set to auto-install should a disk fail? If so, how to set that up?

@madu41

madu41 commented Nov 1, 2021

When you have a mirror I am not sure if "hot spare" is applicable, but you can add a third disk, or more, to the mirror, which will increase the redundancy. All three disks will then have the same content, and two disks can fail without losing data. You can follow the applicable parts of the instructions for adding the second disk.

@karaiwulf

@salbright2192 yeah, zfs has the hot spare option for any type of redundancy (raidz, raidz2, mirror). They are only good when you have a stripe across multiple vdevs though (multiple mirror/raidz sets, that is). The hotspare will automatically start resilvering if zfs detects a fault on one of the disks. Nifty feature. Should be able to set one up using zpool add rpool spare <device>. You'll need some extra software to auto-setup anything that isn't directly controlled by zfs, though. I'd recommend adding the disk directly to the mirror if you don't have any plans to build a mirror stripe or any other fancy zfs configurations.

@SamMousa you should be correct in assuming that swap disks shouldn't affect boot. An easy way to test is to create a swap file, for example dd if=/dev/zero bs=1024M count=1 of=/swapfile, then add it to the swap and do a swapon for it. Once it's set up, you can test by rm /swapfile and reboot as applicable. Mirroring the swap is still a good idea though. On a running system, swap failure may cause system instability. Swap is used for a lot of memory management stuff (like orphaned pages, not-frequently accessed pages, etc). Applications and the kernel itself might be depending on what's in swap.

@enoch85

enoch85 commented Mar 23, 2022

Really great, thanks a ton!

@yorickdowne
Author

Glad it’s helpful! I am looking forward to seeing what 22.04 adds in terms of encryption “out of the box”

@enoch85

enoch85 commented Apr 26, 2022

OK, so Ubuntu 22.04 is out. Did anyone test if the same procedure works (as above)?

@rmb7984

rmb7984 commented May 3, 2022

"OK, so Ubuntu 22.04 is out. Did anyone test if the same procedure works (as above)?"

I was brave (or stupid) and tried this procedure with Ubuntu 22.04. It does seem to work, with one exception.

I was not successful getting my system to boot from Disk2 after install. Fortunately the steps mentioned by madu41 on May 23, 2021 seem to have fixed the issue. I'd be lying if I said I understood this!

But, this stuff is fun and I'm going to try the rEFInd/zfsbootmenu approach next.

Thanks for the walkthrough!

@madu41

madu41 commented May 3, 2022

I installed Ubuntu 21.04 with ZFS and added a second disk with the steps described above, and have not seen any issues. I upgraded to 21.10 after the issue with ZFS in the kernel was fixed, and I have not seen any issues. The next step will be to upgrade to 22.04, but I will probably wait a little while first.

@enoch85

enoch85 commented May 23, 2022

Just tested on Ubuntu 22.04, and it worked! Followed the advice in the guide to reboot to get the new partuuid (for 21.04), but everything was copy&paste!

Yay!

@yorickdowne
Author

Interesting re https://gist.github.com/yorickdowne/a2a330873b16ebf288d74e87d35bff3e?permalink_comment_id=3754037#gistcomment-3754037 , I'll need to (re)test this on 22.04 to see how it behaves with EFI on that release.

@rmb7984

rmb7984 commented May 23, 2022

That would be fantastic. I still haven't been able to determine if I did something wrong or not. But it's starting to look like I need to try again from the beginning!

@enoch85

enoch85 commented May 23, 2022

I never tested to disable either one of the two disks, but maybe I should try that...

@madu41

madu41 commented May 23, 2022

"Just tested on Ubuntu 22.04, and it worked! Followed the advice in the guide to reboot to get the new partuuid (for 21.04), but everything was copy&paste!"

@enoch85, did you upgrade to 22.04 or install it from scratch?

@enoch85

enoch85 commented May 23, 2022

@madu41 I installed from scratch.

@enoch85

enoch85 commented May 24, 2022

OK, so I tried again, and this time disabled the boot option for the first disk in BIOS, and it booted as normal. It seems promising!

Would be nice if some one else could confirm the same thing.

@madu41

madu41 commented Jul 27, 2022

I have now upgraded to Ubuntu 22.04 with success. It was originally installed with 21.04 on ZFS following the instructions on this page, then upgraded to 21.10 and now 22.04. Since it is LTS it will probably stay there for some years.

@jonkiszp

jonkiszp commented Feb 3, 2023

I have a problem with Ubuntu 22.10 in the GRUB section. I ran all the commands from your tutorial and my system does not boot; it stops in the GRUB shell.

The solution, after the step

Update the boot config: sudo update-grub and ignore any errors you may see from osprober

is to run sudo systemctl daemon-reload and then work through the rest of the instructions.

@yorickdowne
Author

No idea tbh - and I've lost the system I was testing this on. If you figure that out, please share!

@pasha-19

pasha-19 commented Feb 7, 2023

Has anyone considered your alternative option? It has my interest, specifically the use of OpenZFS, which I hear may be more up to date than the Ubuntu version. I am a retired IT professional, not by any means a Linux expert. Is it worth attempting for the reason I believe to be valid? Any estimate on the potential for success? As I see it, if this succeeds then instead of playing with TrueNAS SCALE I may well be tied to a Linux server. I managed to load OpenZFS onto a Raspberry Pi 3B in terms of installation; I did not actually run it for long though.

@yorickdowne
Author

yorickdowne commented Feb 7, 2023

@pasha-19 I haven't tried it, and, the ZFS Boot Menu team has a good rep. OpenZFS on Ubuntu works like a charm with the Jonathon F repo at https://launchpad.net/~jonathonf/+archive/ubuntu/zfs

@crumdev

crumdev commented Apr 28, 2023

Has anyone done this to use Raidz instead of Mirror? I have an old QNAP TS-453A nas with 4x 3TB drives. I followed the instructions up to the point where I was adding the additional partitions to the boot pool but saw that it was creating a mirror. I'm using Ubuntu 22.04. Other than that though the instructions have worked great.

@yorickdowne
Author

raidz cannot be created after the fact. For that, use the “from scratch” openzfs instructions that are linked up top.

@rrsis

rrsis commented May 8, 2023

"I'm unclear on the rationale for using mdraid for swap."
I agree with you. I think it is not a priority to RAID the swap, but doing so doesn't hurt either.
I read that ZFS has its own swap system called "ZFS Swap space", which one can use instead of the traditional one.
I haven't tried it yet.

Advantages of using "ZFS swap space":

  • ZFS integration, with ZFS file system, better control and management of swap space.
  • Snapshots and clones, you can take advantage of ZFS snapshot and clone capabilities.

I haven't tried snapshots yet, and I don't know whether "ZFS swap space" is required in order to use snapshots.

@rrsis

rrsis commented May 9, 2023

Hi guys, here on 22.04.02 LTS.
If any of you have trouble booting from sdb, I'll leave another workaround here, which I think will also work on 22.10.
I did try the recommendation by madu41 using /boot/efi for sda and /boot/efi1 for sdb, but I couldn't make it work either.

I solved the problem in the fstab configuration file by using /dev/sda1 for /boot/efi instead of the UUID, and it works smoothly.
That is just for the boot line. Everywhere else I have used drive ids, just as the author recommends.

When you pull out sda to simulate a failure, the system names sdb as sda (that is how Linux names disks sequentially: the first one recognized is always sda, following the motherboard SATA port order), mounts /boot/efi with no problem, and starts up smoothly. Then I made some changes, rebooted, reattached the original sda disk, and again it booted smoothly. ZFS resilvered everything. It works awesome! Thanks for this document.

@benitogf

benitogf commented Aug 4, 2023

Hey, I had to do several machines, so I put the tutorial in a script. Leaving it here, thanks @yorickdowne https://gist.github.com/benitogf/aa10ada1071c827aa5e012dba168ada7

@yorickdowne
Author

Amazing! Linked it

@forcedawg

These exact instructions are not working on Ubuntu 24.04. Ubuntu has changed the naming of ZFS partitions, partition 2 and 3 are switched around, and the boot/efi folder is now different.

@yorickdowne
Author

Good to know, thank you. I’ll stick a note at the top.
