
@cyrinux
Forked from p7cq/Arch_Linux_Root_On_ZFS.md
Created November 26, 2020 20:37
Install Arch Linux with Root on ZFS

Arch Linux Root on ZFS

Installation steps for running Arch Linux with root on ZFS using UEFI and systemd-boot. All steps are run as root.

In live environment

  • Set a bigger font if needed:
setfont latarcyrheb-sun32
  • To connect to the internet add the ESSID and passphrase:
 wpa_passphrase ESSID PASSPHRASE > /etc/wpa_supplicant/wpa_supplicant.conf
  • Start wpa_supplicant and get an IP address:
wpa_supplicant -B -c /etc/wpa_supplicant/wpa_supplicant.conf -i <wifi interface>
dhclient <wifi interface>
  • Wipe disks, create boot, swap and ZFS partitions:
sgdisk --zap-all /dev/disk/by-id/<disk0>
sgdisk -n1:0:+512M -t1:ef00 /dev/disk/by-id/<disk0>
sgdisk -n2:1052672:+8G -t2:8200 /dev/disk/by-id/<disk0>
sgdisk -n3:17829888:0 -t3:bf00 /dev/disk/by-id/<disk0>

sgdisk --zap-all /dev/disk/by-id/<disk1>
sgdisk -n1:0:+512M -t1:ef00 /dev/disk/by-id/<disk1>
sgdisk -n2:1052672:+8G -t2:8200 /dev/disk/by-id/<disk1>
sgdisk -n3:17829888:0 -t3:bf00 /dev/disk/by-id/<disk1>
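
The explicit start sectors passed to -n2 and -n3 above only line up on disks with 512-byte logical sectors. The following is a minimal sketch of where those numbers come from. It assumes (this is not stated in the guide) that partition 1 starts at sector 2048 and that 2048 spare sectors follow the boot partition; verify the real layout with sgdisk -p before relying on it.

```shell
# Assumptions (not from the guide): 512-byte logical sectors, partition 1
# starting at sector 2048, and a 2048-sector gap after the boot partition.
sector_size=512
first_start=2048
gap=2048

# Convert a size in MiB to a sector count.
mib_to_sectors() {
    echo $(( $1 * 1024 * 1024 / sector_size ))
}

boot_sectors=$(mib_to_sectors 512)     # matches +512M
swap_sectors=$(mib_to_sectors 8192)    # matches +8G

swap_start=$(( first_start + boot_sectors + gap ))
zfs_start=$(( swap_start + swap_sectors ))

echo "swap partition starts at sector ${swap_start}"
echo "ZFS partition starts at sector ${zfs_start}"
```

On disks with a different logical sector size these values no longer match; passing 0 as the start sector instead lets sgdisk choose the default start itself.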

  • Format boot and swap partitions
mkfs.vfat /dev/disk/by-id/<disk0>-part1
mkfs.vfat /dev/disk/by-id/<disk1>-part1

mkswap /dev/disk/by-id/<disk0>-part2
mkswap /dev/disk/by-id/<disk1>-part2
swapon /dev/disk/by-id/<disk0>-part2 /dev/disk/by-id/<disk1>-part2
  • On choosing ashift

You should specify ashift explicitly when the autodetected value would be too low for what you actually need, either today (the disk misreports its sector size) or in the future (replacement disks will be Advanced Format). That seems like sound advice to me. If in doubt, settle the question before going any further, because ashift cannot be changed once a vdev has been created. I will go with the defaults here, which trigger autodetection.

Disk block size check:

cat /sys/class/block/<disk>/queue/{phys,log}ical_block_size
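
ashift is the base-2 logarithm of the sector size ZFS will use, so the block sizes printed above translate directly into a candidate value. A small sketch of that conversion (the function name ashift_for is made up for this example):

```shell
# Print the ashift (log2 of the sector size in bytes) for a given block size.
# ashift_for is a hypothetical helper, not part of ZFS.
ashift_for() {
    size=$1
    shift_bits=0
    while [ "$size" -gt 1 ]; do
        size=$(( size / 2 ))
        shift_bits=$(( shift_bits + 1 ))
    done
    echo "$shift_bits"
}

ashift_for 512    # prints 9
ashift_for 4096   # prints 12
```

If the physical block size reported by the disk looks wrong, the value can be forced at pool creation time with -o ashift=12 on the zpool create command line.
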
  • Create the pool (here, a RAID 0 equivalent: a stripe with no redundancy):
zpool create \
    -O atime=off \
    -O acltype=posixacl -O canmount=off -O compression=lz4 \
    -O dnodesize=legacy -O normalization=formD \
    -O xattr=sa -O devices=off -O mountpoint=none -R /mnt  rpool /dev/disk/by-id/<disk0>-part3 /dev/disk/by-id/<disk1>-part3

If the pool is larger than 10 disks you should identify them by-path or by-vdev (see here for more details).

Check ashift with:

zdb -C | grep ashift
  • Create datasets:
zfs create -o canmount=off -o mountpoint=none rpool/ROOT
zfs create -o mountpoint=/ -o canmount=noauto rpool/ROOT/default
zfs create -o mountpoint=none rpool/DATA
zfs create -o mountpoint=/home rpool/DATA/home
zfs create -o mountpoint=/root rpool/DATA/home/root
zfs create -o mountpoint=/local rpool/DATA/local
zfs create -o mountpoint=none rpool/DATA/var
zfs create -o mountpoint=/var/log rpool/DATA/var/log # after a rollback, systemd-journal blocks at reboot without this dataset

zpool set bootfs=rpool/ROOT/default rpool
  • Create swap (not needed if you have dedicated partitions, like above)
zfs create -V 16G -b $(getconf PAGESIZE) -o compression=zle -o logbias=throughput -o sync=always -o primarycache=metadata -o secondarycache=none -o com.sun:auto-snapshot=false rpool/swap
mkswap /dev/zvol/rpool/swap
  • Unmount all
zfs umount -a
rm -rf /mnt/*
  • Export pool:
zpool export rpool
  • Re-import it:
zpool import -d /dev/disk/by-id -R /mnt rpool -N
  • Mount root, then the other datasets:
zfs mount rpool/ROOT/default
zfs mount -a
  • Mount boot partition:
mkdir /mnt/boot
mount /dev/disk/by-id/<disk0>-part1 /mnt/boot
  • Generate fstab:
mkdir /mnt/etc
genfstab -U /mnt >> /mnt/etc/fstab
  • Add swap (not needed if you created swap partitions, like above):
echo "/dev/zvol/rpool/swap    none       swap  discard                    0  0" >> /mnt/etc/fstab
  • Install the base system:
pacstrap /mnt base base-devel linux linux-firmware vim

If it fails, add the GPG keys (see below).

  • Change root into the new system:
arch-chroot /mnt

In chroot

  • Remove all lines in /etc/fstab, leaving only the entries for swap and boot.
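
The cleanup can also be scripted. This is only a sketch: filter_fstab is a made-up helper that keeps comments, blank lines, the /boot mount and swap entries, and drops everything else (the ZFS datasets are mounted by ZFS itself). The sample entries and UUIDs are placeholders.

```shell
# Keep comments, blank lines, the /boot mount and swap entries; drop the rest.
# Reads fstab text on stdin and writes the filtered text to stdout.
filter_fstab() {
    awk '/^[[:space:]]*#/ || NF == 0 || $2 == "/boot" || $3 == "swap"'
}

# Example with made-up entries:
filter_fstab <<'EOF'
rpool/ROOT/default  /      zfs   rw,noatime,xattr,posixacl  0 0
rpool/DATA/home     /home  zfs   rw,noatime,xattr,posixacl  0 0
UUID=AAAA-BBBB      /boot  vfat  rw,relatime                0 2
UUID=placeholder    none   swap  defaults                   0 0
EOF
```

A cautious way to use it is filter_fstab < /etc/fstab > /etc/fstab.new, then review the result before moving it into place.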

  • Add ZFS repository in /etc/pacman.conf:

[archzfs]
Server = http://archzfs.com/$repo/x86_64
  • Add GPG keys:
curl -O https://archzfs.com/archzfs.gpg
pacman-key -a archzfs.gpg
pacman-key --lsign-key DDF7DB817396A49B2A2723F7403BD972F75D9D76
pacman -Syy
  • Configure time zone (change accordingly):
ln -sf /usr/share/zoneinfo/Region/City /etc/localtime
hwclock --systohc
  • Generate locale (change accordingly):
sed -i 's/#\(en_US\.UTF-8\)/\1/' /etc/locale.gen
locale-gen
echo "LANG=en_US.UTF-8" > /etc/locale.conf
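
For a locale other than en_US.UTF-8 the same substitution applies with a different name. A sketch that works on text from stdin rather than editing /etc/locale.gen in place, so it can be tried safely first (enable_locale is a hypothetical helper name):

```shell
# Uncomment one locale in locale.gen-style text read from stdin.
# $1 is the locale name, e.g. en_US.UTF-8 (dots are escaped before matching).
enable_locale() {
    pattern=$(printf '%s' "$1" | sed 's/\./\\./g')
    sed "s/^#\(${pattern} \)/\1/"
}

# Only the en_US.UTF-8 line loses its leading "#".
printf '%s\n' '#en_GB.UTF-8 UTF-8' '#en_US.UTF-8 UTF-8' '#en_US ISO-8859-1' |
    enable_locale en_US.UTF-8
```
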
  • Configure vconsole, hostname, hosts:
echo -e "KEYMAP=us\n#FONT=latarcyrheb-sun32" > /etc/vconsole.conf
echo al-zfs > /etc/hostname
echo -e "127.0.0.1 localhost\n::1 localhost" >> /etc/hosts
  • Set root password

  • Install ZFS, microcode etc:

pacman -Syu archzfs-linux amd-ucode networkmanager sudo openssh rsync borg

I chose the default options for the archzfs-linux group: zfs-linux, zfs-utils, and mkinitcpio for initramfs.

  • Generate host id:
zgenhostid $(hostid)
  • Create cache file:
zpool set cachefile=/etc/zfs/zpool.cache rpool
  • Configure initial ramdisk in /etc/mkinitcpio.conf:
HOOKS=(base udev autodetect modconf block keyboard zfs filesystems)

and regenerate it:

mkinitcpio -p linux
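
The HOOKS line above puts zfs before filesystems, which is the order the Arch wiki asks for. The sketch below shows one way to rewrite the line and to check that ordering; both helper names are made up, and the functions work on text passed in rather than on /etc/mkinitcpio.conf directly.

```shell
# Replace the active HOOKS=(...) line in mkinitcpio.conf-style text from stdin.
# Commented-out HOOKS lines are left alone.
set_hooks() {
    sed "s/^HOOKS=(.*)/HOOKS=($1)/"
}

# Succeed only if "zfs" appears before "filesystems" in the given hook list.
zfs_before_filesystems() {
    seen_zfs=no
    for hook in $1; do
        case "$hook" in
            zfs) seen_zfs=yes ;;
            filesystems) [ "$seen_zfs" = yes ]; return ;;
        esac
    done
    return 1
}

hooks='base udev autodetect modconf block keyboard zfs filesystems'
zfs_before_filesystems "$hooks" && echo "hook order looks fine"
printf '%s\n' '# HOOKS=(base udev)' 'HOOKS=(base udev autodetect filesystems)' |
    set_hooks "$hooks"
```
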
  • Enable ZFS services:
systemctl enable zfs.target
systemctl enable zfs-import-cache.service
systemctl enable zfs-mount.service
systemctl enable zfs-import.target
  • Install the bootloader:
bootctl --path=/boot install
  • Add an EFI boot manager update hook in /etc/pacman.d/hooks/100-systemd-boot.hook:
[Trigger]
Type = Package
Operation = Upgrade
Target = systemd

[Action]
Description = update systemd-boot
When = PostTransaction
Exec = /usr/bin/bootctl update
  • Replace content of /boot/loader/loader.conf with:
default arch
timeout 3
# bigger boot menu on a 4K display
#console-mode 1
  • Create a /boot/loader/entries/arch.conf containing:
title Arch Linux
linux /vmlinuz-linux
initrd /amd-ucode.img
initrd /initramfs-linux.img
options zfs=rpool/ROOT/default rw
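
The dataset named in the zfs= option has to be the root dataset created earlier, so a typo here leaves the system unbootable. A small sketch for pulling the value back out of an entry file so it can be compared by eye or in a script (entry_root_dataset is a hypothetical helper):

```shell
# Print the dataset named by the zfs= kernel option in a loader entry
# read from stdin; prints nothing if the option is absent.
entry_root_dataset() {
    awk '$1 == "options" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^zfs=/) { sub(/^zfs=/, "", $i); print $i }
    }'
}

printf '%s\n' 'title Arch Linux' 'options zfs=rpool/ROOT/default rw' |
    entry_root_dataset
```

On the installed system the result of entry_root_dataset < /boot/loader/entries/arch.conf can be compared with the output of zpool get -H -o value bootfs rpool.
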
  • Exit and unmount all:
exit
zfs umount -a
umount -R /mnt
  • Export pool:
zpool export rpool
  • Reboot

A minimal Arch Linux system with root on ZFS should now be configured.

Optional

  • Create user
zfs create rpool/DATA/home/user
groupadd -g 1234 group
useradd -g group -u 1234 -d /home/user -s /bin/bash user
cp /etc/skel/.bash* /home/user
chown -R user:group /home/user && chmod 700 /home/user
  • Create non-root pools:
zpool create \
    -O atime=off \
    -O acltype=posixacl -O canmount=off -O compression=lz4 \
    -O dnodesize=auto \
    -O xattr=sa -O devices=off -O mountpoint=none pool:a /dev/disk/by-id/<disk2> /dev/disk/by-id/<disk3>

zpool create \
    -O atime=off \
    -O acltype=posixacl -O canmount=off -O compression=lz4 \
    -O dnodesize=auto \
    -O xattr=sa -O devices=off -O mountpoint=none pool:b mirror /dev/disk/by-id/<disk4> /dev/disk/by-id/<disk5> mirror /dev/disk/by-id/<disk6> /dev/disk/by-id/<disk7>
  • Create non-root datasets:
zfs create -o canmount=off -o mountpoint=none pool:a/DATA
zfs create -o mountpoint=/path pool:a/DATA/path
zfs create -o mountpoint=/path/games -o recordsize=1M pool:a/DATA/path/games
zfs create -o mountpoint=/path/transmission -o recordsize=1M pool:a/DATA/path/transmission
zfs create -o mountpoint=/path/backup -o compression=off pool:a/DATA/path/backup
  • Create NFS share:

Set Domain in idmapd.conf on server and clients.

zfs set sharenfs=rw=@10.0.0.0/24 pool:a/DATA/path/name
systemctl enable nfs-server.service zfs-share.service --now

References:

  1. https://wiki.archlinux.org/index.php/Install_Arch_Linux_on_ZFS
  2. https://wiki.archlinux.org/index.php/ZFS
  3. https://ramsdenj.com/2016/06/23/arch-linux-on-zfs-part-2-installation.html
  4. https://github.com/reconquest/archiso-zfs
  5. https://zedenv.readthedocs.io/en/latest/setup.html
  6. https://docs.oracle.com/cd/E37838_01/html/E60980/index.html
  7. https://ramsdenj.com/2020/03/18/zectl-zfs-boot-environment-manager-for-linux.html
  8. https://superuser.com/questions/1310927/what-is-the-absolute-minimum-size-a-uefi-partition-can-be, https://systemd.io/BOOT_LOADER_SPECIFICATION/
  9. OpenZFS Admin Documentation
  10. zfs(8)
  11. zpool(8)
  12. https://jrs-s.net/category/open-source/zfs/
#!/usr/bin/env bash
#
# zsrs, a ZFS snapshot/rollback script
#
# @p7cq
ss_to_keep=0
bootmark="BOOT"
root_pool="rpool/ROOT/default"
root_ss="/.zfs/snapshot"
boot_ss="/local/.boot/snapshot/${bootmark}"
boot_part="/boot"
loader_file="${boot_part}/loader/entries/arch.conf"
action=${1}
ss_date=$(/usr/bin/date +"%A %B %e, %Y %k:%M:%S")
ss_manifest_file=".manifest"
target_dataset=${root_pool}
mount_name="snapshot"
function _help() {
local topic=${1}
[[ ${topic} == "usage" ]] && \
/usr/bin/echo -e "usage:\n \
zsrs delete [snapshot id] remove all snapshots keeping only the last ${ss_to_keep}\n \
zsrs snapshot [comment] snapshot the system; comment max length 64 chars\n \
zsrs rollback [snapshot id] rollback the system to the previous state (the last snapshot)\n \
zsrs mount <snapshot id> clone snapshot\n \
zsrs umount <snapshot id> destroy clone\n \
zsrs list list snapshots\n \
zsrs help usage" && \
exit 0
}
function _boot_ss() {
# abort when no snapshot id was passed in
[[ -z "${1}" ]] && /usr/bin/echo -e "\e[97m\e[1mboot snapshot id is undefined\e[0m" && exit 1
local ss_id="${1}"
/usr/bin/mkdir ${boot_ss}/${ss_id}
/usr/bin/echo ${ss_manifest} > ${boot_ss}/${ss_id}/${ss_manifest_file}
/usr/bin/rsync -avzq --delete ${boot_part}/ ${boot_ss}/${ss_id}/${bootmark}
}
function _boot_rb() {
local ss_id="${1}"
[[ -z "${ss_id}" ]] && /usr/bin/echo -e "\e[97m\e[1mboot snapshot id is undefined\e[0m" && exit 1
/usr/bin/echo -e "\e[97m\e[1moverwriting boot partition at \e[0m\e[33m\e[1m${boot_part}\e[0m"
/usr/bin/rsync -avzq --delete ${boot_ss}/${ss_id}/${bootmark}/ ${boot_part}
IFS='^' read -ra manifest <<< "$(/usr/bin/cat ${boot_ss}/${ss_id}/${ss_manifest_file})"
local title="title Arch Linux (${ss_id}: ${manifest[0]} ${manifest[1]})"
/usr/bin/sed -i "1 s/^.*$/${title}/g" ${loader_file}
}
function _ss_manifest() {
local comment="${1}"
ss_manifest="^${ss_date}"
[[ -z "${comment}" ]] && comment="snapshot"
comment=$(/usr/bin/sed 's/\^/_/g' <<< ${comment})
ss_manifest="${comment:0:64}${ss_manifest}"
}
function _restrict() {
if (( ${EUID:-0} || "$(/usr/bin/id -u)" )); then
/usr/bin/echo -e "\e[97m\e[1myou must be root to do this\e[0m"
exit 1
fi
}
declare -a args=("delete" "d" "snapshot" "s" "rollback" "r" "list" "l" "help" "h" "mount" "m" "umount" "u")
[[ "$#" = "0" ]] || [[ ! " ${args[@]} " =~ " ${1} " ]] && _help "usage" && exit 0
[[ ! -d ${root_ss} ]] && /usr/bin/echo "${root_ss} does not exist" && exit 1
[[ ! -d ${boot_ss} ]] && /usr/bin/echo "${boot_ss} does not exist" && exit 1
ss_count=$(/usr/bin/ls ${boot_ss} | /usr/bin/wc -l)
if [[ ${action} == "delete" ]] || [[ ${action} == "d" ]]; then
_restrict
ss_id=${2}
if (( $# == 1 )); then
/usr/bin/echo -en "\e[97m\e[1mdelete all and keep the last \e[0m\e[33m\e[1m${ss_to_keep}\e[0m\e[97m\e[1m snapshots? (yes/no): \e[0m" && read
[[ "${REPLY}" != "yes" ]] && exit 0
fi
if [[ ! -z "${ss_id}" ]]; then
re='^[0-9]+$'
[[ ! ${ss_id} =~ ${re} ]] && /usr/bin/echo -e "\e[97m\e[1ma positive integer is expected (see \e[0m\e[33m\e[1mzsrs list\e[0m)\e[0m" && exit 1
[[ ! -d "${root_ss}/${ss_id}" ]] && /usr/bin/echo -e "\e[97m\e[1mno such snapshot\e[0m" && exit 1
/usr/bin/zfs destroy ${root_pool}@${ss_id}
/usr/bin/rm -rf ${boot_ss}/${ss_id}
exit 0
fi
if (( ss_count > ss_to_keep )); then
lwm=$(/usr/bin/ls ${root_ss} | /usr/bin/sort -n | /usr/bin/head -1)
hwm=$(/usr/bin/ls ${root_ss} | /usr/bin/sort -nr | /usr/bin/head -1)
end=$((hwm-ss_to_keep))
if [ ! -z "${lwm}" ] && [ ! -z "${hwm}" ]; then
for ((s=lwm; s<=end; s++)) do
[[ ! -d "${boot_ss}/${s}" ]] && continue
/usr/bin/zfs destroy ${root_pool}@${s}
/usr/bin/rm -rf ${boot_ss}/${s}
done
fi
else
/usr/bin/echo -e "\e[97m\e[1mless than \e[0m\e[33m\e[1m$((${ss_to_keep}+1))\e[0m\e[97m\e[1m snapshots, none deleted\e[0m"
fi
exit 0
fi
if [[ ${action} == "snapshot" ]] || [[ ${action} == "s" ]]; then
_restrict
ss_id=$(($(/usr/bin/ls ${boot_ss} | /usr/bin/sort -nr | /usr/bin/head -1)+1))
_ss_manifest "${2}"
_boot_ss ${ss_id}
/usr/bin/zfs snapshot ${root_pool}@${ss_id}
/usr/bin/echo -e "\e[97m\e[1mcreated snapshot with id \e[0m\e[33m\e[1m${ss_id}\e[0m"
exit 0
fi
if [[ ${action} == "rollback" ]] || [[ ${action} == "r" ]]; then
_restrict
ss_id=${2}
last_ss=$(/usr/bin/ls ${boot_ss} | /usr/bin/sort -nr | /usr/bin/head -1)
if [[ ! -z "${ss_id}" ]]; then
# validate the id format first, then check that the snapshot exists
re='^[0-9]+$'
[[ ! ${ss_id} =~ ${re} ]] && /usr/bin/echo -e "\e[97m\e[1ma positive integer is expected\e[0m" && exit 1
([[ "${ss_id}" == "0" ]] || [[ ! -d "${boot_ss}/${ss_id}" ]] || [[ ! -d "${root_ss}/${ss_id}" ]]) && /usr/bin/echo -e "\e[97m\e[1mno such snapshot\e[0m" && exit 1
else
ss_id=${last_ss}
fi
IFS='^' read -ra manifest <<< "$(/usr/bin/cat ${boot_ss}/${ss_id}/${ss_manifest_file})"
/usr/bin/echo -e "\e[97m\e[1mrolling back to snapshot \e[0m\e[33m\e[1m${ss_id}\e[0m\e[97m\e[1m: \e[0m\e[33m\e[1m${manifest[0]}\e[97m\e[1m taken on \e[0m\e[33m\e[1m${manifest[1]}\e[0m"
/usr/bin/echo -en "\e[97m\e[1ma reboot will be required, proceed? (yes/no): \e[0m" && read
[[ "${REPLY}" != "yes" ]] && exit 0
destroy_recent=""
if (( ss_id < last_ss )); then
read -p $'\e[97m\e[1mmore recent snapshots exist, delete them? (yes/no): \e[0m' answer
[[ "${answer}" != "yes" ]] && /usr/bin/echo -e "\e[97m\e[1mexiting...\e[0m" && exit 0
destroy_recent="-r"
for ((s=last_ss; s>ss_id; s--)); do
more_recent_ss="${boot_ss}/${s}"
[[ -d "${more_recent_ss}" ]] && /usr/bin/rm -rf ${more_recent_ss}
done
fi
/usr/bin/zfs rollback ${destroy_recent} ${root_pool}@${ss_id}
_boot_rb ${ss_id}
/usr/bin/echo -e "\e[97m\e[1msnapshot \e[0m\e[33m\e[1m${manifest[0]}\e[0m\e[97m\e[1m is now the active system, rebooting...\e[0m"
sleep 2
/usr/bin/systemctl reboot
fi
if [[ ${action} == "mount" ]] || [[ ${action} == "m" ]]; then
_restrict
ss_id=${2}
re='^[0-9]+$'
[[ ! ${ss_id} =~ ${re} ]] && /usr/bin/echo -e "\e[97m\e[1ma positive integer is expected\e[0m" && exit 1
([[ "${ss_id}" == "0" ]] || [[ ! -d "${boot_ss}/${ss_id}" ]] || [[ ! -d "${root_ss}/${ss_id}" ]]) && /usr/bin/echo -e "\e[97m\e[1mno such snapshot\e[0m" && exit 1
mountpoint=$(/usr/bin/zfs get mountpoint -o value ${target_dataset} | /usr/bin/sed -n 2p)
filesystem=${mount_name}:${ss_id}
[[ -d "${mountpoint}${filesystem}" ]] && /usr/bin/echo -e "\e[97m\e[1msnapshot \e[0m\e[33m\e[1m${ss_id}\e[0m\e[97m\e[1m is already mounted at \e[33m\e[1m${mountpoint}${filesystem}\e[0m" && exit 1
/usr/bin/zfs clone ${root_pool}@${ss_id} ${target_dataset}/${filesystem}
/usr/bin/echo -e "\e[97m\e[1msnapshot \e[0m\e[33m\e[1m${ss_id}\e[0m\e[97m\e[1m available at \e[33m\e[1m${mountpoint}${filesystem}\e[0m"
fi
if [[ ${action} == "umount" ]] || [[ ${action} == "u" ]]; then
_restrict
ss_id=${2}
re='^[0-9]+$'
[[ ! ${ss_id} =~ ${re} ]] && /usr/bin/echo -e "\e[97m\e[1ma positive integer is expected\e[0m" && exit 1
([[ "${ss_id}" == "0" ]] || [[ ! -d "${boot_ss}/${ss_id}" ]] || [[ ! -d "${root_ss}/${ss_id}" ]]) && /usr/bin/echo -e "\e[97m\e[1mno such snapshot\e[0m" && exit 1
mountpoint=$(/usr/bin/zfs get mountpoint -o value ${target_dataset} | /usr/bin/sed -n 2p)
filesystem=${mount_name}:${ss_id}
[[ ! -d "${mountpoint}${filesystem}" ]] && /usr/bin/echo -e "\e[97m\e[1msnapshot \e[0m\e[33m\e[1m${ss_id}\e[0m\e[97m\e[1m is not mounted\e[0m" && exit 1
/usr/bin/zfs destroy ${target_dataset}/${filesystem}
/usr/bin/echo -e "\e[97m\e[1msnapshot \e[0m\e[33m\e[1m${ss_id}\e[0m\e[97m\e[1m unmounted\e[0m"
fi
if [[ ${action} == "help" ]] || [[ ${action} == "h" ]]; then
_help "usage"
exit 0
fi
if [[ ${action} == "list" ]] || [[ ${action} == "l" ]]; then
if ((ss_count == 0)); then
/usr/bin/echo -e "\e[97m\e[1mno snapshots available\e[0m"
exit 0
fi
for id in $(/usr/bin/ls ${boot_ss} | /usr/bin/sort -nr); do
IFS='^' read -ra manifest <<< "$(/usr/bin/cat ${boot_ss}/${id}/${ss_manifest_file})"
/usr/bin/echo -e "\e[33m\e[1m${id}\e[0m^\e[97m\e[1m${manifest[0]}^${manifest[1]}\e[0m"
done | /usr/bin/column -t -N ID,COMMENT,DATE -s "^"
exit 0
fi
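
The script stores one manifest line per snapshot in the form comment^date, which is why _ss_manifest replaces any ^ in the comment and why the list, rollback and _boot_rb code splits on ^. The following is a standalone sketch of that round trip, written for plain sh; make_manifest and parse_manifest are illustrative names that do not exist in zsrs, and the date string is a made-up sample.

```shell
# Build a manifest line the way _ss_manifest does: "<comment>^<date>",
# with any "^" in the comment replaced by "_" and the comment cut to 64 chars.
# An empty comment falls back to "snapshot".
make_manifest() {
    comment=$(printf '%s' "${1:-snapshot}" | tr '^' '_' | cut -c1-64)
    printf '%s^%s\n' "$comment" "$2"
}

# Split a manifest line back into its two fields (m_comment, m_date).
parse_manifest() {
    IFS='^' read -r m_comment m_date <<EOF
$1
EOF
}

line=$(make_manifest 'before ^ kernel upgrade' 'Thursday November 26, 2020 20:37:00')
parse_manifest "$line"
echo "comment: ${m_comment}"
echo "date:    ${m_date}"
```

Because the separator can never appear in the comment, the first field is always the comment and the remainder is the date.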