@brettinternet
Last active December 10, 2023 02:35
Recovery from ZFS oops

The stages of ZFS grief

Disk passthrough to a VM managing my ZFS array.

$ qm set 100 -scsi1 /dev/disk/by-id/…
$ qm set 100 -scsi2 /dev/disk/by-id/…
$ …

I'm sure this is fine.

Hours later, after a reboot…

On the guest:

$ zpool list
no pools available

1. Denial

On the host:

$ zpool list
no pools available

Hmm.

$ zpool import tank
cannot import 'tank': I/O error
	Destroy and re-create the pool from
	a backup source.

Uh oh.

$ zpool import -F tank
cannot import 'tank': one or more devices is currently unavailable

This is not good.

2. Anger

My restic backups are stale because of some issues with my homelab. 🤦‍♂️

$ zpool import -N -o readonly=on -f tank
cannot import 'tank': I/O error
	Destroy and re-create the pool from
	a backup source.

Destroy and re-create the pool from a backup source.

At this point, most forums appear to suggest that the pool is lost.

3. Bargaining

Readonly should have worked 🤔

$ zpool import -N -o readonly=on -f -R tank
   pool: tank
     id: …
  state: ONLINE
 status: Some supported features are not enabled on the pool.
	(Note that they may be intentionally disabled if the
	'compatibility' property is set.)
 action: The pool can be imported using its name or numeric identifier, though
	some features will not be available without an explicit 'zpool upgrade'.
 config:

	tank                        ONLINE
	  raidz2-0                  ONLINE
	    …                       ONLINE
	    …                       ONLINE
	    …                       ONLINE
	    …                       ONLINE
$ zpool import -F

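Same output as above. (Neither command actually names a pool to import: -R takes an argument, so it swallowed "tank" as the altroot, and without a pool name zpool import only lists what it could import.)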

Online seems good, right?

$ zpool status
no pools available
$ zpool import -F -m tank
cannot import 'tank': one or more devices is currently unavailable

Well, here we go. Let's find the txg to use for a rollback.

$ zpool import -FX tank
# seemingly hanging for a while…
^C^C^C^C

That option must not work (forgive me, dear reader).
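For reference, a less drastic way to hunt for a rollback txg could look like the sketch below. It assumes the uberblock entries printed by zdb -ul contain lines of the form "txg = N"; the uberblock_txgs helper and the device path are made up for illustration, and the zpool commands are left as comments because they depend on the pool.

```shell
# Read `zdb -ul <device>` output on stdin and print the uberblock txg
# numbers, newest first, without duplicates.
# Assumption: uberblock entries contain lines like "txg = 12345".
uberblock_txgs() {
    awk '$1 == "txg" && $2 == "=" { print $3 }' | sort -rnu
}

# Usage (the device path is a placeholder, not a real disk):
#   zdb -ul /dev/disk/by-id/<disk>-part1 | uberblock_txgs | head
#
# Dry run: check whether a rewind import could succeed, without changing
# anything on disk:
#   zpool import -Fn tank
#
# Rewind to one specific txg, read-only. Per zpool-import(8) on recent
# OpenZFS, -T implies -FX and is flagged as hazardous, so treat it as a
# last resort:
#   zpool import -o readonly=on -T <txg> tank
```

Picking a txg only slightly older than the newest one keeps the amount of discarded data small, at the cost of possibly needing several attempts.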

4. Depression

At this point I pull down the latest snapshot from Backblaze and assess the damage.

$ zdb tank
zdb: can't open 'tank': No such file or directory

ZFS_DBGMSG(zdb) START:
ZFS_DBGMSG(zdb) END

What have I done.
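One likely reason zdb could not open the pool: by default it looks pools up in the cache file (/etc/zfs/zpool.cache), and a pool that is not imported may not be listed there. The -e flag tells zdb to search the disks instead. A minimal sketch, where inspect_exported_pool is a hypothetical wrapper and /dev/disk/by-id is an assumed search path:

```shell
# Show the on-disk configuration of a pool that is not currently imported.
# $1: pool name; $2: directory to search for devices (optional).
inspect_exported_pool() {
    pool=$1
    search_dir=${2:-/dev/disk/by-id}
    if ! command -v zdb >/dev/null 2>&1; then
        echo "zdb not found in PATH" >&2
        return 127
    fi
    # -e: operate on an exported (not imported) pool
    # -p: where to look for the pool's devices
    # -C: print the pool configuration
    zdb -e -p "$search_dir" -C "$pool"
}

# Usage:
#   inspect_exported_pool tank
```

This only reads from the disks, so it is a reasonably safe first look before attempting any rewind.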

5. Acceptance

$ restic snapshots
repository … opened (version 2, compression level auto)
ID        Time                 Host           Tags                   Paths
--------------------------------------------------------------------------
20ee6d7b  …                    restic-remote  restic                 /data

Deep breath.

$ restic restore 20ee6d7b --target ./data

6. ?

Ok, wait a minute. Let's try that weird -X flag again.

$ zpool import -FX tank
# … waiting … staring … go get dinner … waiting … put baby to bed …

Exit 0!

$ ls /mnt/tank
files files files!

Immediately:

$ rsync -ahP /mnt/tank elsewhere:/mnt/pond/tank

Now, when using ZFS in a guest VM, I pass the whole HBA controller through to the guest instead of the individual disks.
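Roughly, passing the controller through on Proxmox could look like the sketch below. The storage_controllers helper and its match pattern are guesses at common lspci class names, and the PCI address is made up; VM ID 100 is the one from the commands at the top.

```shell
# Filter `lspci` output on stdin down to likely storage controllers.
# The pattern is an assumption about common class names; adjust it for
# the hardware at hand.
storage_controllers() {
    grep -iE 'Serial Attached SCSI|SCSI storage|RAID bus|SATA controller'
}

# Usage on the Proxmox host (0000:01:00.0 is a hypothetical address):
#   lspci -nn | storage_controllers
#   qm set 100 -hostpci0 0000:01:00.0
```

PCI passthrough needs IOMMU support enabled on the host, and the host should no longer touch the disks behind that controller once the guest owns it.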

The end.

Thank you to the FreeBSD, TrueNAS, and r/zfs communities.
