zfs notes, forked from dch/zfs_notes.md by @peterdavidhamilton (November 7, 2013)

pool management

  • create a new pool by first creating an empty partition for it using diskutil or Disk Utility
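As a rough sketch of that first step (the device name /dev/disk4 and the pool name tank are illustrative, the diskutil partition-format keyword varies by OS X release, and this wipes the target disk, so double-check with diskutil list first):

```shell
# Partition an empty disk with a GPT scheme and a single ZFS slice
# (illustrative device and pool name; destroys data on that disk).
diskutil partitionDisk /dev/disk4 GPT ZFS tank 100%
# Create a single-disk pool on the resulting slice, then verify.
zpool create tank /dev/disk4s2
zpool status -v tank
```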

adding a mirror to an existing pool

check current status

zpool status -v

root@akai / # zpool status -v
pool: tub
state: ONLINE
scrub: none requested
config:
NAME        STATE     READ WRITE CKSUM
tub         ONLINE       0     0     0
  disk0s2   ONLINE       0     0     0
errors: No known data errors

double-check device names for the intended mirror

root@akai / # diskutil list
/dev/disk0
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      GUID_partition_scheme                        *512.1 GB   disk0
   1:                        EFI                         209.7 MB   disk0s1
   2:                        ZFS tub                     511.8 GB   disk0s2
/dev/disk1
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      GUID_partition_scheme                        *121.3 GB   disk1
   1:                        EFI                         209.7 MB   disk1s1
   2:                  Apple_HFS akai                    120.5 GB   disk1s2
   3:                 Apple_Boot Recovery HD             650.0 MB   disk1s3
/dev/disk2
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      GUID_partition_scheme                        *3.0 TB     disk2
   1:                        EFI                         209.7 MB   disk2s1
   2:          Apple_CoreStorage                         385.4 GB   disk2s2
   3:                 Apple_Boot Recovery HD             650.0 MB   disk2s3
   4:                        ZFS pond                    2.6 TB     disk2s4
/dev/disk3
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:                  Apple_HFS continuity             *385.1 GB   disk3

attach a new disk to an existing pool

root@akai / # zpool attach tub disk0s2 disk2s4

check status and wait

root@akai / # zpool status -v
pool: tub
state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scrub: resilver in progress, 0.00% done, 176h25m to go
config:
NAME         STATE     READ WRITE CKSUM
tub          ONLINE       0     0     0
  mirror     ONLINE       0     0     0
    disk0s2  ONLINE       0     0     0
    disk2s4  ONLINE       0     0     0
errors: No known data errors

root@akai / # zpool status -v
pool: tub
state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scrub: resilver in progress, 5.80% done, 1h17m to go
config:
NAME         STATE     READ WRITE CKSUM
tub          ONLINE       0     0     0
  mirror     ONLINE       0     0     0
    disk0s2  ONLINE       0     0     0
    disk2s4  ONLINE       0     0     0
errors: No known data errors

Split the mirror

confirm resilver has completed

root@akai / # zpool status -v

  pool: tub
 state: ONLINE
 scrub: resilver completed with 0 errors on Sun Jan 13 15:52:21 2013
config:
    NAME         STATE     READ WRITE CKSUM
    tub          ONLINE       0     0     0
      mirror     ONLINE       0     0     0
        disk0s2  ONLINE       0     0     0
        disk2s4  ONLINE       0     0     0
errors: No known data errors

flush pending writes to disk, just in case

root@akai / # sync

take the additional mirror disk offline

root@akai / # zpool offline tub disk2s4

check status

root@akai / # zpool status -v
  pool: tub
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
    Sufficient replicas exist for the pool to continue functioning in a
    degraded state.
action: Online the device using 'zpool online' or replace the device with
    'zpool replace'.
 scrub: resilver completed with 0 errors on Sun Jan 13 15:52:21 2013
config:
    NAME         STATE     READ WRITE CKSUM
    tub          DEGRADED     0     0     0
      mirror     DEGRADED     0     0     0
        disk0s2  ONLINE       0     0     0
        disk2s4  OFFLINE      0     0     0
errors: No known data errors

remove the disk

root@akai / # zpool detach tub disk2s4

check status

root@akai / # zpool status -v
  pool: tub
 state: ONLINE
 scrub: resilver completed with 0 errors on Sun Jan 13 15:52:21 2013
config:
    NAME        STATE     READ WRITE CKSUM
    tub         ONLINE       0     0     0
      disk0s2   ONLINE       0     0     0
errors: No known data errors

scrub the original volume

root@akai / # zpool scrub tub

Removable Media

ZFS works successfully with >= 32 GiB SDXC cards in a Feb 2011 MacBook Pro, and likely in similar models.

  • use Finder to eject disks
  • if required, use zfs unmount -f <pool> and zpool export <pool> to force it
  • if the physical read-only switch is enabled on the media, zfs will fail to import it, reporting insufficient replicas as the error:
$ zpool import
  pool: builds
    id: 11121869171413038388
 state: FAULTED
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
devices and try again.
   see: http://www.sun.com/msg/ZFS-8000-3C
config:

builds      UNAVAIL  insufficient replicas
  disk2s2   UNAVAIL  cannot open

I have had three kernel panics during heavy testing; none occurred during data writing, but all happened after ejecting in Finder without a subsequent pool export.

The Oracle docs suggest an alternate approach for removable media usage.

Missing functionality

If you've used zfs elsewhere, or are referring to the manpages, a few critical things are missing:

  • recursive functionality is available for creating snapshots, but not for zfs send/receive (recursive send/receive is supported in other ZFS implementations)
  • zfs send/receive doesn't yet support pipes, so zfs send <snap> | ssh user@host "zfs receive -d <snap>" doesn't work
  • zfs sharing and exporting doesn't work (iscsi, smb, nfs, afp via Apple sharing)

zfs send in zfs-osx fork

using mbuffer for speed

  • set up the receiving (listening) end first

mbuffer -I 192.168.1.1:10000 -q -s128k -m1G -P10 | zfs recv storage/foo

  • now set up the sending end:

zfs send foo@052209 | mbuffer -q -s128k -m1G -O 192.168.1.2:10000

Snapshots

  • As noted, there is no support for streamed snapshots
  • Nor sending or receiving recursive snapshots
  • Finally, pipes/streams don't work as expected:
$ zfs send tub/vm.freebsd@20120910 | zfs receive orange/vm.freebsd

internal error: Bad file descriptor
cannot receive: invalid stream (failed to read first record)
[1]    8408 abort      zfs send tub/vm.freebsd@20120910 | 
       8409 exit 1     zfs receive orange/vm.freebsd

You can work around the missing stream functionality using fifos.

Snapshots on localhost

PIPE=/tmp/zpipe && mkfifo $PIPE
zfs send pool1/fs@snap > $PIPE &
zfs receive -d pool2/backups < $PIPE &
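Since the fifo workaround is plain shell plumbing, you can sanity-check the pattern with ordinary files before trusting it with real snapshots; the paths and payload below are made up for illustration:

```shell
# A writer and a reader meet at a named pipe, exactly as the
# zfs send/receive pair does in the snippet above.
PIPE=/tmp/zpipe-demo
rm -f "$PIPE" /tmp/zpipe-received
mkfifo "$PIPE"
# stands in for: zfs send pool1/fs@snap > $PIPE &
printf 'snapshot payload\n' > "$PIPE" &
# stands in for: zfs receive -d pool2/backups < $PIPE &
cat "$PIPE" > /tmp/zpipe-received
wait
cat /tmp/zpipe-received
rm -f "$PIPE"
```

Opening the pipe blocks each side until the other end is attached, so backgrounding the writer and then starting the reader is safe.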

Snapshots between hosts

I couldn't get the canonical netcat example to work, so I used socat, which also supports compression and SSL. Make sure you have chmod/chown'd your mountpoints correctly, and granted zfs permissions to the appropriate non-root users, for this to work.

This example uses socat to transfer the data between two hosts, listening on TCP port 1234. You can easily add SSL support by carefully following the client/server instructions provided.

On your source host:

mkfifo /tmp/zpipe
socat GOPEN:/tmp/zpipe TCP-LISTEN:1234,reuseaddr &
zfs send tub/test@initial > /tmp/zpipe

And on your target host:

ZHOST=source.host.com
mkfifo /tmp/zpipe
socat -u TCP:$ZHOST:1234,reuseaddr GOPEN:/tmp/zpipe &
zfs receive -F pool/test </tmp/zpipe

Sending an incremental snapshot

Using the same approach as above, with an updated zfs send command on your source host:

zfs send -i tub/test@initial tub/test@current > /tmp/zpipe

The target host remains the same.

One-liners

Here's a one-liner to move data from a source pool into a backup filesystem on a different pool, on the same host:

# get set
PIPE=/tmp/zpipe && mkfifo $PIPE
SRC_POOL=inpool
DEST_POOL=outpool
DEST_ZFS=backup
REF_SNAP=20120801
NEW_SNAP=20120911
# send the initial snap
zfs send $SRC_POOL@$REF_SNAP > $PIPE & zfs receive -d $DEST_POOL/$DEST_ZFS < $PIPE
# update to a newer one
zfs send -i @$REF_SNAP $SRC_POOL@$NEW_SNAP > $PIPE & zfs receive -F -d $DEST_POOL/$DEST_ZFS < $PIPE

Encrypted transfer

socat also supports secured communications using openssl. Follow the instructions to set up keys and certificates, and confirm that you can use the SSL connections correctly. Note that the socat "server" will be the sending ZFS end, and the zfs receiver the socat client.

On your source host:

mkfifo /tmp/zpipe
socat GOPEN:/tmp/zpipe openssl-listen:1234,reuseaddr,cert=/etc/socat/source.pem,cafile=/etc/socat/destination.crt &
zfs send tub/test@initial > /tmp/zpipe

And on your target host:

ZHOST=source.host.com
mkfifo /tmp/zpipe
socat -u openssl-connect:$ZHOST:1234,cert=/etc/socat/destination.pem,cafile=/etc/socat/source.crt GOPEN:/tmp/zpipe &
zfs receive -F pool/test </tmp/zpipe

Running it all from the server end

This works functionally, but not all of the shell plumbing completes cleanly; it's still a work in progress with socat and friends. I also seem to have mucked up the certificate setup below, so YMMV :-(

Sending the initial snapshot:

# setup
ZSOURCE=akai.local
ZDESTINATION=dch@continuity.local
ZFILE=tub/shared/repos
ZINITIAL=$ZFILE@20121220
ZCURRENT=$ZFILE@20130114
mkfifo /tmp/zpipe
ssh $ZDESTINATION "mkfifo /tmp/zpipe"
# initiate the sender
socat GOPEN:/tmp/zpipe openssl-listen:1234,reuseaddr,cert=/etc/socat/source.pem,cafile=/etc/socat/destination.crt &
zfs send $ZINITIAL > /tmp/zpipe &
# control the destination over ssh
ssh $ZDESTINATION "socat -u openssl-connect:$ZSOURCE:1234,cert=/etc/socat/destination.pem,cafile=/etc/socat/source.crt GOPEN:/tmp/zpipe" &
ssh $ZDESTINATION "zfs receive -F $ZINITIAL </tmp/zpipe" 

Sending the current snapshot:

# initiate the server
socat GOPEN:/tmp/zpipe openssl-listen:1234,reuseaddr,cert=/etc/socat/source.pem,cafile=/etc/socat/destination.crt &
zfs send -i $ZINITIAL $ZCURRENT > /tmp/zpipe &
# control the client over ssh
ssh $ZDESTINATION "socat -u openssl-connect:$ZSOURCE:1234,cert=/etc/socat/destination.pem,cafile=/etc/socat/source.crt GOPEN:/tmp/zpipe" &
ssh $ZDESTINATION "zfs receive -F $ZFILE </tmp/zpipe"

Managing ZFS filesystems

ZFS has an internal namespace (hierarchy) for filesystems, using a simple / delimiter within filesystem names. Properties such as compression, mountpoints, and many other settings can be inherited through this namespace, or set and reset recursively. Other useful actions, such as recursive snapshots, are also possible. Aligning this hierarchy roughly with your directory layout will likely keep you sane and reduce frustration.

  • Reset the mountpoints under pool "tub", filesystem "shared" to inherit from the root:
zfs inherit -r mountpoint tub/shared
  • Take snapshots of all subsidiary filesystems in pool "tub", appending the same snapshot-name suffix to each:
zfs snapshot -r tub@20120910
  • a recursive, forced rollback to a snapshot destroys all intermediate snapshots and clones:
sudo zfs rollback -rf <snapshot>

This zfs cheatsheet is worth printing out.

Binding Finder.app to your will

Finder and friends like Spotlight want to abuse your ZFS filesystems. In particular:

  • use mdutil -i off <mountpoint> to stop Finder and Spotlight from trying to index ZFS; indexing won't work anyway.
  • stop metadata being created using cd <mountpoint> ; mkdir .fseventsd && touch .fseventsd/no_log on the root mountpoint.
  • add FILESYSTEMS="hfs ufs zfs" to the end of /etc/locate.rc to allow locate to index zfs filesystems.
mdutil -i off /zfs
cd /zfs
mkdir .fseventsd && touch .fseventsd/no_log
touch .Trashes .metadata_never_index .apdisk

Use locate instead for non-realtime searching of your ZFS filesystems.
