@dlangille
Last active November 6, 2018 00:27
Nov 5 15:29:33 knew getty[30735]: open /dev/ttyu2: No such file or directory
Nov 5 15:58:11 knew smartd[1068]: Device: /dev/da16 [SAT], FAILED SMART self-check. BACK UP DATA NOW!
Nov 5 15:58:11 knew smartd[1068]: Device: /dev/da16 [SAT], Failed SMART usage Attribute: 240 Head_Flying_Hours.
OK, let's replace da20 (it now appears on the system as da16):
sudo zpool replace system gpt/653BK12FFS9A.r1.c3 gpt/57NGK1ZGF57D.r1.c3
[dan@knew:~] $ zpool status system
  pool: system
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Nov  5 16:00:57 2018
	221G scanned out of 45.2T at 353M/s, 37h8m to go
	13.0G resilvered, 0.48% done
config:

	NAME                          STATE     READ WRITE CKSUM
	system                        ONLINE       0     0     0
	  raidz2-0                    ONLINE       0     0     0
	    gpt/X643KHBFF57D.r5.c4    ONLINE       0     0     0
	    gpt/4728K24SF57D.r3.c2    ONLINE       0     0     0
	    gpt/37KVK1JRF57D.r2.c1    ONLINE       0     0     0
	    gpt/37D4KBJPF57D.r5.c3    ONLINE       0     0     0
	    gpt/5782KL6VF57D.r2.c2    ONLINE       0     0     0
	    gpt/6525K2DGFS9A.r2.c4    ONLINE       0     0     0
	    gpt/579HKDZYF57D.r3.c3    ONLINE       0     0     0
	    gpt/579HKDZXF57D.r2.c3    ONLINE       0     0     0
	    gpt/5782KL6MF57D.r3.c1    ONLINE       0     0     0
	    gpt/X6IEKELNF57D.r4.c4    ONLINE       0     0     0
	  raidz2-1                    ONLINE       0     0     0
	    gpt/653BK12JFS9A.r4.c2    ONLINE       0     0     0
	    gpt/579IK5RMF57D.r4.c3    ONLINE       0     0     0
	    gpt/653EK93PFS9A.r1.c4    ONLINE       0     0     0
	    gpt/653DK7WPFS9A.r3.c4    ONLINE       0     0     0
	    gpt/653DK7WCFS9A.r4.c1    ONLINE       0     0     0
	    gpt/653EK93QFS9A.r5.c2    ONLINE       0     0     0
	    gpt/653AK2MXFS9A.r1.c2    ONLINE       0     0     0
	    gpt/6539K3OJFS9A.r1.c1    ONLINE       0     0     0
	    gpt/653IK1IBFS9A.r5.c1    ONLINE       0     0     0
	    replacing-9               ONLINE       0     0     0
	      gpt/653BK12FFS9A.r1.c3  ONLINE       0     0     0
	      gpt/57NGK1ZGF57D.r1.c3  ONLINE       0     0     0

errors: No known data errors
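
As a rough sanity check on the "to go" estimate, the remaining time can be recomputed from the scan line itself. This is only a sketch: the function names are made up for illustration, and it assumes the K/M/G/T suffixes are powers of 1024 and that the estimate is simply (total - scanned) / rate. The printed figures are rounded, so the result should only be expected to land near the reported value.

```python
import re

# Assumption: zpool's size suffixes are binary multiples (powers of 1024).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def to_bytes(text):
    """Convert a value such as '45.2T' or '353M' to a number of bytes."""
    return float(text[:-1]) * UNITS[text[-1]]


def resilver_eta(line):
    """Return (hours, minutes) remaining, recomputed from a scan progress line."""
    m = re.search(
        r"([\d.]+[KMGT]) scanned out of ([\d.]+[KMGT]) at ([\d.]+[KMGT])/s", line
    )
    if m is None:
        raise ValueError("not a scan progress line")
    scanned, total, rate = (to_bytes(g) for g in m.groups())
    seconds = (total - scanned) / rate
    return int(seconds // 3600), int(seconds % 3600 // 60)


# Scan lines copied from the two 'zpool status' outputs in this gist.
print(resilver_eta("221G scanned out of 45.2T at 353M/s, 37h8m to go"))
print(resilver_eta("560G scanned out of 45.2T at 360M/s, 36h9m to go"))
```

Both recomputed values come out within a couple of minutes of what zpool reported, which suggests the estimate is based on the full 45.2T pool scan rather than on the data that actually needs to be written to the new drive.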
Nov 5 16:00:56 knew ZFS: vdev state changed, pool_guid=15378250086669402288 vdev_guid=1933292688604201684
Nov 5 16:10:54 knew kernel: (da15:mps1:0:30:0): WRITE(16). CDB: 8a 00 00 00 00 01 a3 93 11 30 00 00 00 08 00 00 length 4096 SMID 474 terminated ioc 804b loginfo 31110d00 scsi 0 state c xfer 0
Nov 5 16:10:54 knew kernel: (da15:mps1:0:30:0): WRITE(10). CDB: 2a 00 07 f9 11 d0 00 00 60 00 length 49152 SMID 262 terminated ioc 804b loginfo 31110d00 (da15:mps1:0:30:0): WRITE(16). CDB: 8a 00 00 00 00 01 a3 93 11 30 00 00 00 08 00 00
Nov 5 16:10:54 knew kernel: scsi 0 state c xfer 0
Nov 5 16:10:54 knew kernel: (da15:mps1:0:30:0): CAM status: CCB request completed with an error
Nov 5 16:10:54 knew kernel: (da15:mps1:0:30:0): Retrying command
Nov 5 16:10:54 knew kernel: (da15:mps1:0:30:0): WRITE(10). CDB: 2a 00 07 f9 11 d0 00 00 60 00
Nov 5 16:10:54 knew kernel: (da15:mps1:0:30:0): CAM status: CCB request completed with an error
Nov 5 16:10:54 knew kernel: (da15:mps1:0:30:0): Retrying command
Nov 5 16:10:54 knew kernel: (da15:mps1:0:30:0): WRITE(10). CDB: 2a 00 07 f9 11 d0 00 00 60 00
Nov 5 16:10:54 knew kernel: (da15:mps1:0:30:0): CAM status: SCSI Status Error
Nov 5 16:10:54 knew kernel: (da15:mps1:0:30:0): SCSI status: Check Condition
Nov 5 16:10:54 knew kernel: (da15:mps1:0:30:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
Nov 5 16:10:54 knew kernel: (da15:mps1:0:30:0): Retrying command (per sense data)
Nov 5 16:10:54 knew kernel: (da15:mps1:0:30:0): WRITE(16). CDB: 8a 00 00 00 00 01 a3 93 7f e8 00 00 00 10 00 00
Nov 5 16:10:54 knew kernel: (da15:mps1:0:30:0): CAM status: SCSI Status Error
Nov 5 16:10:54 knew kernel: (da15:mps1:0:30:0): SCSI status: Check Condition
Nov 5 16:10:54 knew kernel: (da15:mps1:0:30:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
Nov 5 16:10:54 knew kernel: (da15:mps1:0:30:0): Retrying command (per sense data)
Nov 5 16:21:09 knew kernel: (da15:mps1:0:30:0): WRITE(16). CDB: 8a 00 00 00 00 01 a4 17 30 10 00 00 00 08 00 00 length 4096 SMID 899 terminated ioc 804b loginfo 31110d00 scsi 0 state c xfer 0
Nov 5 16:21:09 knew kernel: (da15:mps1:0:30:0): WRITE(10). CDB: 2a 00 41 39 c3 20 00 00 88 00 length 69632 SMID 393 terminated ioc 804b loginfo 31110d00 (da15:mps1:0:30:0): WRITE(16). CDB: 8a 00 00 00 00 01 a4 17 30 10 00 00 00 08 00 00
Nov 5 16:21:09 knew kernel: scsi 0 state c xfer 0
Nov 5 16:21:09 knew kernel: (da15:mps1:0:30:0): CAM status: CCB request completed with an error
Nov 5 16:21:09 knew kernel: (da15:mps1:0:30:0): Retrying command
Nov 5 16:21:09 knew kernel: (da15:mps1:0:30:0): WRITE(10). CDB: 2a 00 41 39 c3 20 00 00 88 00
Nov 5 16:21:09 knew kernel: (da15:mps1:0:30:0): CAM status: CCB request completed with an error
Nov 5 16:21:09 knew kernel: (da15:mps1:0:30:0): Retrying command
Nov 5 16:21:09 knew kernel: (da15:mps1:0:30:0): WRITE(10). CDB: 2a 00 41 39 c3 20 00 00 88 00
Nov 5 16:21:09 knew kernel: (da15:mps1:0:30:0): CAM status: SCSI Status Error
Nov 5 16:21:09 knew kernel: (da15:mps1:0:30:0): SCSI status: Check Condition
Nov 5 16:21:09 knew kernel: (da15:mps1:0:30:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
Nov 5 16:21:09 knew kernel: (da15:mps1:0:30:0): Retrying command (per sense data)
Nov 5 16:21:10 knew kernel: (da15:mps1:0:30:0): WRITE(10). CDB: 2a 00 41 39 f5 98 00 00 c8 00
Nov 5 16:21:10 knew kernel: (da15:mps1:0:30:0): CAM status: SCSI Status Error
Nov 5 16:21:10 knew kernel: (da15:mps1:0:30:0): SCSI status: Check Condition
Nov 5 16:21:10 knew kernel: (da15:mps1:0:30:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
Nov 5 16:21:10 knew kernel: (da15:mps1:0:30:0): Retrying command (per sense data)
Nov 5 16:21:28 knew kernel: (da15:mps1:0:30:0): WRITE(10). CDB: 2a 00 3f 93 92 08 00 01 00 00 length 131072 SMID 667 terminated ioc 804b loginfo 31110d00 scsi 0 state c xfer 0
Nov 5 16:21:28 knew kernel: (da15:mps1:0:30:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00 length 0 SMID 678 terminated ioc 804b loginfo 3(da15:mps1:0:30:0): WRITE(10). CDB: 2a 00 3f 93 92 08 00 01 00 00
Nov 5 16:21:28 knew kernel: 1110d00 scsi 0 state c xfer 0
Nov 5 16:21:28 knew kernel: (da15:mps1:0:30:0): CAM status: CCB request completed with an error
Nov 5 16:21:28 knew kernel: (da15:mps1:0:30:0): Retrying command
Nov 5 16:21:28 knew kernel: (da15:mps1:0:30:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
Nov 5 16:21:28 knew kernel: (da15:mps1:0:30:0): CAM status: CCB request completed with an error
Nov 5 16:21:28 knew kernel: (da15:mps1:0:30:0): Retrying command
Nov 5 16:21:29 knew kernel: (da15:mps1:0:30:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
Nov 5 16:21:29 knew kernel: (da15:mps1:0:30:0): CAM status: SCSI Status Error
Nov 5 16:21:29 knew kernel: (da15:mps1:0:30:0): SCSI status: Check Condition
Nov 5 16:21:29 knew kernel: (da15:mps1:0:30:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
Nov 5 16:21:29 knew kernel: (da15:mps1:0:30:0): Error 6, Retries exhausted
Nov 5 16:21:29 knew kernel: (da15:mps1:0:30:0): Invalidating pack
Nov 5 16:21:29 knew ZFS: vdev state changed, pool_guid=15378250086669402288 vdev_guid=1933292688604201684
Nov 5 16:21:29 knew ZFS: vdev state changed, pool_guid=15378250086669402288 vdev_guid=1933292688604201684
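
The CDB bytes in the kernel messages above can be decoded to see which blocks the failed writes were aimed at. The sketch below follows the standard WRITE(10) and WRITE(16) layouts (opcode, flags, big-endian LBA, big-endian transfer length); the function name is hypothetical, and the 512-byte block size in the example is inferred from the "length" values in the log, not read from the drive.

```python
def decode_write_cdb(cdb_hex):
    """Return (lba, blocks) from a WRITE(10) or WRITE(16) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex)  # fromhex skips the spaces between bytes
    opcode = cdb[0]
    if opcode == 0x2A and len(cdb) == 10:
        # WRITE(10): bytes 2-5 hold the LBA, bytes 7-8 the transfer length.
        lba = int.from_bytes(cdb[2:6], "big")
        blocks = int.from_bytes(cdb[7:9], "big")
    elif opcode == 0x8A and len(cdb) == 16:
        # WRITE(16): bytes 2-9 hold the LBA, bytes 10-13 the transfer length.
        lba = int.from_bytes(cdb[2:10], "big")
        blocks = int.from_bytes(cdb[10:14], "big")
    else:
        raise ValueError("not a WRITE(10) or WRITE(16) CDB")
    return lba, blocks


# Two CDBs copied from the 16:10:54 messages for da15.
for cdb in (
    "2a 00 07 f9 11 d0 00 00 60 00",
    "8a 00 00 00 00 01 a3 93 11 30 00 00 00 08 00 00",
):
    lba, blocks = decode_write_cdb(cdb)
    # Assumption: 512-byte logical blocks, inferred from the logged lengths.
    print(f"lba={lba:#x} blocks={blocks} bytes={blocks * 512}")
```

With 512-byte blocks the decoded transfer lengths agree with the logged "length 49152" and "length 4096" values, so the CDBs and lengths in the log are at least self-consistent.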
bash: [dan@knew:~]: command not found
[dan@knew:~] $ zpool status system
  pool: system
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Nov  5 16:00:57 2018
	560G scanned out of 45.2T at 360M/s, 36h9m to go
	26.0G resilvered, 1.21% done
config:

	NAME                          STATE     READ WRITE CKSUM
	system                        DEGRADED     0     0     0
	  raidz2-0                    ONLINE       0     0     0
	    gpt/X643KHBFF57D.r5.c4    ONLINE       0     0     0
	    gpt/4728K24SF57D.r3.c2    ONLINE       0     0     0
	    gpt/37KVK1JRF57D.r2.c1    ONLINE       0     0     0
	    gpt/37D4KBJPF57D.r5.c3    ONLINE       0     0     0
	    gpt/5782KL6VF57D.r2.c2    ONLINE       0     0     0
	    gpt/6525K2DGFS9A.r2.c4    ONLINE       0     0     0
	    gpt/579HKDZYF57D.r3.c3    ONLINE       0     0     0
	    gpt/579HKDZXF57D.r2.c3    ONLINE       0     0     0
	    gpt/5782KL6MF57D.r3.c1    ONLINE       0     0     0
	    gpt/X6IEKELNF57D.r4.c4    ONLINE       0     0     0
	  raidz2-1                    DEGRADED     0     0     0
	    gpt/653BK12JFS9A.r4.c2    ONLINE       0     0     0
	    gpt/579IK5RMF57D.r4.c3    ONLINE       0     0     0
	    gpt/653EK93PFS9A.r1.c4    ONLINE       0     0     0
	    gpt/653DK7WPFS9A.r3.c4    ONLINE       0     0     0
	    gpt/653DK7WCFS9A.r4.c1    ONLINE       0     0     0
	    gpt/653EK93QFS9A.r5.c2    ONLINE       0     0     0
	    gpt/653AK2MXFS9A.r1.c2    ONLINE       0     0     0
	    gpt/6539K3OJFS9A.r1.c1    ONLINE       0     0     0
	    gpt/653IK1IBFS9A.r5.c1    ONLINE       0     0     0
	    replacing-9               DEGRADED     0     0     0
	      gpt/653BK12FFS9A.r1.c3  ONLINE       0     0     0
	      gpt/57NGK1ZGF57D.r1.c3  FAULTED      6   111     0  too many errors

errors: No known data errors
[dan@knew:~] $
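
With 20-odd members in the pool it is easy to miss the one row that changed. A small filter over the `zpool status` text can pull out everything that is not ONLINE. This is a sketch only: the function name is invented, and it assumes the error counters are plain integers as they are in the output above (rows whose counters are abbreviated, such as "1.2K", would be skipped).

```python
def unhealthy_devices(status_text):
    """Return (name, state, read, write, cksum) for config rows that are not ONLINE."""
    rows = []
    for line in status_text.splitlines():
        fields = line.split()
        # A config row looks like: NAME STATE READ WRITE CKSUM [note ...]
        if len(fields) >= 5 and all(f.isdigit() for f in fields[2:5]):
            name, state = fields[0], fields[1]
            if state != "ONLINE":
                counts = tuple(int(f) for f in fields[2:5])
                rows.append((name, state) + counts)
    return rows


# Rows copied from the DEGRADED status output in this gist.
sample = """
raidz2-1 DEGRADED 0 0 0
gpt/653IK1IBFS9A.r5.c1 ONLINE 0 0 0
replacing-9 DEGRADED 0 0 0
gpt/653BK12FFS9A.r1.c3 ONLINE 0 0 0
gpt/57NGK1ZGF57D.r1.c3 FAULTED 6 111 0 too many errors
"""
for row in unhealthy_devices(sample):
    print(row)
```

Splitting on whitespace means the filter works whether or not the indentation of the config table survived copy and paste.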
@dlangille
Author

All good now.

[dan@knew:~] $ zpool status system
  pool: system
 state: ONLINE
  scan: resilvered 630G in 6h59m with 0 errors on Mon Nov  5 23:00:14 2018
config:

	NAME                          STATE     READ WRITE CKSUM
	system                        ONLINE       0     0     0
	  raidz2-0                    ONLINE       0     0     0
	    gpt/X643KHBFF57D.r5.c4    ONLINE       0     0     0
	    gpt/4728K24SF57D.r3.c2    ONLINE       0     0     0
	    gpt/37KVK1JRF57D.r2.c1    ONLINE       0     0     0
	    gpt/37D4KBJPF57D.r5.c3    ONLINE       0     0     0
	    gpt/5782KL6VF57D.r2.c2    ONLINE       0     0     0
	    gpt/6525K2DGFS9A.r2.c4    ONLINE       0     0     0
	    gpt/579HKDZYF57D.r3.c3    ONLINE       0     0     0
	    gpt/579HKDZXF57D.r2.c3    ONLINE       0     0     0
	    gpt/5782KL6MF57D.r3.c1    ONLINE       0     0     0
	    gpt/X6IEKELNF57D.r4.c4    ONLINE       0     0     0
	  raidz2-1                    ONLINE       0     0     0
	    gpt/653BK12JFS9A.r4.c2    ONLINE       0     0     0
	    gpt/579IK5RMF57D.r4.c3    ONLINE       0     0     0
	    gpt/653EK93PFS9A.r1.c4    ONLINE       0     0     0
	    gpt/653DK7WPFS9A.r3.c4    ONLINE       0     0     0
	    gpt/653DK7WCFS9A.r4.c1    ONLINE       0     0     0
	    gpt/653EK93QFS9A.r5.c2    ONLINE       0     0     0
	    gpt/653AK2MXFS9A.r1.c2    ONLINE       0     0     0
	    gpt/6539K3OJFS9A.r1.c1    ONLINE       0     0     0
	    gpt/653IK1IBFS9A.r5.c1    ONLINE       0     0     0
	    replacing-9               ONLINE       0     0     2
	      da15p1                  ONLINE       0     0     0
	      gpt/57NGK1ZGF57D.r1.c3  ONLINE       0     0     0

errors: No known data errors
[dan@knew:~] $ 

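But why:

  • did the device name change after the reboot (the first member of replacing-9 is now listed as da15p1 instead of gpt/653BK12FFS9A.r1.c3)?
  • wasn't the original device removed from replacing-9 once the resilver finished?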
