Monday, September 21, 2015

How to Replace a Drive in Solaris[TM] ZFS




Disk replacement in Solaris ZFS falls into three cases:

Case 1 - The LUN went offline, then came back online.
Case 2 - Replace a disk with one that has the same target number.
Case 3 - Replace a disk with one that has a different target number.

The steps below illustrate how to proceed in each case.

SOLUTION

Case 1.  Drive went offline, then back online. No hardware problem. (Common problem with SAN LUNs.)

There are two methods to try:
Method 1. Bring the drive online once the LUN has come back.
 While the drive was offline, the disk listing showed it as unavailable and the zpool became degraded.
        18. c3t54d0 <drive not available>
             /pci@1f,0/pci@1/pci@3/SUNW,qlc@5/fp@0,0/ssd@w22000020370705f1,0


# zpool status -v viper
  pool: viper
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
 scrub: resilver completed with 0 errors on Tue Dec 26 13:05:45 2006
config:
        NAME STATE READ WRITE CKSUM
        viper DEGRADED 0 0 0
          mirror DEGRADED 0 0 0
            c3t53d0s1 ONLINE 0 0 0
            c3t54d0s1 UNAVAIL 0 62 0 cannot open
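
On a pool with many devices it can help to pull just the unhealthy rows out of the status output. The following is a minimal sketch, not part of the original procedure: the function name unhealthy_devices is hypothetical, and it assumes the config table layout shown above (a NAME STATE READ WRITE CKSUM header, ended by a blank line or the errors: line) and a POSIX-style awk (nawk on older Solaris releases).

```shell
# Hypothetical helper: list rows of the config table whose STATE is not ONLINE.
# Reads `zpool status` text on stdin; assumes the table layout shown in this post.
unhealthy_devices() {
    awk '
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }  # header row opens the table
        NF == 0 || $1 == "errors:"    { in_config = 0 }        # blank or errors: line closes it
        in_config && $2 != "ONLINE"   { print $1, $2 }         # name and state of each bad row
    '
}
```

It would be used as: zpool status viper | unhealthy_devices. For the output above it prints the pool, the mirror, and c3t54d0s1, since all three rows are in a non-ONLINE state.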


The drive is back online.
       18. c3t54d0 <SEAGATE-ST19171FCSUN9.0G-7F78-8.43GB>
           /pci@1f,0/pci@1/pci@3/SUNW,qlc@5/fp@0,0/ssd@w22000020370705f1,0

Bring the device c3t54d0s1 online
# zpool online viper c3t54d0s1
# zpool status -v viper
  pool: viper
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
 scrub: resilver completed with 0 errors on Tue Dec 26 14:11:42 2006
config:
        NAME STATE READ WRITE CKSUM
        viper ONLINE 0 0 0
          mirror ONLINE 0 0 0
            c3t53d0s1 ONLINE 0 0 0
            c3t54d0s1 ONLINE 0 62 0

errors: No known data errors

To confirm there is no corruption, request a scrub, which re-reads the data in the pool and verifies its checksums.
# zpool scrub viper
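
Note that the WRITE counter for c3t54d0s1 still reads 62 after the device is back online; the status text itself suggests 'zpool clear' once the device is judged healthy. As a rough sketch (the function name devices_with_errors is hypothetical, and the column order READ WRITE CKSUM is assumed to match the output above), the non-zero counters can be picked out like this:

```shell
# Hypothetical helper: print devices whose READ, WRITE or CKSUM counter is non-zero.
# Reads `zpool status` text on stdin; assumes columns NAME STATE READ WRITE CKSUM.
devices_with_errors() {
    awk '
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }  # header row opens the table
        NF == 0 || $1 == "errors:"    { in_config = 0 }        # blank or errors: line closes it
        in_config && NF >= 5 && ($3 + $4 + $5) > 0 {
            print $1, "read=" $3, "write=" $4, "cksum=" $5
        }
    '
}
```

It would be used as: zpool status viper | devices_with_errors. If the scrub finishes cleanly and the old counters are only left over from the outage, they can be reset with: zpool clear viper c3t54d0s1.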

Method 2. Export, then import the pool. The pool's ZFS file systems are unmounted while it is exported, typically for a minute or two.
 While the drive was offline, the disk listing showed it as unavailable and the zpool became degraded.
        18. c3t54d0 <drive not available>
             /pci@1f,0/pci@1/pci@3/SUNW,qlc@5/fp@0,0/ssd@w22000020370705f1,0

# zpool status -v viper
  pool: viper
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
 scrub: resilver completed with 0 errors on Tue Dec 26 13:05:45 2006
config:
        NAME STATE READ WRITE CKSUM
        viper DEGRADED 0 0 0
          mirror DEGRADED 0 0 0
            c3t53d0s1 ONLINE 0 0 0
            c3t54d0s1 UNAVAIL 0 62 0 cannot open


The drive is back online.
       18. c3t54d0 <SEAGATE-ST19171FCSUN9.0G-7F78-8.43GB>
           /pci@1f,0/pci@1/pci@3/SUNW,qlc@5/fp@0,0/ssd@w22000020370705f1,0


Once the drive is back online, export and import the pool.
# zpool export viper
# zpool import -f viper
# zpool status viper
  pool: viper
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
 scrub: none requested
config:
        NAME STATE READ WRITE CKSUM
        viper ONLINE 0 0 0
          mirror ONLINE 0 0 0
            c3t54d0 ONLINE 0 0 2
            c3t53d0 ONLINE 0 0 0


errors: No known data errors


To confirm there is no corruption, request a scrub, which re-reads the data in the pool and verifies its checksums.
# zpool scrub viper

Case 2. The drive has really failed, and the replacement drive has the same target number as the old drive.

Remove the old drive and insert the new disk. This example was done on a Photon array.
# luxadm remove_device -F A,r6
WARNING!!! Please ensure that no filesystems are mounted on these device(s).
All data on these devices should have been backed up.
The list of devices which will be removed is:
1: Box Name: "A" rear slot 6
Node WWN: 20000020370705f1
Device Type:Disk device
Device Paths:
/dev/rdsk/c3t54d0s2
Please verify the above list of devices and
then enter 'c' or <CR> to Continue or 'q' to Quit. [Default: c]:

The new drive appears at the same target with a new WWN:

       18. c3t54d0 <SEAGATE-ST19171FCSUN9.0G-7F7E-8.43GB>
           /pci@1f,0/pci@1/pci@3/SUNW,qlc@5/fp@0,0/ssd@w22000020372d508c,0

# zpool status
  pool: viper
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
 scrub: none requested
config:
        NAME STATE READ WRITE CKSUM
        viper DEGRADED 0 0 0
          mirror DEGRADED 0 0 0
            c3t54d0 UNAVAIL 0 62 0 cannot open
            c3t53d0 ONLINE 0 0 0

errors: No known data errors


Replace the disk using the zpool replace command. Because the new disk sits at the same target, only the original disk name is given.
# zpool replace viper c3t54d0
# zpool status viper
  pool: viper
 state: ONLINE
 scrub: resilver completed with 0 errors on Tue Dec 26 09:57:09 2006
config:
        NAME STATE READ WRITE CKSUM
        viper ONLINE 0 0 0
          mirror ONLINE 0 0 0
            c3t54d0 ONLINE 0 0 0
            c3t53d0 ONLINE 0 0 0
errors: No known data errors


Case 3. Replace the bad drive with a new drive that has a different target number.

# zpool status viper
  pool: viper
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
 scrub: none requested
config:
        NAME STATE READ WRITE CKSUM
        viper DEGRADED 0 0 0
          mirror DEGRADED 0 0 0
            c3t54d0 ONLINE 0 0 0
            c3t53d0 OFFLINE 0 0 0


Replace the disk using the zpool replace command. Both the old and the new disk names must be given.
# zpool replace viper c3t53d0 c3t32d0
# zpool status viper
  pool: viper
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 4.16% done, 0h4m to go
config:
        NAME STATE READ WRITE CKSUM
        viper DEGRADED 0 0 0
          mirror DEGRADED 0 0 0
            c3t54d0 ONLINE 0 0 0
            replacing DEGRADED 0 0 0
              c3t53d0 OFFLINE 0 0 0
              c3t32d0 ONLINE 0 0 0


errors: No known data errors
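
The resilver takes a while, so it can be convenient to poll for progress rather than re-running zpool status by hand. The sketch below is an illustration only: resilver_percent and wait_for_resilver are hypothetical names, the default 60-second interval is arbitrary, and the parsing assumes the "resilver in progress, N% done" wording shown in the output above (other ZFS releases word this line differently).

```shell
# Hypothetical helper: print the resilver percentage from `zpool status` text on stdin.
# Prints nothing when no "resilver in progress" line is present.
resilver_percent() {
    sed -n 's/.*resilver in progress[^0-9]*\([0-9.][0-9.]*\)% done.*/\1/p'
}

# Hypothetical wrapper: poll a pool until the in-progress line disappears.
# Usage: wait_for_resilver POOL [SECONDS]
wait_for_resilver() {
    pool=$1
    interval=${2:-60}
    while pct=$(zpool status "$pool" | resilver_percent) && [ -n "$pct" ]; do
        echo "resilver on $pool: ${pct}% done"
        sleep "$interval"
    done
    # The loop only detects that the in-progress line is gone; check the result afterwards.
    echo "no resilver in progress on $pool; check 'zpool status $pool' for the result"
}
```

For example, wait_for_resilver viper 30 would print a progress line every 30 seconds and return once the status no longer reports a resilver in progress.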


Once the resilver onto the new drive completes, the old drive is dropped from the zpool configuration. ZFS does not need to talk to the bad drive in order to remove it from the configuration.

# zpool status viper
  pool: viper
 state: ONLINE
 scrub: resilver completed with 0 errors on Tue Dec 26 11:07:14 2006
config:
        NAME STATE READ WRITE CKSUM
        viper ONLINE 0 0 0
          mirror ONLINE 0 0 0
            c3t54d0 ONLINE 0 0 0
            c3t32d0 ONLINE 0 0 0


errors: No known data errors


This procedure can also be applied to RAID-Z.
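
The only difference between Case 2 and Case 3 is whether zpool replace is given one device name or two. The sketch below captures that choice; replace_cmd is a hypothetical helper that only prints the command so it can be reviewed before anything is run.

```shell
# Hypothetical helper: print (do not run) the zpool replace command for Case 2 or Case 3.
# Usage: replace_cmd POOL OLD_DEVICE [NEW_DEVICE]
replace_cmd() {
    if [ "$#" -lt 2 ]; then
        echo "usage: replace_cmd POOL OLD_DEVICE [NEW_DEVICE]" >&2
        return 2
    fi
    pool=$1
    old=$2
    new=${3:-}
    if [ -z "$new" ] || [ "$new" = "$old" ]; then
        # Case 2: new disk at the same target, so only the original name is given.
        echo "zpool replace $pool $old"
    else
        # Case 3: new disk at a different target, so both names are given.
        echo "zpool replace $pool $old $new"
    fi
}
```

For the examples in this post, replace_cmd viper c3t54d0 prints the Case 2 command and replace_cmd viper c3t53d0 c3t32d0 prints the Case 3 command.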
