Tuesday, December 3, 2013

ZFS Troubleshooting Guide

Resolving Hardware Problems

Diagnosing Potential Problems

General commands for diagnosing hardware problems are:

 zpool status

 zpool status -v

 fmdump

 fmdump -ev or fmdump -eV

 format or rmformat

Identify hardware problems with the zpool status commands. If a pool is in the DEGRADED state, use the zpool status command to determine whether a disk is unavailable. For example:

  # zpool status -x

  pool: zeepool

 state: DEGRADED

status: One or more devices could not be opened.  Sufficient replicas exist for

        the pool to continue functioning in a degraded state.

action: Attach the missing device and online it using 'zpool online'.

   see: http://www.sun.com/msg/ZFS-8000-D3

 scrub: resilver completed after 0h12m with 0 errors on Thu Aug 28 09:29:43 2008

config:

        NAME                 STATE     READ WRITE CKSUM

        zeepool              DEGRADED     0     0     0

          mirror             DEGRADED     0     0     0

            c1t2d0           ONLINE       0     0     0

            spare            DEGRADED     0     0     0

              c2t1d0         UNAVAIL      0     0     0  cannot open

              c2t3d0         ONLINE       0     0     0

        spares

          c1t3d0             AVAIL 

          c2t3d0             INUSE     currently in use

errors: No known data errors

See the disk replacement example to recover from a failed disk.

Identify potential data corruption with the zpool status -v command. If only one file is corrupted, then you might choose to deal with it directly, without needing to restore the entire pool.

# zpool status -v rpool

  pool: rpool

 state: DEGRADED

status: One or more devices has experienced an error resulting in data

        corruption. Applications may be affected.

action: Restore the file in question if possible. Otherwise restore the

        entire pool from backup.

see: http://www.sun.com/msg/ZFS-8000-8A

scrub: scrub completed after 0h2m with 1 errors on Tue Mar 11 13:12:42 2008

config:

       NAME        STATE     READ WRITE CKSUM

       rpool    DEGRADED     0     0     9

         c2t0d0s0  DEGRADED     0     0     9

errors: Permanent errors have been detected in the following files:

           /mnt/root/lib/amd64/libc.so.1
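
If only a single file is affected, as in the example above, one hedged recovery path is to restore that file from a backup copy (the backup path below is hypothetical), clear the pool's error counters, and then scrub to verify:

# cp /backup/lib/amd64/libc.so.1 /mnt/root/lib/amd64/libc.so.1

# zpool clear rpool

# zpool scrub rpool

# zpool status -v rpool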

*  Display the list of suspected faulty devices with the fmdump command. It is also useful to know which diagnosis engines are available on your system and how busy they have been, which you can obtain with the fmstat command. Similarly, fmadm config shows the status of the diagnosis engines. In the output below, four diagnosis engines are relevant to devices and ZFS: disk-transport, io-retire, zfs-diagnosis, and zfs-retire. Check your OS release for the available FMA diagnosis engine capability.

# fmdump

TIME                 UUID                                 SUNW-MSG-ID

Aug 18 18:32:48.1940 940422d6-03fb-4ea0-b012-aec91b8dafd3 ZFS-8000-D3

Aug 21 06:46:18.5264 692476c6-a4fa-4f24-e6ba-8edf6f10702b ZFS-8000-D3

Aug 21 06:46:18.7312 45848a75-eae5-66fe-a8ba-f8b8f81deae7 ZFS-8000-D3

# fmstat

module             ev_recv ev_acpt wait  svc_t  %w  %b  open solve  memsz  bufsz

cpumem-retire            0       0  0.0    0.0   0   0     0     0      0      0

disk-transport           0       0  0.0   55.9   0   0     0     0    32b      0

eft                      0       0  0.0    0.0   0   0     0     0   1.2M      0

fabric-xlate             0       0  0.0    0.0   0   0     0     0      0      0

fmd-self-diagnosis       0       0  0.0    0.0   0   0     0     0      0      0

io-retire                0       0  0.0    0.0   0   0     0     0      0      0

snmp-trapgen             0       0  0.0    0.0   0   0     0     0    32b      0

sysevent-transport       0       0  0.0 4501.8   0   0     0     0      0      0

syslog-msgs              0       0  0.0    0.0   0   0     0     0      0      0

zfs-diagnosis            0       0  0.0    0.0   0   0     0     0      0      0

zfs-retire               0       0  0.0    0.0   0   0     0     0      0      0

# fmadm config

MODULE                   VERSION STATUS  DESCRIPTION

cpumem-retire            1.1     active  CPU/Memory Retire Agent

disk-transport           1.0     active  Disk Transport Agent

eft                      1.16    active  eft diagnosis engine

fabric-xlate             1.0     active  Fabric Ereport Translater

fmd-self-diagnosis       1.0     active  Fault Manager Self-Diagnosis

io-retire                2.0     active  I/O Retire Agent

snmp-trapgen             1.0     active  SNMP Trap Generation Agent

sysevent-transport       1.0     active  SysEvent Transport Agent

syslog-msgs              1.0     active  Syslog Messaging Agent

zfs-diagnosis            1.0     active  ZFS Diagnosis Engine

zfs-retire               1.0     active  ZFS Retire Agent

*  Display more details about potential hardware problems by examining the error reports with fmdump -ev. Display even more details with fmdump -eV.

# fmdump -eV

TIME                           CLASS

Aug 18 2008 18:32:35.186159293 ereport.fs.zfs.vdev.open_failed

nvlist version: 0

        class = ereport.fs.zfs.vdev.open_failed

        ena = 0xd3229ac5100401

        detector = (embedded nvlist)

        nvlist version: 0

                version = 0x0

                scheme = zfs

                pool = 0x4540c565343f39c2

                vdev = 0xcba57455fe08750b

        (end detector)

        pool = whoo

        pool_guid = 0x4540c565343f39c2

        pool_context = 1

        pool_failmode = wait

        vdev_guid = 0xcba57455fe08750b

        vdev_type = disk

        vdev_path = /dev/ramdisk/rdx

        parent_guid = 0x4540c565343f39c2

        parent_type = root

        prev_state = 0x1

        __ttl = 0x1

        __tod = 0x48aa22b3 0xb1890bd

*  If expected devices can't be displayed with the format or rmformat utility, then those devices won't be visible to ZFS.
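
As a quick, hedged check of which disks the operating system can see (output varies by system; rmformat covers removable and USB-attached devices):

# echo | format

# rmformat -l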

Disk Replacement Example

*  To be supplied.

Problems after Disk Replacement

*  If the replaced disk is not visible in the zpool status output, make sure all cables are reconnected properly.
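
If the cabling looks correct, it can also help to have Solaris rebuild its device tree and recheck the controller before looking at the pool again; a hedged sketch using standard Solaris utilities:

# devfsadm

# cfgadm -al

# zpool status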

Solving Mirrored (Root) Pool Problems (zpool attach)

*  If you cannot attach a disk to create a mirrored root or non-root pool with the zpool attach command, you might see messages similar to the following:

# zpool attach rpool c1t1d0s0 c1t0d0s0

cannot attach c1t0d0s0 to c1t1d0s0: new device must be a single disk

*  If this problem occurs while the system is booted under a virtualization product, such as Sun xVM, make sure the devices are accessible to ZFS outside of the virtualization product.

*  Then, solve the device configuration problems within the virtualization product.

Panic/Reboot/Pool Import Problems

During the boot process, each pool must be opened, which means that pool failures might cause a system to enter into a panic-reboot loop. In order to recover from this situation, ZFS must be informed not to look for any pools on startup.

Boot From Milestone=None Recovery Method

*  Boot to the none milestone by using the -m milestone=none boot option.

ok boot -m milestone=none

*  Remount your root file system as writable.

*  Rename or move the /etc/zfs/zpool.cache file to another location, as sketched after this procedure.

   These actions cause ZFS to forget that any pools exist on the system, preventing it from trying to access the bad pool that is causing the problem. If you have multiple pools on the system, do these additional steps:

   * Determine which pool might have issues by using the fmdump -eV command to display the pools with reported fatal errors.

   * Import the pools one-by-one, skipping the pools that are having issues, as described in the fmdump output.

*  Once the system is back up, issue the svcadm milestone all command.
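
A minimal sketch of the remount and rename steps above, assuming a UFS root file system (the exact remount options depend on your root file system type), followed by restoring the normal milestone once the problem pool has been dealt with:

# mount -o remount,rw /

# mv /etc/zfs/zpool.cache /etc/zfs/zpool.cache.bad

# svcadm milestone all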

Boot From OpenSolaris Live CD Recovery Method


If you are running a Solaris SXCE or Solaris 10 release, you might be able to boot from the OpenSolaris Live CD and fix whatever is causing the pool import to fail.

*  Boot from the OpenSolaris Live CD

*  Import the pool

*  Resolve the issue that causes the pool import to fail, such as replacing a failed disk

*  Export the pool (?)

*  Boot from the original Solaris release

*  Import the pool
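
A minimal sketch of these steps, assuming the affected pool is named tank (substitute your pool name); the -f flag may be needed if the pool was not cleanly exported:

# zpool import -f tank

# zpool status -v tank

< resolve the underlying issue, for example by replacing a failed disk >

# zpool export tank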

Capacity Problems with Storage Array LUNs

*  If you resize a LUN from a storage array and the zpool status command doesn't display the LUN's expected capacity, export and import the pool to see expected capacity. This is CR xxxxxxx.

*  If zpool status doesn't display the array's LUN expected capacity, confirm that the expected capacity is visible from the format utility. For example, the format output below shows that one LUN is configured as 931.01 Gbytes and one is configured as 931.01 Mbytes.

          2. c6t600A0B800049F93C0000030A48B3EA2Cd0 <SUN-LCSM100_F-0670-931.01GB>

             /scsi_vhci/ssd@g600a0b800049f93c0000030a48b3ea2c

          3. c6t600A0B800049F93C0000030D48B3EAB6d0 <SUN-LCSM100_F-0670-931.01MB>

             /scsi_vhci/ssd@g600a0b800049f93c0000030d48b3eab6

*  You will need to reconfigure the array's LUN capacity with the array sizing tool to correct this sizing problem.

*  When the LUN sizes are resolved, export and import the pool if the pool has already been created with these LUNs.
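
For reference, a hedged sketch of the export/import step, assuming the pool built on the resized LUNs is named dpool (a hypothetical name); note that the pool's data is unavailable while it is exported:

# zpool export dpool

# zpool import dpool

# zpool list dpool

The zpool list output should then reflect the expanded capacity.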

Resolving Software Problems

Unsupported CIFS properties in Solaris 10 10/08 Release

*  The Solaris 10 10/08 release includes modifications to support the Solaris CIFS environment as described in zfs(1M). However, the CIFS features are not supported in the Solaris 10 release. Therefore, these properties are set to read-only values. If you attempt to set the CIFS-related properties, you will see a message similar to the following:

# zfs set utf8only=on rpool

cannot set property for 'rpool': 'utf8only' is readonly

Unsupported Cache Devices in Solaris 10 10/08 Release

*  In the Solaris 10 10/08 release, the zpool upgrade -v command shows that cache device support (pool version 10) is available. For example:

# zpool upgrade -v

This system is currently running ZFS pool version 10.

The following versions are supported:

VER  DESCRIPTION

---  --------------------------------------------------------

 1   Initial ZFS version

 2   Ditto blocks (replicated metadata)

 3   Hot spares and double parity RAID-Z

 4   zpool history

 5   Compression using the gzip algorithm

 6   bootfs pool property

 7   Separate intent log devices

 8   Delegated administration

 9   refquota and refreservation properties

 10  Cache devices

*  However, cache devices are not supported in this release.

*  If you attempt to add a cache device to a ZFS storage pool when the pool is created, the following message is displayed:

# zpool create pool mirror c0t1d0 c0t2d0 cache c0t3d0

cannot create 'pool': operation not supported on this type of pool

*  If you attempt to add a cache device to a ZFS storage pool after the pool is created, the following message is displayed:

# zpool create pool mirror c0t1d0 c0t2d0

# zpool add pool cache c0t3d0

cannot add to 'pool': pool must be upgraded to add these vdevs

ZFS Installation Issues

Review Solaris 10 10/08 Installation Requirements

*  768 Mbytes is the minimum amount of memory required to install a ZFS root file system

*  1 Gbyte of memory is recommended for better overall ZFS performance

*  At least 16 Gbytes of disk space is recommended

Before You Start

*  Due to an existing boot limitation, disks intended for a bootable ZFS root pool must be created with disk slices and must be labeled with a VTOC (SMI) disk label.

*  If you relabel EFI-labeled disks with VTOC labels, be sure that the desired disk space for the root pool is in the disk slice that will be used to create the bootable ZFS pool.

Solaris/ZFS Initial Installation

*  For the OpenSolaris 2008.05 release, a ZFS root file system is installed by default and there is no option to choose another type of root file system.

*  For the SXCE and Solaris 10 10/08 releases, you can only install a ZFS root file system from the text installer.

*  You cannot use a Flash install or the standard upgrade option to install or migrate to a ZFS root file system. Stay tuned, more work is in progress on improving installation.

*  For the SXCE and Solaris 10 10/08 releases, you can use LiveUpgrade to migrate a UFS root file system to a ZFS root file system.

*  Access the text installer as follows:

 * On a SPARC based system, use the following syntax from the Solaris installation DVD or the network:

 ok boot cdrom - text

 ok boot net - text

 * On an x86 based system, select the text-mode install option when presented.

*  If you accidentally start the GUI install method, do the following:

 * Exit the GUI installer

 * Expand the terminal window to 80 x 24

 * Unset the DISPLAY, like this:

   # DISPLAY=

   # export DISPLAY

   # install-solaris

ZFS Root Pool Recommendations and Requirements

*  During an initial installation, select two disks to create a mirrored root pool.

*  Alternatively, you can attach a disk to create a mirrored root pool after installation, as sketched at the end of this list. See the ZFS Administration Guide for details.

*  Disks in the root pool must have a Solaris VTOC (SMI) label and must be specified as slices, not whole disks. EFI-labeled disks do not work; several factors are at work here, including BIOS support for booting from EFI-labeled disks.

*  Note: If you mirror the boot disk later, make sure you specify a bootable slice and not the whole disk because the latter may try to install an EFI label.

*  You cannot use a RAID-Z configuration for a root pool. Only single-disk pools or pools with mirrored disks are supported. You will see the following message if you attempt to use an unsupported pool for the root pool:

ERROR: ZFS pool <pool-name> does not support boot environments

*  Root pools cannot have a separate log device. For example:

# zpool add -f rpool log c0t6d0s0

cannot add to 'rpool': root pool can not have multiple vdevs or separate logs

*  Setting compression=lzjb is supported on root pools, but the other compression types are not supported.

*  Keep a second ZFS BE for recovery purposes. You can boot from the alternate BE if the primary BE fails. For example:

# lucreate -n ZFS2BE

*  Keep root pool snapshots on a remote system. See the steps below for details.
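
A brief, hedged sketch of attaching a second disk to an existing root pool (hypothetical device names; the new disk needs a VTOC label and a slice, and boot blocks must be installed on it as described in the ZFS Boot Issues section below):

# zpool attach rpool c1t0d0s0 c1t1d0s0

# zpool status rpool

Wait for the resilver of the new disk to complete before relying on it for booting.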

Solaris Live Upgrade Migration Scenarios

*  You can use the Solaris Live Upgrade feature to migrate a UFS root file system to a ZFS root file system.

*  You can't use Solaris Live Upgrade to migrate a ZFS boot environment (BE) to UFS BE.

*  You can't use Solaris Live Upgrade to migrate non-root or shared UFS file systems to ZFS file systems.

Review LU Requirements

*  You must be running the SXCE, build 90 release or the Solaris 10 10/08 release to use LU to migrate a UFS root file system to a ZFS root file system.

*  You must create a ZFS storage pool that contains disk slices before the LU migration.

*  The pool must exist either on a disk slice or on disk slices that are mirrored, but not on a RAID-Z configuration or on a nonredundant configuration of multiple disks. If you attempt to use an unsupported pool configuration during a Live Upgrade migration, you will see a message similar to the following:

ERROR: ZFS <pool-name> does not support boot environments

*  If you see this message, then either the pool doesn't exist or it's an unsupported configuration.
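
Putting these requirements together, a minimal hedged sketch of a UFS-to-ZFS root migration, assuming a spare slice c1t0d0s0 for the new pool and BE names of your own choosing:

# zpool create rpool c1t0d0s0

# lucreate -c ufsBE -n zfsBE -p rpool

# luactivate zfsBE

# init 6

The -c option names the current UFS boot environment, and -p identifies the root pool that will hold the new ZFS BE.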

Live Upgrade Issues

*  The Solaris installation GUI's standard-upgrade option is not available for migrating from a UFS to a ZFS root file system. To migrate from a UFS file system, you must use Solaris Live Upgrade.

*  You cannot use Solaris Live Upgrade to create a UFS BE from a ZFS BE.

*  Do not rename your ZFS BEs with the zfs rename command because the Solaris Live Upgrade feature is unaware of the name change. Subsequent commands, such as ludelete, will fail. In fact, do not rename your ZFS pools or file systems if you have existing BEs that you want to continue to use.

*  Solaris Live Upgrade creates the datasets for the BE and ZFS volumes for the swap area and dump device but does not account for any existing dataset property modifications. Thus, if you want a dataset property enabled in the new BE, you must set the property before the lucreate operation. For example:

 zfs set compression=on rpool/ROOT

*  When creating an alternative BE that is a clone of the primary BE, you cannot use the -f, -x, -y, -Y, and -z options to include or exclude files from the primary BE. You can still use the inclusion and exclusion option set in the following cases:

UFS -> UFS

UFS -> ZFS

ZFS -> ZFS (different pool)

*  Although you can use Solaris Live Upgrade to upgrade your UFS root file system to a ZFS root file system, you cannot use Solaris Live Upgrade to upgrade non-root or shared file systems.

Live Upgrade with Zones

Review the following supported ZFS and zones configurations. These configurations are upgradeable and patchable.

Migrate a UFS Root File System with Zones Installed to a ZFS Root File System

This ZFS zone root configuration can be upgraded or patched.

1.   Upgrade the system to the Solaris 10 10/08 release if it is running a previous Solaris 10 release.

2.   Create the root pool.

# zpool create rpool mirror c1t0d0s0 c1t1d0s0

3.   Confirm that the zones from the UFS environment are booted.

4.   Create the new boot environment.

# lucreate -n S10BE2 -p rpool

5.   Activate the new boot environment.

6.   Reboot the system.

7.   Migrate the UFS zones to the ZFS BE.

1.  Boot the zones.

2.  Create another BE within the pool.

# lucreate -n S10BE3

3.  Activate the new boot environment.

# luactivate S10BE3

4.  Reboot the system.

# init 6

8.   Resolve any potential mount point problems, due to a Solaris Live Upgrade bug.

1.  Review the zfs list output and look for any temporary mount points. For example:

# zfs list -r -o name,mountpoint rpool/ROOT/s10u6

    NAME                               MOUNTPOINT
    rpool/ROOT/s10u6                   /.alt.tmp.b-VP.mnt/
    rpool/ROOT/s10u6/zones             /.alt.tmp.b-VP.mnt//zones
    rpool/ROOT/s10u6/zones/zonerootA   /.alt.tmp.b-VP.mnt/zones/zonerootA

    The mount point for the root ZFS BE (rpool/ROOT/s10u6) should be /.

2.  Reset the mount points for the ZFS BE and its datasets.

# zfs inherit -r mountpoint rpool/ROOT/s10u6

# zfs set mountpoint=/ rpool/ROOT/s10u6

3.  Reboot the system. When the option is presented to boot a specific boot environment, either in the GRUB menu or at the OpenBoot PROM prompt, select the boot environment whose mount points were just corrected.

Configure a ZFS Root File System With Zone Roots on ZFS

Set up a ZFS root file system and ZFS zone root configuration that can be upgraded or patched. In this configuration, the ZFS zone roots are created as ZFS datasets.

1.   Install the system with a ZFS root, either by using the interactive initial installation method or the Solaris JumpStart installation method.

2.   Boot the system from the newly-created root pool.

3.   Create a dataset for grouping the zone roots.

# zfs create -o canmount=noauto rpool/ROOT/S10be/zones

Setting the noauto value for the canmount property prevents the dataset from being mounted other than by the explicit action of Solaris Live Upgrade and system startup code.

4.   Mount the newly-created zones container dataset.

# zfs mount rpool/ROOT/S10be/zones

The dataset is mounted at /zones.

5.   Create and mount a dataset for each zone root.

# zfs create -o canmount=noauto rpool/ROOT/S10be/zones/zonerootA

# zfs mount rpool/ROOT/S10be/zones/zonerootA

6.   Set the appropriate permissions on the zone root directory.

# chmod 700 /zones/zonerootA

7.   Configure the zone, setting the zone path as follows:

# zonecfg -z zoneA
zoneA: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:zoneA> create
zonecfg:zoneA> set zonepath=/zones/zonerootA

8.   Install the zone.

# zoneadm -z zoneA install

9.   Boot the zone.

# zoneadm -z zoneA boot

Upgrade or Patch a ZFS Root File System With Zone Roots on ZFS

Upgrade or patch a ZFS root file system with zone roots on ZFS. These updates can either be a system upgrade or the application of patches.

1.   Create the boot environment to upgrade or patch.

# lucreate -n newBE

The existing boot environment, including all the zones, is cloned. New datasets are created for each dataset in the original boot environment. The new datasets are created in the same pool as the current root pool.

2.   Select one of the following to upgrade the system or apply patches to the new boot environment.

1.  Upgrade the system.

# luupgrade -u -n newBE -s /net/install/export/s10u7/latest

where the -s option specifies the location of a Solaris installation medium.

2.  Apply patches to the new boot environment.

# luupgrade -t -n newBE -s /patchdir 139147-02 157347-14

3.   Activate the new boot environment after the updates to the new boot environment are complete.

# luactivate newBE

4.   Boot from the newly activated boot environment.

# init 6

5.   Resolve any potential mount point problems, due to a Solaris Live Upgrade bug.

1.  Review the zfs list output and look for any temporary mount points. For example:

# zfs list -r -o name,mountpoint rpool/ROOT/newBE

    NAME                               MOUNTPOINT
    rpool/ROOT/newBE                   /.alt.tmp.b-VP.mnt/
    rpool/ROOT/newBE/zones             /.alt.tmp.b-VP.mnt//zones
    rpool/ROOT/newBE/zones/zonerootA   /.alt.tmp.b-VP.mnt/zones/zonerootA

    The mount point for the root ZFS BE (rpool/ROOT/newBE) should be /.

2.  Reset the mount points for the ZFS BE and its datasets.

# zfs inherit -r mountpoint rpool/ROOT/newBE

# zfs set mountpoint=/ rpool/ROOT/newBE

3.  Reboot the system. When the option is presented to boot a specific boot environment, either in the GRUB menu or at the OpenBoot PROM prompt, select the boot environment whose mount points were just corrected.

Recover from BE Removal Failure (ludelete)

§  If you use ludelete to remove an unwanted BE, it might fail with messages similar to the following:

$ ludelete -f c0t1d0s0

System has findroot enabled GRUB

Updating GRUB menu default setting

Changing GRUB menu default setting to <0>

ERROR: Failed to copy file </boot/grub/menu.lst> to top level dataset for BE <c0t1d0s0>

ERROR: Unable to delete GRUB menu entry for deleted boot environment <c0t1d0s0>.

Unable to delete boot environment.

§  You might be running into the following bugs: 6718038, 6715220, 6743529

§  The workaround is as follows:

1.  Edit /usr/lib/lu/lulib and, in line 2934, replace the following text:

lulib_copy_to_top_dataset "$BE_NAME" "$ldme_menu" "/${BOOT_MENU}"

 with this text:

lulib_copy_to_top_dataset `/usr/sbin/lucurr` "$ldme_menu" "/${BOOT_MENU}"

2.  Rerun the ludelete operation.

ZFS Boot Issues

§  Sometimes the boot process is slow. Be patient.

§  CR 6704717 - Do not offline the primary disk in a mirrored ZFS root configuration. If you do need to offline or detach a mirrored root disk for replacement, boot from another mirrored disk in the pool first.

§  CR 6668666 - If you attach a disk to create a mirrored root pool after an initial installation, you will need to apply the boot blocks to the secondary disks. For example:

  sparc# installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c0t1d0s0

  x86# installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t1d0s0

§  CR 6741743 - The boot -L command doesn't work if you migrated from a UFS root file system. Copy the bootlst command to the correct location. For example:

# cp -p /platform/`uname -m`/bootlst /rpool/platform/`uname -m`/bootlst

ZFS Boot Error Messages

§  CR 2164779 - Ignore the following krtld messages from the boot -Z command. They are harmless:

krtld: Ignoring invalid kernel option -Z.

krtld: Unused kernel arguments: `rpool/ROOT/zfs1008BE'.

Resolving ZFS Mount Point Problems That Prevent Successful Booting

The best way to change the active boot environment is to use the luactivate command. If booting the active environment fails, due to a bad patch or a configuration error, the only way to boot a different environment is to select that environment at boot time. You can select an alternate BE from the GRUB menu on an x86 based system or by booting it explicitly from the PROM on a SPARC based system.

Due to a bug in the Live Upgrade feature, the non-active boot environment might fail to boot because the ZFS datasets or the zone's ZFS dataset in the boot environment has an invalid mount point.

The same bug also prevents the BE from mounting if it has a separate /var dataset.

The mount points can be corrected by taking the following steps.

Resolve ZFS Mount Point Problems

1.  Boot the system from a failsafe archive.

2.  Import the pool.

# zpool import rpool

3.  Review the zfs list output after the pool is imported, looking for incorrect temporary mount points. For example:

# zfs list -r -o name,mountpoint rpool/ROOT/s10u6

    NAME                               MOUNTPOINT
    rpool/ROOT/s10u6                   /.alt.tmp.b-VP.mnt/
    rpool/ROOT/s10u6/zones             /.alt.tmp.b-VP.mnt//zones
    rpool/ROOT/s10u6/zones/zonerootA   /.alt.tmp.b-VP.mnt/zones/zonerootA

    The mount point for the root BE (rpool/ROOT/s10u6) should be /.

4.  Reset the mount points for the ZFS BE and its datasets.

# zfs inherit -r mountpoint rpool/ROOT/s10u6

# zfs set mountpoint=/ rpool/ROOT/s10u6

5.  Reboot the system. When the option is presented to boot a specific boot environment, either in the GRUB menu or at the OpenBoot PROM prompt, select the boot environment whose mount points were just corrected.

Boot From an Alternate Disk in a Mirrored ZFS Root Pool

You can boot from different devices in a mirrored ZFS root pool.

§  Identify the device pathnames for the alternate disks in the mirrored root pool by reviewing the zpool status output. In the example output below, the disks are c0t0d0s0 and c0t1d0s0.

# zpool status

  pool: rpool

 state: ONLINE

 scrub: resilver completed after 0h6m with 0 errors on Thu Sep 11 10:55:28 2008

config:

        NAME          STATE     READ WRITE CKSUM

        rpool         ONLINE       0     0     0

          mirror      ONLINE       0     0     0

            c0t0d0s0  ONLINE       0     0     0

            c0t1d0s0  ONLINE       0     0     0

errors: No known data errors

§  If you attached the second disk in the mirror configuration after an initial installation, apply the bootblocks. For example, on a SPARC system:

# installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c0t1d0s0

§  Depending on the hardware configuration, you might need to update the OpenBoot PROM configuration or the BIOS to specify a different boot device. For example, on a SPARC system:

ok setenv boot-device /pci@7c0/pci@0/pci@1/pci@0,2/LSILogic,sas@2/disk@1

ok boot

Activating a BE Fails (OpenSolaris Releases Prior to 101a)

§  If you attached a second disk to your ZFS root pool and that disk has an EFI label, activating the BE will fail with a message similar to the following:

# beadm activate opensolaris-2

Unable to activate opensolaris-2.

Unknown external error.

 ZFS root pool disks must contain a VTOC label. Starting in build 101a, you will be warned about adding a disk with an EFI label to the root pool.

  See the steps below to detach and relabel the disk with a VTOC label. These steps are also applicable to the Solaris Nevada (SXCE) and Solaris 10 releases.

1.  Detach the disk. For example:

# zpool detach rpool c9t0d0s0

2.  Relabel the disk.

# format -e c9t0d0s0
format> label
[0] SMI Label
[1] EFI Label
Specify Label type[1]: 0
Ready to label disk, continue? yes
format> quit

3.  Attach the disk. For example:

# zpool attach rpool c9t4d0s0 c9t0d0s0

It may take some time to resilver the second disk.

ZFS Root Pool Recovery

Complete Solaris 10 10/08 Root Pool Recovery

This section describes how to create and restore root pool snapshots.

Create Root Pool Snapshots

Create root pool snapshots for recovery purposes. For example:

1.  Create space on a remote system to store the snapshots.

remote# zfs create rpool/snaps

remote# zfs list

NAME          USED  AVAIL  REFER  MOUNTPOINT
rpool         108K  8.24G    19K  /rpool
rpool/snaps    18K  8.24G    18K  /rpool/snaps

2.  Share the space to the local system.

remote# zfs set sharenfs='rw=local-system,root=local-system' rpool/snaps

# share

-@rpool/snaps   /rpool/snaps   sec=sys,rw=local-system,root=local-system   ""

3.  Create the snapshots.

local# zfs snapshot -r rpool@1016

# zfs list

NAME                             USED  AVAIL  REFER  MOUNTPOINT
rpool                           6.15G  27.1G    94K  /rpool
rpool@1016                          0      -    94K  -
rpool/ROOT                      4.64G  27.1G    18K  legacy
rpool/ROOT@1016                     0      -    18K  -
rpool/ROOT/s10s_u6wos_07a       4.64G  27.1G  4.64G  /
rpool/ROOT/s10s_u6wos_07a@1016      0      -  4.64G  -
rpool/dump                      1.00G  27.1G  1.00G  -
rpool/dump@1016                     0      -  1.00G  -
rpool/export                      38K  27.1G    20K  /export
rpool/export@1016                   0      -    20K  -
rpool/export/home                 18K  27.1G    18K  /export/home
rpool/export/home@1016              0      -    18K  -
rpool/swap                       524M  27.6G  11.6M  -
rpool/swap@1016                     0      -  11.6M  -

4.  Send the snapshots to the remote system.

local# zfs send -Rv rpool@1016 > /net/remote-system/rpool/snaps/rpool1016

Recreate Pool and Restore Root Pool Snapshots

In this scenario, assume the following conditions:

§  ZFS root pool cannot be recovered

§  ZFS root pool snapshots are stored on a remote system and shared over NFS

All the steps below are performed on the local system.

1.  Boot from CD/DVD or the network.

ok boot net

or

ok boot cdrom

Then, exit out of the installation program.

2.  Mount the remote snapshot dataset.

# mount -F nfs remote-system:/rpool/snaps /mnt

3.  Recreate the root pool. For example:

# zpool create -f -o failmode=continue -R /a -m legacy -o cachefile=/etc/zfs/zpool.cache rpool c1t1d0s0

4.  Restore the root pool snapshots. This step might take some time. For example:

# cat /mnt/rpool1016 | zfs receive -Fd rpool

5.  Verify that the root pool datasets are restored.

# zfs list

NAME                             USED  AVAIL  REFER  MOUNTPOINT
rpool                           5.65G  27.6G    94K  /rpool
rpool@1016                          0      -    94K  -
rpool/ROOT                      4.64G  27.6G    18K  legacy
rpool/ROOT@1016                     0      -    18K  -
rpool/ROOT/s10s_u6wos_07a       4.64G  27.6G  4.64G  /
rpool/ROOT/s10s_u6wos_07a@1016  4.53M      -  4.63G  -
rpool/dump                      1.00G  27.6G  1.00G  -
rpool/dump@1016                   16K      -  1.00G  -
rpool/export                      54K  27.6G    20K  /export
rpool/export@1016                 16K      -    20K  -
rpool/export/home                 18K  27.6G    18K  /export/home
rpool/export/home@1016              0      -    18K  -
rpool/swap                      11.6M  27.6G  11.6M  -
rpool/swap@1016                     0      -  11.6M  -

6.  Set the bootfs property.

# zpool set bootfs=rpool/ROOT/s10s_u6wos_07a rpool

7.  Reboot the system.

# init 6

Rolling Back a Root Pool Snapshot From a Failsafe Boot

This procedure assumes that existing root pool snapshots are available. In this example, the root pool snapshots are available on the local system.

# zfs snapshot -r rpool/ROOT@1013

# zfs list

NAME                      USED  AVAIL  REFER  MOUNTPOINT

rpool                    5.67G  1.04G  21.5K  /rpool

rpool/ROOT               4.66G  1.04G    18K  /rpool/ROOT

rpool/ROOT@1013              0      -    18K  -

rpool/ROOT/zfs1008       4.66G  1.04G  4.66G  /

rpool/ROOT/zfs1008@1013      0      -  4.66G  -

rpool/dump                515M  1.04G   515M  -

rpool/swap                513M  1.54G    16K  -

1.  Shut down the system and boot in failsafe mode.

ok boot -F failsafe

Multiple OS instances were found. To check and mount one of them
read-write under /a, select it from the following list. To not mount
any, select 'q'.

  1  /dev/dsk/c0t1d0s0             Solaris 10 xx SPARC
  2  rpool:5907401335443048350     ROOT/zfs1008

Please select a device to be mounted (q for none) [?,??,q]: 2
mounting rpool on /a

2.  Roll back the root pool snapshots.

# zfs rollback -rf rpool/ROOT@1013

3.  Reboot back to multiuser mode.

# init 6

Primary Mirror Disk in a ZFS Root Pool is Unavailable or Fails

§  If the primary disk in the pool fails, you will need to boot from the secondary disk by specifying the boot path. For example, on a SPARC system, a devalias is available to boot from the second disk as disk1.

ok boot disk1

§  While booted from a secondary disk, physically replace the primary disk. For example, c0t0d0s0.

§  Put a VTOC label on the new disk with format -e.

§  Let ZFS know the primary disk was physically replaced at the same location.

# zpool replace rpool c0t0d0s0

§  If the zpool replace step fails, detach and attach the primary mirror disk:

# zpool detach rpool c0t0d0s0

# zpool attach rpool c0t1d0s0 c0t0d0s0

§  Confirm that the disk is available and the pool is resilvered.

# zpool status rpool

§  Replace the bootblocks on the primary disk.

# installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c0t0d0s0

§  Confirm that you can boot from the primary disk.

ZFS Swap and Dump Devices

During an initial installation or a Solaris Live Upgrade migration from a UFS file system, a swap area and a dump device are created on ZFS volumes in the ZFS root pool. Each is sized at half the size of physical memory, but no more than 2 Gbytes and no less than 512 Mbytes. For example:

# zfs list

NAME                   USED  AVAIL  REFER  MOUNTPOINT

rpool                 5.66G  27.6G  21.5K  /rpool

rpool/ROOT            4.65G  27.6G    18K  /rpool/ROOT

rpool/ROOT/zfs1008BE  4.65G  27.6G  4.65G  /

rpool/dump             515M  27.6G   515M  -

rpool/swap             513M  28.1G    16K  -
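
To confirm which devices are actually in use for swap and for crash dumps (a hedged aside; output varies by system):

# swap -l

# dumpadm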

Resizing ZFS Swap and Dump Devices

§  You can adjust the size of your swap and dump volumes during an initial installation.

§  You can create and size your swap and dump volumes before you do a Solaris Live Upgrade operation. ZFS dump volume performance is better when the volume is created with a 128-Kbyte block size. In SXCE, build 102, ZFS dump volumes are automatically created with a 128-Kbyte block size (CR 6725698). For example:

# zpool create rpool mirror c0t0d0s0 c0t1d0s0

The Solaris 10 10/08 dump creation syntax would be:

# zfs create -V 2G -b 128k rpool/dump

The SXCE build 102 dump creation syntax would be:

# zfs create -V 2G rpool/dump

SPARC# zfs create -V 2G -b 8k rpool/swap

x86# zfs create -V 2G -b 4k rpool/swap

§  Solaris Live Upgrade does not resize existing swap and dump volumes. You can reset the volsize property of the swap and dump devices after a system is installed. For example:

# zfs set volsize=2G rpool/dump

# zfs get volsize rpool/dump

NAME        PROPERTY  VALUE       SOURCE

rpool/dump  volsize   2G          -

§  You can adjust the size of the swap and dump volumes in a JumpStart profile by using profile syntax similar to the following:

install_type initial_install

cluster SUNWCXall

pool rpool 16g 2g 2g c0t0d0s0

In this profile, the 2g and 2g entries set the size of the swap area and dump device as 2 Gbytes and 2 Gbytes, respectively.

§  You can adjust the size of your dump volume, but it might take some time, depending on the size of the dump volume. For example:

# zfs set volsize=2G rpool/dump

# zfs get volsize rpool/dump

NAME        PROPERTY  VALUE       SOURCE

rpool/dump  volsize   2G          -

Adjusting the Size of the Swap Volume on an Active System

If you need to adjust the size of the swap volume after installation on an active system, review the following steps. See CR 6765386 for more information.

1.  If your swap device is in use, then you might not be able to delete it. Check to see whether the swap area is in use. For example:

# swap -l

swapfile                 dev    swaplo   blocks     free
/dev/zvol/dsk/rpool/swap 182,2         8  4194296  4194296

In the above output, blocks == free, so the swap device is not actually being used.

2.  If the swap area is not in use, remove the swap area. For example:

# swap -d /dev/zvol/dsk/rpool/swap

3.  Confirm that the swap area is removed.

# swap -l

No swap devices configured

4.  Resize the swap volume. For example:

# zfs set volsize=1G rpool/swap

5.  Activate the swap area.

# swap -a /dev/zvol/dsk/rpool/swap

# swap -l

swapfile                 dev    swaplo   blocks     free
/dev/zvol/dsk/rpool/swap 182,2         8  2097144  2097144

Destroying a Pool With an Active Dump/Swap Device

If you want to destroy a ZFS root pool that is no longer needed, but it still has an active dump device and swap area, you'll need to use the dumpadm and swap commands to remove the dump device and swap area. Then, use these commands to establish a new dump device and swap area.

# dumpadm -d swap

# dumpadm -d none

< destroy the root pool >

# swap -a <device-name>

# dumpadm -d swap
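
For the re-establish step, a hedged example assuming a spare slice c0t1d0s1 (a hypothetical device) is available for the new swap area; dumpadm -d swap then directs crash dumps to that swap device, or you can dedicate a separate slice with dumpadm -d /dev/dsk/<slice>:

# swap -a /dev/dsk/c0t1d0s1

# dumpadm -d swap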