Friday, December 13, 2013

Certain Solaris 10 Patches and Solaris 11 Deliver Support for 1MB Block Sizes That May Cause Poor Synchronous Write Performance With Small zpools(1M)

Description
ZFS pool version 32, delivered in Solaris 11 and in Solaris 10 patches 147147-26 (SPARC) and 147148-26 (x86), adds support for 1MB block sizes. On pools at this version, synchronous writes can be allocated as 1MB blocks. If the pool is small or nearly full, these allocations repeatedly fail because few contiguous 1MB stretches of free space remain, and ZFS pool (zpool(1M)) I/O performance degrades to the point of being unusable.
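
The pool versions a given system supports can be listed with zpool upgrade; on affected releases the list includes version 32 (output abridged and illustrative, not copied from a live system):

    # /usr/sbin/zpool upgrade -v
    This system is currently running ZFS pool version 32.

    The following versions are supported:

    VER  DESCRIPTION
    ---  --------------------------------------------------------
    ...
    32   One MB blocksize
    ...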

Occurrence
This issue can occur in the following releases:

SPARC Platform

• Solaris 10 with patch 147147-26
• Solaris 11

x86 Platform

• Solaris 10 with patch 147148-26
• Solaris 11
Note 1: Solaris 8 and Solaris 9 are not impacted by this issue.

Note 2: Systems are vulnerable to this issue only if they meet all four of the criteria listed below:

1. The file systems or ZFS volumes are using zpool version 32 or greater. To determine the zpool version, execute the following command:

    # /usr/sbin/zpool get version <pool>
    NAME  PROPERTY  VALUE  SOURCE
    tank  version   31     default
In this example, the pool 'tank' is at version 31 and therefore does not meet this criterion.
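
To check every imported pool in one pass, the pool names from zpool list can be fed to zpool get; a minimal Bourne-shell sketch:

    # for pool in `/usr/sbin/zpool list -H -o name`; do /usr/sbin/zpool get version "$pool"; done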

2. The pools do not have a separate log device. To determine if a pool has a separate log device, execute the following command:

    # /usr/sbin/zpool status <pool>
     ...
    NAME       STATE     READ WRITE CKSUM
    <pool>     ONLINE       0     0     0
      c2t10d0  ONLINE       0     0     0
    logs
      c2t11d0  ONLINE       0     0     0
In this example, the pool has a separate log device (c2t11d0), so it does not meet this criterion.
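
For a scripted check, the presence of a 'logs' section can be tested by matching the literal string in the status output; a rough sketch:

    # /usr/sbin/zpool status <pool> | grep logs >/dev/null && echo "separate log device present" || echo "no separate log device"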

3. At least one dataset on the pool has its sync property set to 'always' or 'standard'. To determine the sync property value, execute the following command for each dataset on the pool (a recursive one-liner follows the example output):

    # /usr/sbin/zfs get sync <pool/dataset>
    NAME     PROPERTY  VALUE     SOURCE
    tank/fs  sync      standard  default
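
Rather than querying datasets one at a time, the property can be read recursively for the whole pool; a sketch that lists every dataset whose sync value is not 'disabled' (such datasets meet this criterion):

    # /usr/sbin/zfs get -r -H -o name,value sync <pool> | grep -v disabled
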
4. The pool is small or nearly full, leaving too few contiguous 1MB chunks of free space for synchronous writes. To determine whether this is the case, execute the following command, which shows the largest chunk of free space ('maxsize') in each metaslab. Pools with only a few entries whose 'maxsize' exceeds 1MB are vulnerable to this issue (a sketch that tallies the entries follows the sample output):

    # /usr/sbin/zdb -mm <pool> | grep maxsize
    segments        801   maxsize    433M   freepct   77%
    segments       1769   maxsize    231K   freepct    0%
    segments       1661   maxsize   4.14M   freepct    2%
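
To summarize the output rather than eyeballing it, the 'maxsize' column can be tallied; a minimal sketch using nawk that assumes the column layout and K/M/G/T size suffixes shown above:

    # /usr/sbin/zdb -mm <pool> | grep maxsize | nawk '
        {
            sz = $4                              # e.g. 433M, 231K, 4.14M
            u  = substr(sz, length(sz), 1)       # trailing unit letter, if any
            n  = substr(sz, 1, length(sz) - 1) + 0
            if      (u == "T") kb = n * 1024 * 1024 * 1024
            else if (u == "G") kb = n * 1024 * 1024
            else if (u == "M") kb = n * 1024
            else if (u == "K") kb = n
            else               kb = sz / 1024    # bare byte count
            if (kb >= 1024) big++                # at least 1MB contiguous free
            total++
        }
        END { printf("%d of %d metaslabs have a free segment of at least 1MB\n", big, total) }'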

Symptoms
There are no predictable symptoms that would indicate the described issue has occurred.

Workaround
If the described issue has already occurred, stop synchronous writes by executing the following command as root:

    # zfs set sync=disabled <dataset>

Where <dataset> is the ZFS file system or volume that is performing synchronous writes.

Alternatively, add a log device to the pool (consult Oracle Solaris Administration: ZFS File Systems for details).
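
For reference, attaching a dedicated log device uses zpool add; a sketch, where c2t11d0 is a placeholder device name and the device should be chosen and sized per the administration guide:

    # /usr/sbin/zpool add <pool> log c2t11d0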

