Description
A problem with L2ARC and ARC synchronization may cause a rewrite of metadata that was cached in L2ARC to be lost.
This inconsistency may cause an immediate panic, in which case there is a chance that the corruption does not get written to disk and the system will recover automatically after the panic.
However, if the corrupted data is written to disk and is later accessed by a critical function (syncing context), the system may enter a panic loop.
Alternatively, the issue may manifest itself as checksum errors on metadata reported by scrub, which may lead to panics later when that metadata is needed to perform work in a critical context.
Occurrence
This issue can occur in the following releases:
SPARC Platform
• Solaris 10 with patch 147147-26 and without patch 150125-01
• Solaris 11
• Solaris 11.1 without SRU 3.4
x86 Platform
• Solaris 10 with patch 147148-26 and without patch 149637-02
• Solaris 11
• Solaris 11.1 without SRU 3.4
Note: Solaris 8 and 9 are not impacted by this issue.
Systems may be impacted by this issue if they have a cache device. To determine if a system has this configuration, execute the following command:
# zpool status
  pool: rpool
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c4t0d0s0  ONLINE       0     0     0

errors: No known data errors

  pool: tank
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c9t0d1  ONLINE       0     0     0
            c9t0d2  ONLINE       0     0     0
        cache
          c9t0d0    ONLINE       0     0     0
The system in the example above has two ZFS storage pools, 'rpool' and 'tank'; only 'tank' has a cache device. This system may therefore be affected by the issue, which can manifest itself in a variety of ways.
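On a system with many pools, the same check can be scripted. The following is a minimal sketch, not a supported tool: it assumes that a cache section in `zpool status` output appears as a line containing only the word "cache", as in the example above. The function name `pools_with_cache` and the `SAMPLE` text are illustrative.

```python
def pools_with_cache(status_text):
    """Return the names of pools whose `zpool status` config has a cache section."""
    pools = []
    current = None
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "pool:" and len(fields) > 1:
            # Start of a new pool's report.
            current = fields[1]
        elif fields == ["cache"] and current is not None:
            # Assumption: the cache section header is a line holding only "cache".
            if current not in pools:
                pools.append(current)
    return pools


# Sample text copied from the example output above.
SAMPLE = """\
  pool: rpool
 state: ONLINE
  scan: none requested
config:
        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c4t0d0s0  ONLINE       0     0     0
errors: No known data errors
  pool: tank
 state: ONLINE
  scan: none requested
config:
        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c9t0d1  ONLINE       0     0     0
            c9t0d2  ONLINE       0     0     0
        cache
          c9t0d0    ONLINE       0     0     0
"""

print(pools_with_cache(SAMPLE))  # prints ['tank']
```

On a live system the text could instead come from `subprocess.run(["zpool", "status"], capture_output=True, text=True).stdout`.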
Symptoms
Should the described issue occur, the system may panic. The following are examples of known panic strings related to this issue:
zfs: freeing free segment (offset=7078106103808 size=512)
assertion failed: sm->sm_space == space (0x2dd924400 == 0x2dd924600), file: ../../common/fs/zfs/space_map.c, line: 355
assertion failed: 0 == dmu_buf_hold_array(os, object, offset, size, FALSE, FTAG, &numbufs, &dbp), file: ../../common/fs/zfs/dmu.c, line: 948
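To look for these strings in saved console or log text, a small filter can help. This is only a sketch under stated assumptions: the fragments below are the fixed parts of the panic strings listed above (the offsets, sizes, and hex values vary between occurrences), and each panic message is assumed to appear on a single line of the input. The names `KNOWN_PANIC_FRAGMENTS` and `matching_panic_lines` are illustrative.

```python
# Fixed portions of the panic strings listed in this alert.
KNOWN_PANIC_FRAGMENTS = (
    "zfs: freeing free segment",
    "assertion failed: sm->sm_space == space",
    "assertion failed: 0 == dmu_buf_hold_array(",
)


def matching_panic_lines(lines):
    """Return the input lines that contain one of the known panic fragments."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(fragment in line for fragment in KNOWN_PANIC_FRAGMENTS)
    ]
```

For example, `matching_panic_lines(open("/var/adm/messages"))` would scan the system log, assuming the panic was recorded there. A match indicates only that a similar panic occurred, not that this issue is the cause.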
One known manifestation can be detected by following these steps:
1. Export the pool:
   # zpool export <pool>
2. Run the following command:
   # /sbin/zdb -emm <pool>
3. Import the pool again:
   # zpool import <pool>
If the zdb(1M) command fails or crashes, the pool is likely affected by one of the manifestations of the issue.
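The three steps above can be wrapped in a script so that the pool is imported again even when zdb fails. The following is a hedged sketch rather than a supported procedure: the function name `check_pool` and its injectable `run` parameter are illustrative, and it treats any non-zero or signal exit from zdb as a failure. The pool is unavailable while it is exported, so this should only be run when that is acceptable.

```python
import subprocess


def check_pool(pool, run=subprocess.run):
    """Export `pool`, run `zdb -emm` against it, then import it again.

    Returns True if zdb exited cleanly, False if it failed or crashed.
    `run` defaults to subprocess.run and can be replaced for dry runs.
    """
    # Step 1: export the pool; stop here if the export itself fails.
    run(["zpool", "export", pool], check=True)
    try:
        # Step 2: run zdb; its output is discarded, only the exit status is used.
        result = run(["/sbin/zdb", "-emm", pool], stdout=subprocess.DEVNULL)
        return result.returncode == 0
    finally:
        # Step 3: import the pool again, even if zdb failed or crashed.
        run(["zpool", "import", pool], check=True)
```

Calling `check_pool("tank")` as root would perform the sequence on the pool 'tank' from the earlier example; a return value of False corresponds to the "fails or crashes" case described above.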
Workaround
This issue is addressed in the following releases:
SPARC Platform
• Solaris 10 with patch 150125-01 or later
• Solaris 11.1 with SRU 3.4 or later
x86 Platform
• Solaris 10 with patch 149637-02 or later
• Solaris 11.1 with SRU 3.4 or later
Solaris 11 systems must be upgraded to Solaris 11.1 and have SRU 3.4 or later installed to get the fix.