ZFS Cache Flushes

ZFS is designed to work with storage devices that manage a disk-level cache. ZFS commonly asks the storage device to ensure that data is safely placed on stable storage by requesting a cache flush. For JBOD storage, this works as designed and without problems. For many NVRAM-based storage arrays, a problem arises if the array honors the cache flush request and actually flushes, rather than ignoring it. Some storage will flush its caches even though the NVRAM protection already makes those caches as good as stable storage, so the flush is unnecessary work.

ZFS issues infrequent flushes (every 5 seconds or so) after the uberblock updates. The problem here is fairly inconsequential, and no tuning is warranted.

ZFS also issues a flush every time an application requests a synchronous write (O_DSYNC, fsync, NFS commit, and so on). The application waits on the completion of this type of flush, so it directly impacts performance, and greatly so. From a performance standpoint, this neutralizes the benefit of having NVRAM-based storage.
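This synchronous-write path is easy to exercise. A minimal sketch, using GNU dd's oflag=dsync option (a GNU extension not present in Solaris dd; the file path and sizes here are arbitrary):

```shell
# Write 100 8K blocks with O_DSYNC semantics: each write returns only
# after the data reaches stable storage, so on ZFS each one triggers
# a cache flush to the backing device.
# On Solaris, any application that opens a file with O_DSYNC or calls
# fsync() exercises the same code path.
dd if=/dev/zero of=/tmp/syncwrite-demo bs=8k count=100 oflag=dsync
rm -f /tmp/syncwrite-demo
```

On NVRAM-backed storage that honors the flushes, this kind of workload runs far slower than the hardware would otherwise allow.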

The upcoming fix is to qualify the flush request semantics so that storage devices are instructed to ignore the request if they have the proper protection. This change requires a fix to the disk drivers and support for the updated semantics in the storage.

Because ZFS is not aware of the nature of the storage or whether NVRAM is present, the best way to fix this issue is to tell the storage itself to ignore the requests. For more information, see:

http://blogs.digitar.com/jjww/?itemid=44

Please check with your storage vendor for ways to achieve the same thing.

As a last resort, when all LUNs exposed to ZFS come from an NVRAM-protected storage array, and procedures ensure that no unprotected LUNs will be added in the future, ZFS can be tuned to not issue flush requests. If some LUNs exposed to ZFS are not protected by NVRAM, this tuning can lead to data loss, application-level corruption, or even pool corruption.
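Before applying the tuning, verify what actually backs every pool. A minimal sketch: filter the device names out of zpool status output and check each one against the array's LUN list. The pool name and device names below are illustrative sample data; on a live system pipe the real command through the same filter:

```shell
# Extract backing-device names (cXtYdZ style) from zpool status output.
# Sample output is embedded here for illustration; on a live system run:
#   zpool status | awk '$1 ~ /^c[0-9]/ {print $1}'
awk '$1 ~ /^c[0-9]/ {print $1}' <<'EOF'
  pool: tank
 state: ONLINE
config:
        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c1t0d0  ONLINE       0     0     0
            c1t1d0  ONLINE       0     0     0
EOF
```

Every device this prints must map to an NVRAM-protected LUN before the tuning below is safe.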

NOTE: Cache flushing is commonly done as part of ZIL operations. While disabling cache flushing can, at times, make sense, disabling the ZIL does not.

Solaris 10 11/06 and Solaris Nevada (snv_52) Releases

Set dynamically:

echo zfs_nocacheflush/W0t1 | mdb -kw

Revert to default:

echo zfs_nocacheflush/W0t0 | mdb -kw

To make the change persist across reboots, set the following parameter in the /etc/system file:

set zfs:zfs_nocacheflush = 1
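Either way, the current value can be checked with mdb's /D (display in decimal) format. A sketch, assuming the standard Solaris mdb(1) utility:

```shell
# Print the current value of zfs_nocacheflush in decimal.
# 1 means cache flush requests are suppressed; 0 is the default.
echo zfs_nocacheflush/D | mdb -k
```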

Risk: Some storage arrays might revert to working like JBOD disks when, for example, their battery is low. Disabling cache flushing in that state can have adverse effects. Check with your storage vendor.

Earlier Solaris Releases

To make the change persist across reboots, set the following parameter in the /etc/system file:

set zfs:zil_noflush = 1

Set dynamically:

echo zil_noflush/W0t1 | mdb -kw

Revert to default:

echo zil_noflush/W0t0 | mdb -kw

Risk: Some storage arrays might revert to working like JBOD disks when, for example, their battery is low. Disabling cache flushing in that state can have adverse effects. Check with your storage vendor.

RFEs

* sd driver should set SYNC_NV bit when issuing SYNCHRONIZE CACHE to SBC-2 devices

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6462690

* zil shouldn’t send write-cache-flush command …

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6460889
