How Solaris ZFS Cache Management Differs From UFS and VxFS File Systems (Doc ID 1005367.1)
APPLIES TO:
Document Details
Type: HOWTO
Status: PUBLISHED
Last Major Update: 10/10/2013
Last Update: 10/10/2013
Language: English
GOAL
ZFS manages its cache differently from other filesystems such as UFS and VxFS. ZFS's use of kernel memory as a cache results in
higher kernel memory allocation compared to UFS and VxFS filesystems. Monitoring a system with tools such as vmstat
reports less free memory with ZFS, which may lead to unnecessary support calls.
ZFS affects the VM subsystem in terms of memory management. Monitoring a system with vmstat(1M) and prstat(1M)
reports less free memory when ZFS is used heavily, for example when copying large files into a ZFS filesystem. The same load
running on a UFS filesystem uses less memory, because pages that are written to the backing store are moved onto a cache list
and counted as free memory.
<Defect 6505658> - target MRU size (arc.p) needs to be adjusted more aggressively
The workaround is to limit the ZFS ARC by setting:
set zfs:zfs_arc_max
in the /etc/system file. This tunable determines the maximum size of the ZFS Adaptive Replacement Cache (ARC). The default is
3/4 of memory on systems with less than 4GB of memory, or physmem minus 1GB on systems with more than 4GB of memory. For
databases, the amount of memory they will consume is usually known in advance. Limit the ARC to the remaining free memory
(and possibly reduce it even further) by setting the zfs_arc_max tunable to the desired value.
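The sizing rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names are hypothetical, the value is assumed to be expressed in bytes, and the hexadecimal "set zfs:zfs_arc_max = ..." form is one common way of writing the /etc/system entry rather than something this document prescribes.

```python
GIB = 1024 ** 3


def default_arc_max(physmem_bytes):
    """Default ARC ceiling as described in the text: 3/4 of memory on
    systems with less than 4GB, otherwise physical memory minus 1GB."""
    if physmem_bytes < 4 * GIB:
        return physmem_bytes * 3 // 4
    return physmem_bytes - GIB


def arc_cap_for_database(physmem_bytes, database_bytes, headroom_bytes=0):
    """Hypothetical helper: cap the ARC at the memory left over once the
    database (plus optional extra headroom) has been accounted for."""
    remaining = physmem_bytes - database_bytes - headroom_bytes
    if remaining <= 0:
        raise ValueError("database and headroom exceed physical memory")
    # Never propose a cap above what the default would already allow.
    return min(remaining, default_arc_max(physmem_bytes))


def etc_system_line(arc_max_bytes):
    """Format the /etc/system entry (assumes the value is given in bytes)."""
    return f"set zfs:zfs_arc_max = 0x{arc_max_bytes:x}"


if __name__ == "__main__":
    # Example: 16GB system, 10GB reserved for a database, 2GB extra headroom.
    cap = arc_cap_for_database(16 * GIB, 10 * GIB, 2 * GIB)
    print(etc_system_line(cap))
```

For the example in the main block, the remaining 4GB becomes the proposed cap. A change to /etc/system takes effect only after a reboot.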
However, there are occasions when ZFS fails to evict memory from the ARC quickly enough, which can lead to application startup
failure due to a memory shortage. Also, reaping memory from the ARC can trigger high system utilization at the expense of
performance. This issue is addressed in <Defect 6505658>.
ZFS returns memory from the ARC only when there is memory pressure. This differs from the behaviour before Solaris 8 (before
priority paging), where reading or writing one large (multi-GB) file could lead to a memory shortage, which in turn led to
paging/swapping of application pages and slow application performance. This old problem was due to the failure to distinguish
between a useful application page and a filesystem cached page. See knowledge article 1003383.1.
One can estimate the amount of kernel memory used for caching ZFS data blocks by running:
The primary ZFS cache is an Adaptive Replacement Cache (ARC) that is built on top of a number of kmem caches:
zio_buf_512 through zio_buf_131072 (plus hdr_cache and buf_cache). These kmem caches are used for holding data blocks
(ZFS uses variable block sizes: 512 bytes to 1MB). The ARC will be at least 64MB in size and can use a maximum of physical
memory less 1GB. With ZFS, reported freemem will be lower than with other filesystems.
The zfs_arc_max tunable is available in Solaris 10 8/07 (Update 4), or with KU 120011-14 installed, along with the fix for <Defect 6505658>.
ZFS uses a significantly different caching model from page-based filesystems like UFS and VxFS, for both performance
and architectural reasons. This may impact existing application software (such as Oracle, which itself consumes a large amount
of memory).
ZFS frees its cache in a way that does not cause a memory shortage, so the system can operate with lower freemem without
suffering a performance penalty. Unlike UFS and VxFS, ZFS does not throttle individual writers. UFS throttles writes when the
amount of dirty data per vnode reaches 16MB, with the objective of preserving free memory. The downside is slow application
write performance, which may be unnecessary when plenty of free memory is available. ZFS throttles applications only when the
data load overflows the I/O subsystem capacity for 5 to 10 seconds. See doc
SOLUTION
The lower free memory reported with ZFS is due to its cache management being different from that of UFS and VxFS. Unlike
those filesystems, ZFS does not use the page cache. ZFS's caching is drastically different from these older filesystems, where
cached pages can be moved to the cache list after being written to the backing store and are thus counted as free memory.
Where "ZFS File Data" reports the amount of memory currently allocated in all memory caches associated with ARC file data.
This includes both memory actively in use as well as additional memory currently being held unused in the kernel memory
caches. When buffers are evicted from the ARC cache, they are returned to the respective caches (e.g. the
zio_data_buf_131072 cache is used for allocating 128k blocks). Buffers in the kernel caches will stay unused until the VM
system can reap excess capacity in these caches when the system comes under memory pressure. You can determine the
kernel memory usage of various caches by running:
Where "size" reports the amount of active data in the ARC. This value stays within the "target" size set using the
"zfs_arc_max" tunable.
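As an illustration of comparing "size" with the target, the sketch below parses text in the parseable "module:instance:name:statistic<TAB>value" form produced by kstat -p. Several things here are assumptions rather than statements from this document: that the figures come from the arcstats kstat, that the target corresponds to the statistic named "c" and the ceiling to "c_max", and the sample numbers themselves, which are invented for the example.

```python
MIB = 1024 ** 2

# Hypothetical sample in `kstat -p` style; the values are made up.
SAMPLE = (
    "zfs:0:arcstats:c\t4294967296\n"
    "zfs:0:arcstats:c_max\t4294967296\n"
    "zfs:0:arcstats:class\tmisc\n"
    "zfs:0:arcstats:size\t3221225472\n"
)


def parse_kstat_p(text):
    """Return {statistic: int_value} from 'mod:inst:name:stat<TAB>value' lines.
    Lines whose value is not an integer (e.g. 'class') are skipped."""
    stats = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts
        try:
            stats[key.rsplit(":", 1)[-1]] = int(value)
        except ValueError:
            continue
    return stats


def arc_summary(stats):
    """Summarise ARC size against its target ('c' is assumed to be the target)."""
    return {
        "size_mib": stats["size"] // MIB,
        "target_mib": stats["c"] // MIB,
        "within_target": stats["size"] <= stats["c"],
    }


if __name__ == "__main__":
    print(arc_summary(parse_kstat_p(SAMPLE)))
```

On a live system the text would be captured from the command output instead of the SAMPLE constant.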
Also, if possible, consider using the ZFS "primarycache" property to better control what is cached in the ZFS ARC. It allows
caching to be controlled on a per-dataset (filesystem) basis and thus provides better ARC usage and control. If this property is
set to "all", then both user data and metadata are cached. If this property is set to "metadata", then only metadata is cached.
If this property is set to "none", then neither user data nor metadata is cached. The default is "all".
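The three property values can be summarised in a small lookup, together with a helper that builds the corresponding "zfs set" command line. This is a minimal sketch: the helper names and the dataset name "tank/db" are hypothetical, and the command is only assembled, not executed.

```python
# What each primarycache value caches, as described in the text.
PRIMARYCACHE_VALUES = {
    "all": frozenset({"user data", "metadata"}),
    "metadata": frozenset({"metadata"}),
    "none": frozenset(),
}
DEFAULT_PRIMARYCACHE = "all"


def cached_content(value=DEFAULT_PRIMARYCACHE):
    """Return the kinds of data the ARC caches for a given property value."""
    if value not in PRIMARYCACHE_VALUES:
        raise ValueError(f"unknown primarycache value: {value!r}")
    return PRIMARYCACHE_VALUES[value]


def zfs_set_primarycache_argv(dataset, value):
    """Build the argument list for 'zfs set primarycache=<value> <dataset>'."""
    cached_content(value)  # validates the value
    return ["zfs", "set", f"primarycache={value}", dataset]


if __name__ == "__main__":
    # "tank/db" is a made-up dataset name used only for illustration.
    print(" ".join(zfs_set_primarycache_argv("tank/db", "metadata")))
```

A dataset holding database files that the database already caches itself is one case where "metadata" might be considered, so the same data is not held in memory twice.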
A good discussion of monitoring the ZFS ARC using DTrace scripts and the arcstat.pl and arcstat-extended.pl tools is also
available.
Relief/Workaround
For Solaris 10 releases prior to 8/07 (Update 4), or without KU 120011-14 installed, a script is provided in
<Defect 6505658>.
ZFS Best Practices Guide:
The ZFS adaptive replacement cache (ARC) tries to use most of a system's available memory to cache filesystem data. The
default is to use all of physical memory except 1 Gbyte. As memory pressure increases, the ARC relinquishes memory.
Consider limiting the maximum ARC memory footprint in the following situations:
- When a known amount of memory is always required by an application. Databases often fall into this category.
- On platforms that support dynamic reconfiguration of memory boards, to prevent ZFS from growing the kernel cage onto
  all boards.
- A system that requires large memory pages might also benefit from limiting the ZFS cache, which tends to break down
  large pages into base (4KB for x86 or 8KB for SPARC) pages.
- Finally, if the system is running another non-ZFS filesystem in addition to ZFS, it is advisable to leave some free
  memory to host that other filesystem's caches.
The trade-off is that limiting this memory footprint means that the ARC is unable to cache as much filesystem data, and this
limit could impact performance. In general, limiting the ARC is wasteful if the memory that would go unused by ZFS is also
unused by other system components. Note that non-ZFS filesystems typically cache data in what is nevertheless reported as
free memory by the system.
Additional information about Solaris ZFS topics is available at the Oracle Solaris ZFS Resource Center (Document
1372694.1).
Still have questions about ZFS? Consider asking them in the My Oracle Support "Oracle Solaris ZFS File System"
Community.
REFERENCES
NOTE:1369456.1 - Understanding How ZFS Calculates Used Space
NOTE:1359269.1 - ZFS Write Performance Degrades With Threads Held Up By space_map_load_wait()
NOTE:1430323.1 - How to Understand "ZFS File Data" Value by mdb and ZFS ARC Size.
NOTE:1448052.1
NOTE:1470681.1
NOTE:1347387.1
NOTE:1404665.1
NOTE:1316513.1
Keywords: ARC; CACHE; FILESYSTEM; MEMORY USAGE; PAGE CACHING; PERFORMANCE; SOLARIS; UFS; VXFS; ZFS
Copyright (c) 2014, Oracle. All rights reserved.