====== ZFS performance tuning tips ======

===== Tune L2ARC for backups =====

When a huge portion of data is written (new backups) or read (backup verification), the L2ARC is constantly overwritten with the current data.
To change this behaviour so that only most frequently used (MFU) data is cached:

<file conf /etc/modprobe.d/zfs.conf>
options zfs l2arc_mfuonly=1 l2arc_noprefetch=0
</file>

Explanation:
  * [[https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Module%20Parameters.html#l2arc-mfuonly|l2arc_mfuonly]] controls whether only MFU metadata and data are cached from ARC into L2ARC. This may be desirable to avoid wasting space on L2ARC when reading/writing large amounts of data that are not expected to be accessed more than once. By default, both MRU and MFU data and metadata are cached in the L2ARC.
  * [[https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Module%20Parameters.html#l2arc-noprefetch|l2arc_noprefetch]] disables writing prefetched, but unused, buffers to cache devices. Setting it to 0 can increase L2ARC hit rates for workloads where the ARC is too small for a read workload that benefits from prefetching. Also, if the main pool devices are **very slow**, setting it to 0 can improve some workloads such as **backups**.
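Both parameters are also exposed at runtime under ''/sys/module/zfs/parameters'', so they can be tried out before making them persistent. A minimal sketch (new values only affect subsequent L2ARC writes):

<code bash>
# Try the settings live before committing them to /etc/modprobe.d/zfs.conf
echo 1 >/sys/module/zfs/parameters/l2arc_mfuonly
echo 0 >/sys/module/zfs/parameters/l2arc_noprefetch
</code>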
  
===== I/O scheduler =====
For rotational devices, there is no sense in using the advanced schedulers ''cfq'' or ''bfq'' directly on a hard disk.
Both depend on processes, process groups and applications; in the case of ZFS, all I/O comes from a group of kernel processes.

The only scheduler worth considering is ''deadline'' / ''mq-deadline''.
The ''deadline'' scheduler groups reads into batches and writes into separate batches, ordering each batch by increasing LBA address (so it should be good for HDDs).

There is a discussion in the OpenZFS project about not touching schedulers anymore and leaving their configuration to the admin:
  * [[https://github.com/openzfs/zfs/pull/9042|Set "none" scheduler if available (initramfs) #9042]]
  * [[https://github.com/openzfs/zfs/commit/42c24d90d112b6e9e1a304346a1335e058f1678b]]
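To configure the scheduler yourself, a minimal sketch (''sda'' is an assumed example pool member disk; the sysfs change is not persistent across reboots):

<code bash>
# Show available schedulers; the active one is in brackets
cat /sys/block/sda/queue/scheduler

# Let ZFS do its own request ordering and hand the disk to "none"
echo none >/sys/block/sda/queue/scheduler
</code>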
  
===== PostgreSQL =====
  
See the Arch Linux wiki: [[https://wiki.archlinux.org/index.php/ZFS#Databases|Databases]]

<code bash>
zfs set logbias=throughput <pool>/postgres
</code>
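The same wiki section also suggests matching the dataset record size to PostgreSQL's 8 KiB page size. A hedged example (see the linked page for the full set of recommendations):

<code bash>
zfs set recordsize=8K <pool>/postgres
</code>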

===== Reduce ZFS ARC RAM usage =====

By default ZFS can use 50% of RAM for the ARC cache:
<code bash>
# apt install zfsutils-linux

# arcstat
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  size     c  avail
16:47:26     3     0      0     0    0     0    0     0    0   15G   15G   1.8G
</code>

<code bash>
# arc_summary

ARC size (current):                                    98.9 %   15.5 GiB
        Target size (adaptive):                       100.0 %   15.6 GiB
        Min size (hard limit):                          6.2 %  999.6 MiB
        Max size (high water):                           16:1   15.6 GiB
        Most Frequently Used (MFU) cache size:         75.5 %   11.2 GiB
        Most Recently Used (MRU) cache size:           24.5 %    3.6 GiB
        Metadata cache size (hard limit):              75.0 %   11.7 GiB
        Metadata cache size (current):                  8.9 %    1.0 GiB
        Dnode cache size (hard limit):                 10.0 %    1.2 GiB
        Dnode cache size (current):                     5.3 %   63.7 MiB
</code>

ARC size can be tuned by setting ''zfs'' kernel module parameters ([[https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Module%20Parameters.html#zfs-arc-max|Module Parameters]]):
  * ''zfs_arc_max'': maximum size of the ARC in bytes. If set to 0, the maximum size of the ARC is determined by the amount of system memory installed (50% on Linux).
  * ''zfs_arc_min'': minimum ARC size limit. When the ARC is asked to shrink, it will stop shrinking at ''c_min'' as tuned by ''zfs_arc_min''.
  * ''zfs_arc_meta_limit_percent'': sets the limit of ARC metadata, ''arc_meta_limit'', as a percentage of the maximum size target of the ARC, ''c_max''. Default is 75.

Proxmox recommends the following [[https://pve.proxmox.com/wiki/ZFS_on_Linux#sysadmin_zfs_limit_memory_usage|rule]]:

  As a general rule of thumb, allocate at least 2 GiB Base + 1 GiB/TiB-Storage
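For instance, a hypothetical 8 TiB pool would get 2 GiB + 8 × 1 GiB = 10 GiB. The arithmetic as a sketch (the pool size is an assumed example value):

<code bash>
# Rule of thumb: 2 GiB base + 1 GiB per TiB of pool storage
POOL_TIB=8   # assumed example; substitute your pool size
ARC_GIB=$((2 + POOL_TIB))
echo "Suggested zfs_arc_max: ${ARC_GIB} GiB = $((ARC_GIB * 1024*1024*1024)) bytes"
</code>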

==== Examples ====

Set ''zfs_arc_max'' to 4 GiB and ''zfs_arc_min'' to 128 MiB at runtime:
<code bash>
echo "$((4 * 1024*1024*1024))" >/sys/module/zfs/parameters/zfs_arc_max
echo "$((128     *1024*1024))" >/sys/module/zfs/parameters/zfs_arc_min
</code>

Make the options persistent:
<file conf /etc/modprobe.d/zfs.conf>
options zfs zfs_prefetch_disable=1
options zfs zfs_arc_max=4294967296
options zfs zfs_arc_min=134217728
options zfs zfs_arc_meta_limit_percent=75
</file>

and run ''update-initramfs -u'' so the settings are applied at boot.