====== Proxmox's ZFS ======

Since fall 2015 the default compression algorithm in ZoL (ZFS on Linux) is LZ4. Because ''compression=on'' enables compression with the default algorithm, pools with that setting use LZ4 -> [[http://open-zfs.org/wiki/Performance_tuning#Compression]]

  # Check if LZ4 is active
  zpool get feature@lz4_compress rpool

===== RAM requirements =====

ZFS needs a base of about 4 GB of RAM plus 1 GB for each TB of used disk space. This is without dedup or L2ARC.

===== Glossary =====

  * ZPool is the logical unit built from the underlying disks; it is what ZFS uses as storage.
  * ZVol is an emulated block device provided by ZFS.
  * ZIL is the ZFS Intent Log, a small log that ZFS uses to commit synchronous writes quickly.
  * SLOG is a Separate Intent Log, i.e. the ZIL placed on its own device.
  * ARC is the Adaptive Replacement Cache. It is located in RAM and is the level 1 cache.
  * L2ARC is the Level 2 Adaptive Replacement Cache and should be on a fast device (like an SSD).

===== Resources =====

  * [[https://pve.proxmox.com/wiki/ZFS:_Tips_and_Tricks|ZFS: Tips and Tricks]]
  * [[https://pve.proxmox.com/wiki/ZFS:_Tips_and_Tricks#Install_on_a_high_performance_system|Install on a high performance system]]

===== Tuning =====

  zfs set atime=off rpool/data
  # or
  zfs set atime=on rpool/data
  zfs set relatime=on rpool/data

===== Adding SSD cache for HDDs =====

  * A SLOG speeds up synchronous writes only.
  * The ZIL's purpose is to protect you from data loss. It is necessary because the actual ZFS write cache, which is not the ZIL, is handled by system RAM, and RAM is volatile.
  * In the default setup of ZFS, asynchronous writes are not handled by the ZIL.
  * The ZIL doesn't need to be very big. Find the transfer speed of the fastest disk in your array and multiply it by 10 s; this is about how big your ZIL should be.
  * For HDDs, 2 GB of SLOG on an SSD is enough. I noticed a maximum usage of 1.5 GB.
  * [[http://nex7.blogspot.com/2013/04/zfs-intent-log.html]]
  * [[https://www.cyberciti.biz/faq/how-to-add-zil-write-and-l2arc-read-cache-ssd-devices-in-freenas/]]
  * [[https://jrs-s.net/2019/05/02/zfs-sync-async-zil-slog/]]
  * L2ARC is a read cache: the ARC (level 1) lives in memory, the L2ARC (level 2) on disk.
  * The L2ARC requires RAM for its metadata.

  blkid
  /dev/nvme0n1p4: PARTLABEL="ZIL" PARTUUID="d6da74cd-32e7-4286-8e78-ace66ab659b2"
  /dev/nvme0n1p5: PARTLABEL="L2ARC" PARTUUID="60c563fc-91f8-4ec4-afc0-b7794c63f31c"

  zpool add rpool cache 60c563fc-91f8-4ec4-afc0-b7794c63f31c
  zpool add rpool log d6da74cd-32e7-4286-8e78-ace66ab659b2

  zpool status
    pool: rpool
   state: ONLINE
    scan: scrub repaired 0B in 0 days 00:11:23 with 0 errors on Sun May 10 00:35:24 2020
  config:
          NAME                                    STATE     READ WRITE CKSUM
          rpool                                   ONLINE       0     0     0
            sda                                   ONLINE       0     0     0
          logs
            d6da74cd-32e7-4286-8e78-ace66ab659b2  ONLINE       0     0     0
          cache
            60c563fc-91f8-4ec4-afc0-b7794c63f31c  ONLINE       0     0     0

  zpool iostat -v 1

===== remove storage pool =====

  zfs destroy rpool/data

===== create ''local-zfs'' =====

For nodes without ''local-zfs'', for example a custom Debian-based system, it is possible to add ''local-zfs'' storage later.

  zpool create -f -o ashift=13 rpool /dev/sdb
  zfs set compression=lz4 rpool
  zfs create rpool/data
  # You can get a list of available ZFS filesystems with:
  pvesm zfsscan
  zpool status -v
  zfs list

Datacenter --> Storage --> ''local-zfs''

  * Disable node restriction

===== rename zfs pool =====

  zpool checkpoint pve3-nvm
  zpool export pve3-nvm
  zpool import pve3-nvm nvmpool

  * Rename the storage pool and its paths
  * Verify

  # The pool is now called nvmpool, so discard the checkpoint under the new name
  zpool checkpoint --discard nvmpool

===== clean old replication snapshots =====

  # List snapshot names only (no header), then destroy the replication snapshots
  zfs list -H -t snapshot -o name | grep '@__replicate' | while read -r N; do zfs destroy "${N}"; done

===== trim free space =====

  # Trim with speed 50M/s
  zpool trim -r 50M nvmpool
  # And monitor progress:
  zpool status -t nvmpool
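
===== SLOG sizing sketch =====

The sizing rule from "Adding SSD cache for HDDs" (transfer speed of the fastest disk multiplied by 10 s) can be worked out with a small helper. This is only a sketch: the function name ''slog_size_mb'' and the 200 MB/s disk speed are illustrative assumptions, not measurements from this setup.

```shell
#!/bin/sh
# Rough SLOG size estimate: write speed of the fastest disk (MB/s)
# multiplied by a time window in seconds (10 s in the rule of thumb).
# slog_size_mb is a hypothetical helper, not part of ZFS or Proxmox.
slog_size_mb() {
    speed_mb_s=$1        # sequential write speed of the fastest disk, in MB/s
    seconds=${2:-10}     # time window, defaults to 10 s
    echo $(( speed_mb_s * seconds ))
}

# Example with an assumed HDD speed of 200 MB/s
echo "Suggested SLOG size: $(slog_size_mb 200) MB"
```

With the assumed 200 MB/s this gives 2000 MB, which is in line with the 2 GB figure noted for HDDs.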