====== ZFS ======
  
[[https://arstechnica.com/information-technology/2020/05/zfs-101-understanding-zfs-storage-and-performance/|ZFS 101—Understanding ZFS storage and performance]]

[[https://lists.debian.org/debian-user/2012/05/msg01026.html]]

Features:
  * data pools (tanks) are abstractions that aggregate block devices (simple, mirror, raidz, spares, etc.)
  * a dataset is created on a data pool or on another (parent) dataset
  * the whole pool space is shared between datasets (no fixed partition size problem); the size of a dataset (and its descendants) can be limited with a quota
  * compression
  * block-level deduplication (not usable for e-mails with attachments, where attachments end up shifted to different offsets)

OpenZFS 2.0.0 (Dec 2020) [[https://github.com/openzfs/zfs/releases/tag/zfs-2.0.0]]:
  * Sequential resilver (rebuilds only the portions actually used by data)
  * Persistent L2ARC cache (survives reboots)
  * ZSTD compression
  * Redacted replication (replicate with some data excluded)
  * FreeBSD and Linux unification

Proposed use case: a POOL created on an encrypted LUKS block device.
  
<code>
 POOL
   |-- /filer (quota)
       |- foto
       |- mp3 (dedup)
       |- movies
       +- backup (copies=2, compression)
   |
   |-- /home (compression, dedup, quota)
   +-- /var (quota)
         +- log (compression)
</code>
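
A minimal sketch of how this layout could be created, assuming the LUKS container is already opened as /dev/mapper/cryptpool (device name and quota sizes are assumptions, not taken from the setup above):

<code bash>
# pool on top of the opened LUKS device
zpool create POOL /dev/mapper/cryptpool

# datasets share the pool space; per-dataset properties set at creation time
zfs create -o quota=500G POOL/filer
zfs create POOL/filer/foto
zfs create -o dedup=on POOL/filer/mp3
zfs create POOL/filer/movies
zfs create -o copies=2 -o compression=on POOL/filer/backup

zfs create -o compression=on -o dedup=on -o quota=100G POOL/home
zfs create -o quota=20G POOL/var
zfs create -o compression=on POOL/var/log
</code>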

===== ZFS implementations =====

ZFS-Fuse 0.7 uses the old pool version 23, while [[http://zfsonlinux.org|ZFSonLinux]] uses pool version 28.
[[http://exitcode.de/?p=106|zfs-fuse vs. zfsonlinux]]

===== Creating ZFS dataset =====

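The example pool below is built from loopback devices. A sketch of how such devices could be prepared (backing file paths and sizes are assumptions):

<code bash>
# create sparse backing files and attach them to loop devices
for i in 0 1 2 3; do
    truncate -s 200M /tmp/zfs$i.img
    losetup /dev/loop$i /tmp/zfs$i.img
done
</code>

The pool itself is then created from these devices: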
<code bash>
zpool create INBOX /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
</code>
  
<code bash>
# zpool list
NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
</code>
  
<code bash>
# zpool status
  pool: INBOX
</code>
  
 Dataset "INBOX" is also automatically created based on zpool name "INBOX". It is mounted as /INBOX Dataset "INBOX" is also automatically created based on zpool name "INBOX". It is mounted as /INBOX

<code bash>
# zfs list
NAME    USED  AVAIL  REFER  MOUNTPOINT
INBOX   400K   748M   112K  /INBOX
</code>

===== Mount dataset =====

<code>
zfs mount INBOX
</code>
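
A few related standard zfs mount subcommands for reference:

<code bash>
zfs mount          # list datasets currently mounted by ZFS
zfs mount -a       # mount all mountable datasets
zfs unmount INBOX  # unmount the dataset again
</code>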

===== Create more datasets in pool =====

<code>
zfs create <pool name>/<data set name>
</code>
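
For example, with the INBOX pool created above (the dataset names here are only illustrative):

<code bash>
zfs create INBOX/mail                   # mounted as /INBOX/mail by default
zfs create -o quota=1G INBOX/mail/spam  # properties can also be set at creation time
</code>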

===== Add new block device (disk) to online pool =====

<code>
zpool add INBOX /dev/loop4
</code>
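
The result can be verified immediately on the running pool:

<code bash>
zpool status INBOX   # the new device appears as an additional top-level vdev
zpool list INBOX     # SIZE and FREE grow accordingly
</code>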

===== Deduplication =====

<code>
zfs set dedup=on INBOX
</code>
The new attribute applies only to newly written data.
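
The achieved ratio can be read back from the pool at any time:

<code bash>
zpool get dedupratio INBOX   # read-only pool property
zpool list INBOX             # the DEDUP column shows the same ratio
</code>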

Tests:
For the tests I used three 16 MB files of random data (from /dev/urandom): B1, B2, and B3.
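
A sketch of how such files could be generated (paths are assumptions):

<code bash>
for f in B1 B2 B3; do
    dd if=/dev/urandom of=/INBOX/$f bs=1M count=16
done
</code>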
These three files take 38.6M on disk:
<code>
# zdb -S INBOX
Simulated DDT histogram:

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1      309   38.6M   38.6M   38.6M      309   38.6M   38.6M   38.6M
 Total      309   38.6M   38.6M   38.6M      309   38.6M   38.6M   38.6M

dedup = 1.00, compress = 1.00, copies = 1.00, dedup * compress / copies = 1.00
</code>

Additionally, one big file with content B1|B2|B3 was added to the filesystem:
<code>
# zdb -S INBOX
Simulated DDT histogram:

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     2      384     48M     48M     48M      768     96M     96M     96M
 Total      384     48M     48M     48M      768     96M     96M     96M

dedup = 2.00, compress = 1.00, copies = 1.00, dedup * compress / copies = 2.00
</code>

Additionally, one big file with content B1|B2|B3|B1|B2|B3 was added to the filesystem:
<code>
# zdb -S INBOX
Simulated DDT histogram:

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     4      384     48M     48M     48M    1.50K    192M    192M    192M
 Total      384     48M     48M     48M    1.50K    192M    192M    192M

dedup = 4.00, compress = 1.00, copies = 1.00, dedup * compress / copies = 4.00
</code>

Next, a new file with content 0|B1|B2|B3 (one dummy byte followed by B1|B2|B3) was added:
<code>
# zdb -S INBOX
Simulated DDT histogram:

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1      385   48.1M   48.1M   48.1M      385   48.1M   48.1M   48.1M
     4      384     48M     48M     48M    1.50K    192M    192M    192M
 Total      769   96.1M   96.1M   96.1M    1.88K    240M    240M    240M

dedup = 2.50, compress = 1.00, copies = 1.00, dedup * compress / copies = 2.50
</code>

**So ZFS cannot match shifted data and deduplicate it!**

An additional simple test with two files:
|0|B1|0|B2|0|B3|0|
|0|B1|B2|B3|
Only the common beginning of both files, |0|B1|, was deduplicated (16 MB saved).

ZFS provides block-level deduplication based on block checksums, which come almost for free.
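
For reference, a minimal sketch of how the concatenated and shifted test files described above could be produced (file names are assumptions):

<code bash>
cd /INBOX
cat B1 B2 B3 > C1                      # aligned copy: deduplicates fully
cat B1 B2 B3 B1 B2 B3 > C2             # doubled copy: refcnt rises to 4
{ printf '\0'; cat B1 B2 B3; } > C3    # shifted by one byte: no deduplication
</code>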

===== Compression =====

Enable compression and deduplication on a parent dataset (they will be inherited by child datasets):

<code>
zfs set compression=on INBOX
</code>
Possible values: compression = on | off | lzjb | gzip | gzip-[1-9] | zle
The new attribute applies only to newly written data. For test data I used a Maildir with some huge e-mails.

^ compression   ^ logical size ^ physical size ^ ratio ^
| off           | 702 MB       | 703 MB        | 1.0   |
| on = lzjb     | 702 MB       | 531 MB        | 1.32  |
| gzip-1        | 702 MB       | 374 MB        | 1.87  |
| gzip = gzip-6 | 702 MB       | 359 MB        | 1.95  |
| gzip-9        | 702 MB       | 353 MB        | 1.96  |
| squashfs      |              | 365 MB        |       |
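
The logical versus physical sizes in the table can also be read directly from dataset properties (the logicalused property is available in ZFSonLinux):

<code bash>
zfs get used,logicalused INBOX   # physical vs. logical space consumed
</code>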

<code bash>
zdb -S INBOX
zdb -b INBOX
</code>

<code bash>
zfs get compressratio
</code>

===== References =====

  * [[http://docs.oracle.com/cd/E19253-01/819-5461/6n7ht6qu6/index.html]]
  * [[https://wiki.freebsd.org/ZFSQuickStartGuide]]
  * [[http://www.funtoo.org/ZFS_Fun]]
  * [[http://constantin.glez.de/blog/2011/07/zfs-dedupe-or-not-dedupe]]
  * [[http://www.oracle.com/technetwork/articles/servers-storage-admin/o11-113-size-zfs-dedup-1354231.html]]