linux:fs:zfs:dedup (niziak, last revised 2021/03/14)
For deduplication, plan RAM according to these rules of thumb:
  * For every TB of pool data, you should expect 5 GB of dedup table data, assuming an average block size of 64K.
  * This means you should plan for at least 20 GB of system RAM per TB of pool data if you want to keep the dedup table in RAM, plus any extra memory for other metadata, plus an extra GB for the OS.

[[https://
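The rule of thumb above can be turned into a quick back-of-the-envelope calculation. A sketch only; ''POOL_TB'' and the derived variables are placeholders, not from the original page:

```shell
# Rough DDT sizing from the rule of thumb above (64K average block size).
# POOL_TB is a placeholder -- set it to your pool's data size in TB.
POOL_TB=4
DDT_GB=$((POOL_TB * 5))        # ~5 GB of dedup table per TB of pool data
RAM_GB=$((POOL_TB * 20 + 1))   # ~20 GB RAM per TB to keep the DDT cached, +1 GB for the OS
echo "DDT: ~${DDT_GB} GB, plan for: ~${RAM_GB} GB RAM"
```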
===== THINK TWICE! =====

Never turn on deduplication for a whole pool. It is not possible to turn it off again without sending the whole pool to another ZFS pool and receiving it back.
It is also best to have plenty of RAM so the whole DDT fits in RAM, not on SSD/NVMe.

Huge CPU usage by over 96 ZFS kernel threads was noticed with OpenZFS v8.0.6 (ZFS on Linux) when big parts of the data were deleted (automatic snapshot rotation). This is connected with deduplication being enabled and causes the system to almost freeze because of the high CPU usage!
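Since deduplicated blocks cannot be rewritten in place, the only escape route is a full send/receive cycle. A hedged sketch of that procedure; the pool names ''rpool'' and ''backup'' and the snapshot name ''@migrate'' are placeholders, and this assumes a second pool large enough to hold the un-deduped data:

```shell
# Sketch only -- names are placeholders. Data is rewritten without dedup
# on the receiving side, so the target pool needs the full logical size.
zfs set dedup=off rpool                                  # stop dedup for new writes only
zfs snapshot -r rpool@migrate
zfs send -R rpool@migrate | zfs receive -F backup/rpool  # rewrites all blocks un-deduped
```

The DDT itself only disappears once every dataset that ever held deduped blocks has been destroyed on the source pool.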
==== WARNING! ====

There is an issue when deleting a large portion of data with deduplication enabled:
the ZFS driver creates 96 kernel threads, and these threads kill CPU and IO.

  * [[https://
  * [[https://
  * [[https://
  * [[https://
===== Turn on =====

Once all deduped datasets are destroyed, the dedup table will be removed and the performance impact is cleared.
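Given the warnings above, it is safer to enable dedup per dataset rather than pool-wide. A minimal sketch; the dataset name ''rpool/data'' is a placeholder, not from the original page:

```shell
# Enable dedup only for one dataset (placeholder name), not the whole pool.
zfs set dedup=on rpool/data
zfs get dedup rpool/data        # verify the property took effect

# Turning it off later only affects new writes; blocks already written
# stay in the DDT until the dataset is destroyed.
zfs set dedup=off rpool/data
```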
===== status =====

<code bash>
zfs get dedup | egrep '
</code>
<code bash>
zpool list rpool

NAME   SIZE  ALLOC
rpool
</code>
<code bash>
zpool status -D rpool
zdb -DD rpool
</code>
<code>
DDT-sha256-zap-duplicate: 550683 entries, size 474 on disk, 153 in core
DDT-sha256-zap-unique:

DDT histogram (aggregated over all DDTs):

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   ------   -----   -----
    16    7.61K    144M   79.0M   88.9M
    32      564   8.93M   3.35M   4.02M    22.8K    382M    146M    174M
    64      110   1.32M    588K    776K    9.46K    113M   51.6M   67.3M
    1K        6   43.5K
    2K        1   36.5K      8K      8K    2.08K   75.9M   16.6M   16.6M

dedup = 1.20, compress = 1.07, copies = 1.00, dedup * compress / copies = 1.27
</code>
Where DDT table memory usage can be calculated from the ''zdb -DD'' header lines:
  * ''entries'' × ''in core'' size = RAM used by the dedup table
  * ''entries'' × ''on disk'' size = space the dedup table occupies on disk
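Plugging in the numbers from the ''zdb'' output above (only the duplicate table is readable on this page, so this is a lower bound for the whole DDT):

```shell
# DDT RAM usage = entries x "in core" bytes per entry.
# 550683 entries x 153 bytes, from the duplicate table shown above
# (the unique-table line is truncated here, so the real total is higher).
awk 'BEGIN { printf "%.1f MiB\n", 550683 * 153 / 1024 / 1024 }'
```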
SIZES:
  * DSIZE: (on-disk size) the pool stores 446 GB of data on 373 GB of disk (446 / 373 = 1.195 dedup ratio)
  * LSIZE: (logical size) size of the data before compression and deduplication
  * PSIZE: (physical size) size required to store all data after compression but before deduplication; DSIZE is what remains after deduplication
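As a quick sanity check, the ''dedup = 1.20'' ratio reported by ''zdb'' above can be reproduced from those two sizes:

```shell
# dedup ratio = logical data referenced / data allocated on disk (446 GB / 373 GB)
awk 'BEGIN { printf "%.2f\n", 446 / 373 }'
```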