Encrypted FS
Encrypted partition
apt-get install cryptsetup-bin
Enable HW acceleration (which turns out to be a bit slower than software :P).
NOTE: In kernel 4.2 the CESA driver was completely rewritten to support DMA, and the old mv_cesa driver was removed in kernel 4.15.
Kernel 3.18
modprobe mv_cesa
cat /proc/crypto | grep mv_cesa -B 2 -A 7
It provides only:
- hmac(sha1)
- sha1
- cbc(aes)
- ecb(aes)
There are also additional kernel modules optimised for ARM:
- sha1_arm
- aes_arm
Kernel 5.8
modprobe mv_cesa
cat /proc/crypto | grep cesa -B 2 -A 7
It provides:
- hmac(sha1)
- hmac(md5)
- sha1
- md5
- cbc(aes)
- ecb(aes)
- cbc(des3_ede)
- ecb(des3_ede)
- cbc(des)
- ecb(des)
There are also additional kernel modules optimised for ARM:
- sha1_arm
- aes_arm
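As a rough alternative to the grep with context lines above, the algorithm names offered by one module can be pulled out of /proc/crypto with awk. This is only a sketch: the function name `list_algs_by_module` is made up for illustration, and the module name you pass (for example mv_cesa or marvell_cesa) is assumed to match what your kernel prints in the `module` field.

```shell
# Sketch: print the "name" of every /proc/crypto entry whose "module"
# field equals the given module. The helper name is hypothetical.
list_algs_by_module() {
    # $1 = module name, $2 = file to read (defaults to /proc/crypto)
    awk -v mod="$1" '
        $1 == "name"                { name = $3 }      # remember current entry
        $1 == "module" && $3 == mod { print name }     # entry belongs to module
    ' "${2:-/proc/crypto}"
}
```

Usage would be something like `list_algs_by_module mv_cesa`, which prints one algorithm per line.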
fio benchmark
CESA 4kB:
READ: bw=479KiB/s (491kB/s), 479KiB/s-479KiB/s (491kB/s-491kB/s), io=384MiB (403MB), run=820117-820117msec
WRITE: bw=160KiB/s (164kB/s), 160KiB/s-160KiB/s (164kB/s-164kB/s), io=128MiB (134MB), run=820117-820117msec
CESA 64kB:
READ: bw=5663KiB/s (5799kB/s), 5663KiB/s-5663KiB/s (5799kB/s-5799kB/s), io=382MiB (400MB), run=69037-69037msec
WRITE: bw=1931KiB/s (1977kB/s), 1931KiB/s-1931KiB/s (1977kB/s-1977kB/s), io=130MiB (137MB), run=69037-69037msec
ARM 4kB:
READ: bw=478KiB/s (490kB/s), 478KiB/s-478KiB/s (490kB/s-490kB/s), io=384MiB (403MB), run=822072-822072msec
WRITE: bw=159KiB/s (163kB/s), 159KiB/s-159KiB/s (163kB/s-163kB/s), io=128MiB (134MB), run=822072-822072msec
ARM 64kB:
READ: bw=5622KiB/s (5757kB/s), 5622KiB/s-5622KiB/s (5757kB/s-5757kB/s), io=382MiB (400MB), run=69540-69540msec
WRITE: bw=1917KiB/s (1963kB/s), 1917KiB/s-1917KiB/s (1963kB/s-1963kB/s), io=130MiB (137MB), run=69540-69540msec
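The fio command line itself is not recorded on this page. The totals (roughly 384 MiB read plus 128 MiB written) would be consistent with a 75/25 mixed workload over 512 MiB, so a run along the following lines might reproduce something similar. Everything here is an assumption: the random mixed pattern, the size, the I/O engine, and the target path are guesses, and the helper `fio_cmd` only prints the command instead of running it.

```shell
# Sketch only: prints a fio command line, it does not run fio.
# Assumptions (not from the original page): random mixed I/O with a 75/25
# read/write split, a 512 MiB working set, libaio with direct I/O, and a
# hypothetical target file on the encrypted volume.
fio_cmd() {
    bs=$1                                   # block size, e.g. 4k or 64k
    target=${2:-/mnt/crypttest/fio.dat}     # hypothetical path
    echo fio --name="crypt-$bs" --filename="$target" \
        --rw=randrw --rwmixread=75 --bs="$bs" --size=512M \
        --ioengine=libaio --direct=1
}

fio_cmd 4k
fio_cmd 64k
```

Run the printed commands once with the CESA module loaded and once without it to get a comparable pair of results.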
Benchmark
cryptsetup benchmark
Algorithm | Key | Encryption | Decryption | accel | kernel |
---|---|---|---|---|---|
aes-cbc | 128b | 12.8 MiB/s | 13.4 MiB/s | | 3.18 |
aes-cbc | 128b | 13.4 MiB/s | 14.1 MiB/s | arm | 3.18 |
aes-cbc | 128b | 19.7 MiB/s | 20.2 MiB/s | mv_cesa | 3.18 |
aes-cbc | 128b | 34.9 MiB/s | 36.2 MiB/s | marvell_cesa | 5.8 |
serpent-cbc | 128b | 11.1 MiB/s | 11.6 MiB/s | | 3.18 |
twofish-cbc | 128b | 13.0 MiB/s | 13.4 MiB/s | | 3.18 |
aes-cbc | 256b | 10.1 MiB/s | 10.5 MiB/s | | 3.18 |
aes-cbc | 256b | 11.0 MiB/s | 11.4 MiB/s | arm | 3.18 |
aes-cbc | 256b | 18.9 MiB/s | 19.2 MiB/s | mv_cesa | 3.18 |
aes-cbc | 256b | 32.0 MiB/s | 33.1 MiB/s | marvell_cesa | 5.8 |
serpent-cbc | 256b | 11.1 MiB/s | 11.6 MiB/s | | 3.18 |
twofish-cbc | 256b | 13.0 MiB/s | 13.4 MiB/s | | 3.18 |
aes-xts | 256b | 13.1 MiB/s | 13.3 MiB/s | | 3.18 |
aes-xts | 256b | 14.6 MiB/s | 14.7 MiB/s | arm | 3.18 |
aes-xts | 256b | 23.6 MiB/s | 22.5 MiB/s | marvell_cesa | 5.8 |
serpent-xts | 256b | 11.5 MiB/s | 11.6 MiB/s | | 3.18 |
twofish-xts | 256b | 13.4 MiB/s | 13.2 MiB/s | | 3.18 |
aes-xts | 512b | 10.2 MiB/s | 10.4 MiB/s | | 3.18 |
aes-xts | 512b | 11.4 MiB/s | 11.8 MiB/s | arm | 3.18 |
aes-xts | 512b | 22.5 MiB/s | 23.1 MiB/s | marvell_cesa | 5.8 |
serpent-xts | 512b | 11.5 MiB/s | 11.6 MiB/s | | 3.18 |
twofish-xts | 512b | 13.4 MiB/s | 13.2 MiB/s | | 3.18 |
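Plain `cryptsetup benchmark` runs a fixed set of ciphers. To re-measure a single combination after loading or unloading a crypto module, the `--cipher` and `--key-size` options can be used. The sketch below only prints the commands; the helper name `bench_cmds` and the choice of cipher specs are illustrative, not the exact set used for the table.

```shell
# Sketch: print one "cryptsetup benchmark" command per cipher/key-size pair.
# The spec list is an illustrative selection, written as "cipher/bits".
bench_cmds() {
    for spec in aes-cbc-essiv:sha256/128 aes-cbc-essiv:sha256/256 \
                aes-xts-plain64/256 aes-xts-plain64/512; do
        cipher=${spec%/*}      # part before the slash
        bits=${spec##*/}       # part after the slash
        echo cryptsetup benchmark --cipher "$cipher" --key-size "$bits"
    done
}

bench_cmds
```

Piping the output to `sh` would run the benchmarks one after another.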
Ciphers benchmark
Each cipher was tested with the following steps:
- luksFormat /dev/sda5
- luksOpen /dev/sda5 sda5
- benchmarks described in table below on /dev/mapper/sda5
- create ext4fs on /dev/mapper/sda5
- the same benchmarks but on mounted ext4 (writing/reading from file).
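The steps above can be sketched as a small script. It is a dry run by default and only prints what it would do, because luksFormat destroys all data on the partition. The mount point, the `run` and `test_cipher` helpers, and the `DO_IT` switch are assumptions made for illustration; only hdparm is shown as a benchmark, the dd tests would go where the comment indicates.

```shell
# Sketch only: prints the commands unless DO_IT=1 is set.
# WARNING: with DO_IT=1, luksFormat wipes the target partition.
DEV=${DEV:-/dev/sda5}          # partition named in the text
NAME=${NAME:-sda5}             # mapper name named in the text
MNT=${MNT:-/mnt/crypttest}     # hypothetical mount point

run() {
    # Execute the command only when explicitly enabled, otherwise print it.
    if [ "${DO_IT:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

test_cipher() {
    cipher=$1
    keysize=$2
    run cryptsetup luksFormat --cipher "$cipher" --key-size "$keysize" "$DEV"
    run cryptsetup luksOpen "$DEV" "$NAME"
    run hdparm -t "/dev/mapper/$NAME"        # block-device benchmark
    run mkfs.ext4 "/dev/mapper/$NAME"
    run mount "/dev/mapper/$NAME" "$MNT"
    # ... file-based benchmarks on $MNT would go here ...
    run umount "$MNT"
    run cryptsetup luksClose "$NAME"
}
```

For example, `test_cipher aes-cbc-essiv:sha256 128` prints the seven commands for one table row.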
test | command line | description |
---|---|---|
hdparm | hdparm -t /dev/… | Buffered read test |
WR | dd bs=16M count=128 | Normal buffered transfer, but with sync before exit |
WR S | ||
WR DS | ||
RD |
REMARKS:
- XTS splits the key into two halves (one for the data cipher, one for the tweak), so a 128b cipher needs -s 256.
- Ext4 was created with the default lazy_init to speed up creation, but this can have an impact on the tests.
- Before each test, caches were flushed with sync && echo 3 > …/drop_caches.
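The XTS remark comes down to simple arithmetic: the value passed to -s is twice the cipher strength you want. A tiny sketch, with the helper name `xts_key_bits` made up for illustration:

```shell
# Sketch: XTS needs two keys of the cipher's size, so the -s value doubles.
xts_key_bits() {
    echo $(( $1 * 2 ))
}

# Print (not run) the matching luksFormat commands for AES-128 and AES-256 in XTS mode.
echo "cryptsetup luksFormat --cipher aes-xts-plain64 --key-size $(xts_key_bits 128) /dev/sda5"
echo "cryptsetup luksFormat --cipher aes-xts-plain64 --key-size $(xts_key_bits 256) /dev/sda5"
```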
128b key

cipher mode | acc | hdparm (block) | WR (block) | WR S (block) | WR DS (block) | RD (block) | WR (ext4) | WR S (ext4) | WR DS (ext4) | RD (ext4) | |
---|---|---|---|---|---|---|---|---|---|---|---|
cbc-plain | HW | 8.82 | 7.0 | 6.1 | 7.3 | 9.2 | 8.0 | 5.5 | 5.8 | 9.3 | |
cbc-plain | SW | 11.80 | 8.2 | 7.4 | 8.7 | 12.40 | 9.5 | 6.2 | 6.4 | 12.40 | |
cbc-plain | ARM | 12.76 | 8.9 | 7.2 | 9.2 | 13.60 | 10.2 | 6.4 | 6.4 | 13.60 | * |
cbc-plain64 | HW | 8.79 | 6.9 | 6.1 | 7.5 | 9.2 | 7.9 | 5.3 | 5.6 | 9.0 | |
cbc-plain64 | SW | 11.83 | 8.2 | 7.4 | 9.2 | 12.40 | 9.5 | 6.2 | 6.6 | 12.40 | |
cbc-plain64 | ARM | 12.73 | 8.9 | 7.2 | 9.3 | 13.60 | 10.2 | 6.2 | 6.1 | 13.60 | * |
cbc-essiv:sha256 | HW | 7.7 | 6.2 | 5.5 | 6.9 | 8.1 | 7.2 | 5.2 | 5.2 | 8.1 | |
cbc-essiv:sha256 | SW | 9.7 | 7.8 | 6.9 | 8.7 | 11.40 | 9.1 | 6.2 | 6.5 | 11.40 | |
cbc-essiv:sha256 | ARM | 12.36 | 8.7 | 7.0 | 9.1 | 13.20 | 9.9 | 6.3 | 6.2 | 13.20 | * |
xts-plain | SW | 11.29 | 8.2 | 7.4 | 8.7 | 11.80 | 9.5 | 6.1 | 6.5 | 11.90 | |
xts-plain | ARM | 12.79 | 9.3 | 7.5 | 10.1 | 13.60 | 10.6 | 6.3 | 5.9 | 13.70 | * |
xts-plain64 | SW | 11.27 | 8.2 | 7.4 | 8.7 | 11.80 | 9.5 | 6.2 | 6.5 | 11.70 | |
xts-plain64 | ARM | 12.84 | 9.3 | 7.5 | 10.2 | 13.70 | 10.6 | 6.4 | 6.1 | 13.70 | * |
xts-essiv:sha256 | SW | 10.30 | 7.9 | 7.2 | 8.7 | 11.10 | 9.1 | 6.1 | 6.5 | 11.10 | |
xts-essiv:sha256 | ARM | 12.40 | 9.1 | 7.5 | 9.3 | 13.20 | 10.4 | 6.3 | 6.1 | 13.30 | * |

256b key

cipher mode | acc | hdparm (block) | WR (block) | WR S (block) | WR DS (block) | RD (block) | WR (ext4) | WR S (ext4) | WR DS (ext4) | RD (ext4) | |
---|---|---|---|---|---|---|---|---|---|---|---|
cbc-plain | HW | 8.43 | 6.7 | 6.1 | 7.5 | 8.9 | 7.7 | 5.5 | 5.7 | 8.9 | |
cbc-plain | SW | 9.17 | 6.7 | 6.1 | 7.4 | 9.6 | 7.7 | 5.5 | 5.8 | 9.6 | |
cbc-plain | ARM | 10.32 | 7.6 | 6.3 | 7.9 | 10.80 | 8.5 | 5.5 | 6.0 | 10.80 | * |
cbc-plain64 | HW | 8.44 | 6.7 | 6.1 | 7.5 | 8.9 | 7.7 | 5.5 | 5.7 | 8.8 | |
cbc-plain64 | SW | 9.15 | 6.8 | 6.1 | 7.5 | 9.5 | 7.6 | 5.5 | 5.8 | 9.7 | |
cbc-plain64 | ARM | 10.24 | 7.6 | 6.2 | 7.8 | 10.70 | 8.4 | 5.1 | 5.5 | 10.00 | * |
cbc-essiv:sha256 | HW | 7.47 | 6.0 | 5.5 | 6.5 | 7.8 | 6.9 | 5.0 | 5.2 | 7.8 | |
cbc-essiv:sha256 | SW | 8.59 | 6.7 | 6.1 | 7.5 | 9.0 | 7.5 | 5.3 | 5.5 | 8.9 | |
cbc-essiv:sha256 | ARM | 9.83 | 7.5 | 6.2 | 7.9 | 10.50 | 8.3 | 5.5 | 5.7 | 10.60 | * |
xts-plain | SW | 8.70 | 6.8 | 6.1 | 7.5 | 9.1 | 7.6 | 5.5 | 5.6 | 9.2 | |
xts-plain | ARM | 10.09 | 7.9 | 6.6 | 8.5 | 10.7 | 8.8 | 5.2 | 5.6 | 10.80 | * |
xts-plain64 | SW | 8.70 | 6.8 | 6.1 | 7.5 | 9.2 | 7.6 | 5.5 | 5.6 | 9.2 | |
xts-plain64 | ARM | 10.14 | 7.9 | 6.6 | 8.4 | 10.80 | 8.8 | 5.4 | 5.7 | 10.80 | * |
xts-essiv:sha256 | SW | 8.37 | 6.7 | 6.1 | 7.0 | 8.8 | 7.3 | 5.1 | 5.4 | 8.4 | |
xts-essiv:sha256 | ARM | 9.94 | 7.7 | 6.3 | 7.9 | 10.40 | 8.5 | 4.9 | 5.2 | 9.7 | |

without encryption

device | hdparm (block) | WR (block) | WR S (block) | WR DS (block) | RD (block) | WR (ext4) | WR S (ext4) | WR DS (ext4) | RD (ext4) |
---|---|---|---|---|---|---|---|---|---|
/dev/sda5 | 137 | 91 | 33.7 | 51.7 | 149 | 69 | 13 | 15 | 149 |
file copy benchmark
Copy using dd if=src_file of=dst_file conv=fsync
“It will synchronize output data and metadata just before finishing”
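A minimal sketch of such a copy, wrapped in a helper so that only dd's final statistics line (bytes copied, time, throughput with GNU dd) is shown. The function name `copy_fsync` is made up for illustration.

```shell
# Sketch: copy a file with dd, forcing data and metadata to disk before
# dd exits, and show only dd's last status line.
copy_fsync() {
    # $1 = source file, $2 = destination file
    dd if="$1" of="$2" conv=fsync 2>&1 | tail -n 1
}
```

Something like `copy_fsync src_file dst_file` then gives one result line per run, which is easy to collect into a table.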
128b key

cipher | acc | WR S | RD | |
---|---|---|---|---|
aes-cbc-plain64 | HW | 5.8 | 8.1 | |
aes-cbc-plain64 | SW | 6.4 | 10.60 | |
aes-cbc-plain64 | ARM | 6.8 | 12.00 | * |
twofish-cbc-plain64 | SW | 6.5 | 10.60 | |
aes-cbc-essiv:sha256 | HW | 5.4 | 7.1 | |
aes-cbc-essiv:sha256 | SW | 6.3 | 10.30 | |
aes-cbc-essiv:sha256 | ARM | 6.6 | 11.10 | |
twofish-cbc-essiv:sha256 | SW | 6.5 | 10.70 | |
aes-xts-plain64 | SW | 6.4 | 10.20 | |
aes-xts-plain64 | ARM | 7.0 | 12.10 | * |
twofish-xts-plain64 | SW | 6.6 | 11.00 | |
twofish-xts-essiv:sha256 | SW | 6.4 | 10.50 | |

256b key

cipher | acc | WR S | RD | |
---|---|---|---|---|
aes-cbc-plain64 | HW | 5.8 | 8.3 | |
aes-cbc-plain64 | SW | 5.5 | 8.4 | |
aes-cbc-plain64 | ARM | 5.9 | 9.5 | * |
twofish-cbc-plain64 | SW | 6.6 | 11.00 | * |
aes-cbc-essiv:sha256 | HW | 5.5 | 7.3 | |
aes-cbc-essiv:sha256 | SW | 5.4 | 8.0 | |
aes-cbc-essiv:sha256 | ARM | 5.9 | 9.6 | * |
twofish-cbc-essiv:sha256 | SW | 6.5 | 10.70 | * |
aes-xts-plain64 | SW | 5.5 | 8.2 | |
aes-xts-plain64 | ARM | 6.1 | 9.4 | * |
twofish-xts-plain64 | SW | 6.6 | 10.90 | * |
twofish-xts-essiv:sha256 | SW | 6.3 | 10.10 | * |
loaded CPU benchmark
Comparison SW & HW with loaded system
stress -v -c 1
cipher | acc | hdparm (block) | WR (block) | WR S (block) | WR DS (block) | RD (block) | WR (ext4) | WR S (ext4) | WR DS (ext4) | RD (ext4) | |
---|---|---|---|---|---|---|---|---|---|---|---|
cbc-plain-128 | HW | 4.71 | 3.9 | 3.6 | 3.8 | 4.9 | 4.1 | 3.2 | 3.4 | 4.9 | |
cbc-plain-128 | SW | 6.13 | 4.4 | 3.9 | 5.3 | 6.5 | 5.0 | 4.0 | 4.0 | 6.4 | |
cbc-plain-128 | ARM | 6.64 | 4.8 | 4.2 | 5.4 | 7.0 | 5.3 | 4.0 | 4.2 | 7.0 | * |
cbc-plain-256 | HW | 4.68 | 3.8 | 3.4 | 3.9 | 4.9 | 4.0 | 3.2 | 3.2 | 4.9 | |
cbc-plain-256 | SW | 4.73 | 3.6 | 3.4 | 4.0 | 5.0 | 4.0 | 3.2 | 3.3 | 5.0 | |
cbc-plain-256 | ARM | 5.31 | 4.1 | 3.6 | 4.4 | 5.6 | 4.4 | 3.4 | 3.6 | 5.6 | |
Twofish cipher
(SW only)
cipher mode | key | hdparm (block) | WR (block) | WR S (block) | WR DS (block) | RD (block) | WR (ext4) | WR S (ext4) | WR DS (ext4) | RD (ext4) |
---|---|---|---|---|---|---|---|---|---|---|
cbc-plain | 128 | 11.80 | 8.4 | 7.4 | 9.5 | 12.4 | 9.6 | 6.0 | 6.1 | 11.5 |
cbc-essiv:sha256 | 128 | 11.35 | 8.2 | 7.4 | 8.7 | 11.9 | 9.5 | 6.2 | 6.5 | 11.9 |
xts-plain | 128 | 11.61 | 8.4 | 7.4 | 9.4 | 12.2 | 9.5 | 6.2 | 6.6 | 12.3 |
xts-essiv:sha256 | 128 | 11.06 | 8.0 | 7.4 | 8.7 | 11.6 | 9.1 | 6.2 | 6.5 | 11.7 |
cbc-plain | 256 | 11.82 | 8.4 | 7.4 | 9.5 | 12.4 | 9.7 | 6.5 | 6.6 | 12.4 |
cbc-essiv:sha256 | 256 | 11.34 | 8.2 | 7.4 | 8.7 | 11.9 | 9.5 | 6.2 | 6.6 | 12.0 |
xts-plain | 256 | 11.64 | 8.4 | 7.4 | 9.4 | 12.2 | 9.6 | 6.2 | 6.6 | 12.3 |
xts-essiv:sha256 | 256 | 11.04 | 8.0 | 7.4 | 8.7 | 11.6 | 9.3 | 6.2 | 6.5 | 11.7 |
SSH performance
Enable low-complexity ciphers if the device is used locally.
ssh -Q cipher localhost | paste -d , -s
- /etc/ssh/sshd_config
# enable all ciphers!
# obtained with ssh -Q cipher localhost | paste -d , -s
Ciphers 3des-cbc,blowfish-cbc,cast128-cbc,arcfour,arcfour128,arcfour256,aes128-cbc,aes192-cbc,aes256-cbc,rijndael-cbc@lysator.liu.se,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com,chacha20-poly1305@openssh.com
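The Ciphers line can be generated instead of pasted by hand. A small sketch, assuming the cipher names arrive one per line on standard input (as `ssh -Q cipher` prints them); the helper name `ciphers_line` is made up for illustration.

```shell
# Sketch: turn a newline-separated cipher list on stdin into a single
# "Ciphers a,b,c" line suitable for sshd_config.
ciphers_line() {
    printf 'Ciphers %s\n' "$(paste -d , -s -)"
}
```

For example, `ssh -Q cipher localhost | ciphers_line` prints the line for whatever ciphers the installed OpenSSH build supports, which avoids listing names that newer versions have dropped.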
cmd | performance | time | Kernel 5.8 |
---|---|---|---|
(default) | 3.1MB/s | | |
3des-cbc | 1.2MB/s | 1m28 | 67.9 MB/s |
blowfish-cbc | 3.3MB/s | 0m30 | 245.7 MB/s |
cast128-cbc | 2.9MB/s | 0m34 | 248.8 MB/s |
arcfour | 4.2MB/s | 0m24 | 425.5 MB/s |
arcfour128 | -- | -- | 395.3 MB/s |
arcfour256 | 4.6MB/s | 0m22 | 425.5 MB/s |
aes128-cbc | 2.8MB/s | 0m37 | 228.8 MB/s |
aes192-cbc | 2.9MB/s | 0m34 | 211.4 MB/s |
aes256-cbc | 2.5MB/s | 0m40 | 192.3 MB/s |
rijndael-cbc@lysator.liu.se | 2.8MB/s | 0m36 | 192.3 MB/s |
aes128-ctr | 2.9MB/s | 0m35 | 223.2 MB/s |
aes192-ctr | 2.9MB/s | 0m35 | 202.8 MB/s |
aes256-ctr | 2.9MB/s | 0m40 | 191.6 MB/s |
aes128-gcm@openssh.com | 2.6MB/s | 0m39 | 170.7 MB/s |
aes256-gcm@openssh.com | 2.2MB/s | 0m47 | 151.7 MB/s |
chacha20-poly1305@openssh.com | 3.2MB/s | 0m32 | 268.8 MB/s |
fsck performance
- Kernel 5.8 CESA: user 1m32,738s sys 0m9,904s
- Kernel 5.8 ARM: user 1m32,642s sys 0m9,177s
Rsync with SSH tunnel performance
When rsyncing a huge file with delta transfers, rsync first has to scan the whole file to compute checksums.
Kernel 5.8 ARM
134,113,800 1% 7.99MB/s 0:16:11
142,909,200 1% 8.04MB/s 0:16:03
151,768,800 1% 8.33MB/s 0:15:29
159,151,800 1% 8.11MB/s 0:15:53
175,715,400 2% 7.74MB/s 0:16:36
192,664,200 2% 7.86MB/s 0:16:20
200,111,400 2% 7.61MB/s 0:16:50
217,060,200 2% 7.63MB/s 0:16:45
234,715,200 2% 8.19MB/s 0:15:34
243,382,200 3% 8.17MB/s 0:15:37
250,572,600 3% 7.94MB/s 0:16:02
259,432,200 3% 7.98MB/s 0:15:57
268,227,600 3% 7.98MB/s 0:15:55
334,674,600 4% 8.19MB/s 0:15:24

%Cpu(s): 50,0 us, 50,0 sy, 0,0 ni, 0,0 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
32375 root 20 0 0 0 0 R 63,2 0,0 1:14.38 kworker/u2:3-kcryptd/253:0
16281 root 20 0 14808 6600 3028 R 36,8 2,7 0:04.58 rsync

%Cpu(s): 30,8 us, 65,4 sy, 0,0 ni, 0,0 id, 0,0 wa, 0,0 hi, 3,8 si, 0,0 st
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
16360 root 20 0 0 0 0 R 57,1 0,0 0:00.35 kworker/u2:1+kcryptd/253:0
16281 root 20 0 14808 6604 3028 D 28,6 2,7 0:06.54 rsync

%Cpu(s): 29,0 us, 71,0 sy, 0,0 ni, 0,0 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
16360 root 20 0 0 0 0 R 58,3 0,0 0:01.74 kworker/u2:1+kcryptd/253:0
16281 root 20 0 14808 6604 3028 R 29,2 2,7 0:07.24 rsync

%Cpu(s): 26,9 us, 73,1 sy, 0,0 ni, 0,0 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31416 root 20 0 0 0 0 R 50,0 0,0 0:06.85 kworker/u2:2+kcryptd/253:0
16281 root 20 0 14808 6608 3028 D 27,3 2,7 0:08.21 rsync

%Cpu(s): 33,3 us, 66,7 sy, 0,0 ni, 0,0 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
16360 root 20 0 0 0 0 R 63,6 0,0 0:08.34 kworker/u2:1+kcryptd/253:0
16281 root 20 0 14808 6612 3028 R 27,3 2,7 0:10.61 rsync
Kernel 5.8 CESA
178,540,200 2% 7.46MB/s 0:17:15
185,987,400 2% 7.35MB/s 0:17:29
209,613,000 2% 7.39MB/s 0:17:20
271,758,600 3% 7.30MB/s 0:17:24
287,680,200 3% 7.31MB/s 0:17:20
334,995,600 4% 7.30MB/s 0:17:16
358,300,200 4% 7.41MB/s 0:16:57
761,026,800 9% 7.07MB/s 0:16:51
769,051,800 9% 7.15MB/s 0:16:38
777,397,800 9% 7.40MB/s 0:16:03

%Cpu(s): 31,3 us, 68,1 sy, 0,0 ni, 0,0 id, 0,0 wa, 0,0 hi, 0,6 si, 0,0 st
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
29705 root 20 0 0 0 0 R 34,4 0,0 1:40.62 kworker/u2:0+kcryptd/253:0
30062 root -51 0 0 0 0 S 32,5 0,0 1:10.38 irq/29-f1030000
32091 root 20 0 14808 6516 2908 D 28,7 2,7 0:24.76 rsync

%Cpu(s): 36,4 us, 63,6 sy, 0,0 ni, 0,0 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
29705 root 20 0 0 0 0 R 28,0 0,0 1:41.40 kworker/u2:0+kcryptd/253:0
30062 root -51 0 0 0 0 S 28,0 0,0 1:11.12 irq/29-f1030000
32091 root 20 0 14808 6516 2908 D 24,0 2,7 0:25.41 rsync

%Cpu(s): 26,5 us, 73,5 sy, 0,0 ni, 0,0 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
29705 root 20 0 0 0 0 R 30,8 0,0 1:42.31 kworker/u2:0+kcryptd/253:0
30062 root -51 0 0 0 0 S 26,9 0,0 1:11.98 irq/29-f1030000
32091 root 20 0 14808 6520 2908 D 19,2 2,7 0:26.17 rsync

%Cpu(s): 29,0 us, 71,0 sy, 0,0 ni, 0,0 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
29705 root 20 0 0 0 0 R 25,9 0,0 1:42.76 kworker/u2:0+kcryptd/253:0
30062 root -51 0 0 0 0 S 22,2 0,0 1:12.41 irq/29-f1030000
32091 root 20 0 14808 6520 2908 D 18,5 2,7 0:26.54 rsync
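To compare the two runs by a single number, the MB/s samples in the rsync progress output can be averaged. The sketch below is one possible way to do that with awk; the function name `avg_rsync_speed` is made up, and it assumes the progress text is fed on standard input with the speed written as a number directly followed by MB/s.

```shell
# Sketch: average every "<number>MB/s" token found on stdin.
avg_rsync_speed() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /MB\/s$/) {
                v = $i
                sub(/MB\/s$/, "", v)    # strip the unit, keep the number
                sum += v
                n++
            }
        }
    }
    END { if (n) printf "%.2f\n", sum / n }'
}
```

Saving each progress log to a file and running `avg_rsync_speed < logfile` for the ARM and CESA runs gives two directly comparable figures.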