I follow ZFS development quite closely and understand that the ZVOL code in ZFS isn't optimal and needs quite a bit of rework for performance (no one is currently sponsoring this), which made me question why Proxmox chose ZVOLs over QCOW2. (Note that QCOW2 isn't COW on COW; the file format just has the ability to do COW given a template.) The current Proxmox code for creating QCOW2 files isn't optimal either, so I had to edit a few files to add `extended_l2=on` and `cluster_size=128k`, and finally `l2-cache-size=64M` because extended_l2 doubles the RAM requirements (l2-cache-size shouldn't matter here given the disk size).
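To put numbers on the L2 cache remark, here is a rough sketch of the arithmetic, assuming the usual qcow2 rule that each L2 entry maps one cluster and takes 8 bytes (16 bytes with `extended_l2=on`). The function names are made up for illustration and are not part of QEMU or Proxmox.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def l2_entry_size(extended_l2: bool) -> int:
    """Bytes per L2 entry: 8 for standard qcow2, 16 with extended_l2=on."""
    return 16 if extended_l2 else 8


def covered_disk_bytes(l2_cache_bytes: int, cluster_bytes: int, extended_l2: bool) -> int:
    """Guest disk space that a given L2 cache can map without evictions."""
    # Each cached L2 entry maps exactly one cluster of guest data.
    entries = l2_cache_bytes // l2_entry_size(extended_l2)
    return entries * cluster_bytes


def required_l2_cache_bytes(disk_bytes: int, cluster_bytes: int, extended_l2: bool) -> int:
    """L2 cache needed to map the whole disk."""
    clusters = -(-disk_bytes // cluster_bytes)  # ceiling division
    return clusters * l2_entry_size(extended_l2)


if __name__ == "__main__":
    cache = 64 * MIB      # l2-cache-size=64M from the post
    cluster = 128 * KIB   # cluster_size=128k from the post
    for ext in (False, True):
        covered = covered_disk_bytes(cache, cluster, ext)
        print(f"extended_l2={ext}: 64 MiB of L2 cache maps {covered // GIB} GiB")
```

Under those assumptions a 64 MiB cache maps 1024 GiB with standard entries and 512 GiB with extended entries, which is why the setting shouldn't matter for a disk smaller than that.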
The VM with QCOW2-backed storage:
Code:
randrw: (g=0): rw=randrw, bs=(R) 4096B-128KiB, (W) 4096B-128KiB, (T) 4096B-128KiB, ioengine=psync, iodepth=1
...
fio-3.25
Starting 4 processes
randrw: (groupid=0, jobs=4): err= 0: pid=1736: Sat Dec 25 23:13:16 2021
read: IOPS=5101, BW=242MiB/s (254MB/s)(85.2GiB/360006msec)
clat (nsec): min=661, max=108429k, avg=598680.35, stdev=1344388.05
lat (nsec): min=681, max=108429k, avg=598946.47, stdev=1344899.03
clat percentiles (usec):
| 1.00th=[ 15], 5.00th=[ 77], 10.00th=[ 92], 20.00th=[ 114],
| 30.00th=[ 133], 40.00th=[ 155], 50.00th=[ 182], 60.00th=[ 221],
| 70.00th=[ 314], 80.00th=[ 824], 90.00th=[ 1385], 95.00th=[ 2311],
| 99.00th=[ 6194], 99.50th=[ 8848], 99.90th=[15795], 99.95th=[19006],
| 99.99th=[27132]
bw ( KiB/s): min=19352, max=568085, per=100.00%, avg=249212.36, stdev=15444.33, samples=2852
iops : min= 296, max= 9405, avg=5119.07, stdev=307.22, samples=2852
write: IOPS=5101, BW=242MiB/s (254MB/s)(85.1GiB/360006msec); 0 zone resets
clat (nsec): min=972, max=107342k, avg=168551.40, stdev=842880.09
lat (nsec): min=1032, max=107819k, avg=170752.52, stdev=848276.64
clat percentiles (usec):
| 1.00th=[ 3], 5.00th=[ 5], 10.00th=[ 7], 20.00th=[ 10],
| 30.00th=[ 14], 40.00th=[ 19], 50.00th=[ 25], 60.00th=[ 33],
| 70.00th=[ 45], 80.00th=[ 73], 90.00th=[ 169], 95.00th=[ 578],
| 99.00th=[ 3458], 99.50th=[ 5014], 99.90th=[10159], 99.95th=[13435],
| 99.99th=[25822]
bw ( KiB/s): min=18432, max=600231, per=100.00%, avg=248895.02, stdev=15653.44, samples=2852
iops : min= 282, max= 9488, avg=5118.89, stdev=310.84, samples=2852
lat (nsec) : 750=0.01%, 1000=0.01%
lat (usec) : 2=0.09%, 4=2.02%, 10=8.82%, 20=10.89%, 50=15.41%
lat (usec) : 100=12.01%, 250=29.20%, 500=6.71%, 750=2.19%, 1000=2.23%
lat (msec) : 2=6.37%, 4=2.58%, 10=1.23%, 20=0.21%, 50=0.03%
lat (msec) : 100=0.01%, 250=0.01%
cpu : usr=2.91%, sys=42.12%, ctx=1864334, majf=0, minf=84
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=1836467,1836399,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=242MiB/s (254MB/s), 242MiB/s-242MiB/s (254MB/s-254MB/s), io=85.2GiB (91.5GB), run=360006-360006msec
WRITE: bw=242MiB/s (254MB/s), 242MiB/s-242MiB/s (254MB/s-254MB/s), io=85.1GiB (91.4GB), run=360006-360006msec
Disk stats (read/write):
sda: ios=1829214/1803227, merge=0/20385739, ticks=667640/3863084, in_queue=4530725, util=98.23%
The VM with ZVOL-backed storage:
Code:
randrw: (g=0): rw=randrw, bs=(R) 4096B-128KiB, (W) 4096B-128KiB, (T) 4096B-128KiB, ioengine=psync, iodepth=1
...
fio-3.25
Starting 4 processes
randrw: (groupid=0, jobs=4): err= 0: pid=1737: Sat Dec 25 22:58:57 2021
read: IOPS=2216, BW=115MiB/s (121MB/s)(40.4GiB/360001msec)
clat (nsec): min=1283, max=57840k, avg=1349180.85, stdev=1969173.27
lat (nsec): min=1343, max=57840k, avg=1349616.59, stdev=1969523.11
clat percentiles (usec):
| 1.00th=[ 63], 5.00th=[ 190], 10.00th=[ 225], 20.00th=[ 289],
| 30.00th=[ 388], 40.00th=[ 537], 50.00th=[ 709], 60.00th=[ 930],
| 70.00th=[ 1254], 80.00th=[ 1827], 90.00th=[ 3163], 95.00th=[ 4752],
| 99.00th=[ 9503], 99.50th=[12256], 99.90th=[20055], 99.95th=[24249],
| 99.99th=[33817]
bw ( KiB/s): min=48881, max=434584, per=100.00%, avg=117885.82, stdev=9920.76, samples=2860
iops : min= 1084, max= 6574, avg=2216.60, stdev=131.79, samples=2860
write: IOPS=2221, BW=115MiB/s (121MB/s)(40.4GiB/360001msec); 0 zone resets
clat (nsec): min=1453, max=44148k, avg=382103.17, stdev=1064365.83
lat (nsec): min=1493, max=44148k, avg=391463.47, stdev=1077514.67
clat percentiles (usec):
| 1.00th=[ 6], 5.00th=[ 19], 10.00th=[ 29], 20.00th=[ 48],
| 30.00th=[ 67], 40.00th=[ 88], 50.00th=[ 114], 60.00th=[ 147],
| 70.00th=[ 192], 80.00th=[ 273], 90.00th=[ 668], 95.00th=[ 1876],
| 99.00th=[ 5276], 99.50th=[ 6783], 99.90th=[11994], 99.95th=[14615],
| 99.99th=[22152]
bw ( KiB/s): min=41281, max=453336, per=100.00%, avg=117862.22, stdev=10154.48, samples=2860
iops : min= 929, max= 6846, avg=2221.45, stdev=137.61, samples=2860
lat (usec) : 2=0.01%, 4=0.27%, 10=0.77%, 20=1.96%, 50=8.16%
lat (usec) : 100=12.01%, 250=22.88%, 500=16.98%, 750=8.27%, 1000=6.01%
lat (msec) : 2=11.29%, 4=7.12%, 10=3.74%, 20=0.47%, 50=0.06%
lat (msec) : 100=0.01%
cpu : usr=7.48%, sys=44.27%, ctx=837696, majf=0, minf=78
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=797856,799628,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=115MiB/s (121MB/s), 115MiB/s-115MiB/s (121MB/s-121MB/s), io=40.4GiB (43.4GB), run=360001-360001msec
WRITE: bw=115MiB/s (121MB/s), 115MiB/s-115MiB/s (121MB/s-121MB/s), io=40.4GiB (43.4GB), run=360001-360001msec
Disk stats (read/write):
sda: ios=796181/792213, merge=0/9732409, ticks=563273/2183634, in_queue=2746908, util=98.76%
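For a quick side-by-side of the two runs, here is a small sketch that pulls the read/write summary lines out of the fio output above and computes the QCOW2-to-ZVOL ratios. The regular expression only handles the plain integer `IOPS=` and `MiB/s` forms that appear in these two runs; other fio output may use different units, so treat it as an assumption rather than a general parser.

```python
import re

# Matches summary lines such as "read: IOPS=5101, BW=242MiB/s (254MB/s)(...)".
SUMMARY = re.compile(r"(read|write): IOPS=(\d+), BW=(\d+)MiB/s")

QCOW2 = """\
read: IOPS=5101, BW=242MiB/s (254MB/s)(85.2GiB/360006msec)
write: IOPS=5101, BW=242MiB/s (254MB/s)(85.1GiB/360006msec); 0 zone resets
"""

ZVOL = """\
read: IOPS=2216, BW=115MiB/s (121MB/s)(40.4GiB/360001msec)
write: IOPS=2221, BW=115MiB/s (121MB/s)(40.4GiB/360001msec); 0 zone resets
"""


def parse_summary(text: str) -> dict:
    """Return {'read': {'iops': ..., 'bw_mib_s': ...}, 'write': {...}}."""
    return {
        op: {"iops": int(iops), "bw_mib_s": int(bw)}
        for op, iops, bw in SUMMARY.findall(text)
    }


def ratios(a: dict, b: dict) -> dict:
    """Per-operation ratio a/b for IOPS and bandwidth."""
    return {
        op: {key: a[op][key] / b[op][key] for key in a[op]}
        for op in a
    }


if __name__ == "__main__":
    result = ratios(parse_summary(QCOW2), parse_summary(ZVOL))
    for op, values in result.items():
        print(f"{op}: IOPS x{values['iops']:.2f}, BW x{values['bw_mib_s']:.2f}")
```

On these numbers the QCOW2-backed VM comes out at roughly 2.3 times the IOPS and 2.1 times the bandwidth of the ZVOL-backed VM, for both reads and writes.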
Let me know if more info is required or if something is obviously wrong. The setup was defaults except for the q35 machine type and 4 CPUs for both VMs.