Hello,
A few days ago I already described my problem in the German forum, but I couldn't get any further; maybe someone can help me here.
The overall write performance is horrible: vzdump of a 22 GB LXC container runs at ~82 MB/s, and I get the same result when I upload files to a file-server container.
Setup:
Xeon E5-2620 v3
32 GB RAM
2x 960 GB Samsung SM863
Mirror
compression=on
ashift=12
Proxmox installed with root on ZFS RAID1, standard procedure.
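For reference, this is roughly how the pool and dataset settings can be double-checked (just a sketch; rpool/data is my dataset layout):
zpool get ashift rpool
zfs get compression,sync,atime,xattr rpool/data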
fio --filename=/rpool/data/test/testus --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=120 --size=1G --time_based --group_reporting --name=journal-test
journal-test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.12
Starting 1 process
journal-test: Laying out IO file (1 file / 1024MiB)
Jobs: 1 (f=1): [W(1)][100.0%][w=9.84MiB/s][w=2518 IOPS][eta 00m:00s]
journal-test: (groupid=0, jobs=1): err= 0: pid=25958: Thu Sep 5 13:55:14 2019
write: IOPS=2563, BW=10.0MiB/s (10.5MB/s)(1202MiB/120001msec); 0 zone resets
clat (usec): min=235, max=20752, avg=388.32, stdev=168.71
lat (usec): min=235, max=20752, avg=388.58, stdev=168.74
clat percentiles (usec):
| 1.00th=[ 314], 5.00th=[ 330], 10.00th=[ 334], 20.00th=[ 343],
| 30.00th=[ 355], 40.00th=[ 363], 50.00th=[ 375], 60.00th=[ 383],
| 70.00th=[ 396], 80.00th=[ 416], 90.00th=[ 474], 95.00th=[ 502],
| 99.00th=[ 545], 99.50th=[ 553], 99.90th=[ 586], 99.95th=[ 709],
| 99.99th=[ 2409]
bw ( KiB/s): min= 8528, max=11992, per=99.98%, avg=10252.11, stdev=644.11, samples=239
iops : min= 2132, max= 2998, avg=2563.01, stdev=161.03, samples=239
lat (usec) : 250=0.01%, 500=94.61%, 750=5.33%, 1000=0.02%
lat (msec) : 2=0.02%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
cpu : usr=1.02%, sys=9.37%, ctx=615314, majf=0, minf=10
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,307641,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=10.0MiB/s (10.5MB/s), 10.0MiB/s-10.0MiB/s (10.5MB/s-10.5MB/s), io=1202MiB (1260MB), run=120001-120001msec
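To see how much of this is sync/ZIL overhead, the same job could be repeated on a test dataset with sync temporarily disabled (diagnostic only, not for production; the dataset name is just an example):
zfs create rpool/data/synctest
zfs set sync=disabled rpool/data/synctest
fio --filename=/rpool/data/synctest/testus --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --size=1G --time_based --group_reporting --name=nosync-test
zfs set sync=standard rpool/data/synctest
If the numbers jump with sync=disabled, the bottleneck is the synchronous write path of the SSDs rather than the pool layout.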
fio --filename=/rpool/data/test/testus --sync=1 --rw=read --bs=4k --numjobs=1 --iodepth=1 --runtime=120 --size=1G --time_based --group_reporting --name=journal-test
journal-test: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.12
Starting 1 process
Jobs: 1 (f=1): [R(1)][100.0%][r=1257MiB/s][r=322k IOPS][eta 00m:00s]
journal-test: (groupid=0, jobs=1): err= 0: pid=9803: Thu Sep 5 13:58:26 2019
read: IOPS=317k, BW=1238MiB/s (1298MB/s)(145GiB/120001msec)
clat (nsec): min=1990, max=688829, avg=2932.38, stdev=4310.83
lat (usec): min=2, max=688, avg= 2.96, stdev= 4.31
clat percentiles (nsec):
| 1.00th=[ 2064], 5.00th=[ 2064], 10.00th=[ 2064], 20.00th=[ 2064],
| 30.00th=[ 2096], 40.00th=[ 2096], 50.00th=[ 2096], 60.00th=[ 2096],
| 70.00th=[ 2128], 80.00th=[ 2288], 90.00th=[ 2416], 95.00th=[ 2576],
| 99.00th=[26496], 99.50th=[27520], 99.90th=[30080], 99.95th=[32384],
| 99.99th=[46336]
bw ( MiB/s): min= 1081, max= 1303, per=99.98%, avg=1237.86, stdev=23.00, samples=239
iops : min=276812, max=333644, avg=316892.29, stdev=5886.78, samples=239
lat (usec) : 2=0.01%, 4=96.67%, 10=0.14%, 20=0.09%, 50=3.09%
lat (usec) : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%
cpu : usr=20.01%, sys=79.97%, ctx=848, majf=0, minf=10
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=38034713,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=1238MiB/s (1298MB/s), 1238MiB/s-1238MiB/s (1298MB/s-1298MB/s), io=145GiB (156GB), run=120001-120001msec
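Note that the 1 GiB test file fits completely into ARC with 32 GB RAM, so this ~1.2 GB/s read result measures cache rather than the SSDs. A sketch of a read test whose working set exceeds RAM (file name is just an example):
fio --filename=/rpool/data/test/bigfile --rw=read --bs=4k --numjobs=1 --iodepth=1 --size=64G --runtime=120 --time_based --group_reporting --name=uncached-read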
fio --name=/rpool/data/test/randfile --ioengine=libaio --iodepth=32 --rw=randwrite --bs=4k --direct=1 --size=1G --numjobs=1 --group_reporting
/rpool/data/test/randfile: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
fio-3.12
Starting 1 process
/rpool/data/test/randfile: Laying out IO file (1 file / 1024MiB)
Jobs: 1 (f=1): [w(1)][100.0%][w=143MiB/s][w=36.7k IOPS][eta 00m:00s]
/rpool/data/test/randfile: (groupid=0, jobs=1): err= 0: pid=28555: Thu Sep 5 14:03:01 2019
write: IOPS=49.3k, BW=193MiB/s (202MB/s)(1024MiB/5312msec); 0 zone resets
slat (usec): min=5, max=104127, avg=18.59, stdev=451.38
clat (nsec): min=1854, max=105420k, avg=628986.43, stdev=2591309.26
lat (usec): min=9, max=105432, avg=647.71, stdev=2634.70
clat percentiles (usec):
| 1.00th=[ 273], 5.00th=[ 285], 10.00th=[ 289], 20.00th=[ 302],
| 30.00th=[ 314], 40.00th=[ 334], 50.00th=[ 359], 60.00th=[ 408],
| 70.00th=[ 498], 80.00th=[ 685], 90.00th=[ 1012], 95.00th=[ 1369],
| 99.00th=[ 3326], 99.50th=[ 5211], 99.90th=[ 13042], 99.95th=[101188],
| 99.99th=[105382]
bw ( KiB/s): min=35216, max=317072, per=93.23%, avg=184042.40, stdev=83970.31, samples=10
iops : min= 8804, max=79268, avg=46010.40, stdev=20992.39, samples=10
lat (usec) : 2=0.01%, 20=0.01%, 50=0.01%, 100=0.01%, 250=0.05%
lat (usec) : 500=70.08%, 750=12.31%, 1000=7.28%
lat (msec) : 2=8.17%, 4=1.33%, 10=0.62%, 20=0.09%, 250=0.06%
cpu : usr=8.92%, sys=67.78%, ctx=10759, majf=0, minf=10
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
issued rwts: total=0,262144,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
WRITE: bw=193MiB/s (202MB/s), 193MiB/s-193MiB/s (202MB/s-202MB/s), io=1024MiB (1074MB), run=5312-5312msec
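This async run (libaio, iodepth=32, no fsync) goes through the ARC/transaction-group pipeline, which is why it is so much faster than the sync=1 run above; as far as I know, --direct=1 is effectively ignored on ZFS 0.8, so the writes were still buffered. For a direct comparison, the same job could be run with an fsync after every write (a sketch):
fio --name=/rpool/data/test/randfile --ioengine=libaio --iodepth=32 --rw=randwrite --bs=4k --fsync=1 --size=1G --numjobs=1 --group_reporting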
zpool status -v
pool: rpool
state: ONLINE
scan: scrub repaired 0B in 0 days 00:00:03 with 0 errors on Tue Aug 20 13:47:23 2019
config:
NAME                       STATE     READ WRITE CKSUM
rpool                      ONLINE       0     0     0
  mirror-0                 ONLINE       0     0     0
    ata-SAMSUNG_sda-part3  ONLINE       0     0     0
    ata-SAMSUNG_sdb-part3  ONLINE       0     0     0
errors: No known data errors
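To check whether ashift=12 actually matches the drives, the reported sector sizes and the write cache state can be inspected (device names assumed from my setup):
smartctl -i /dev/sda | grep -i sector
hdparm -W /dev/sda /dev/sdb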
pveperf
CPU BOGOMIPS: 57595.56
REGEX/SECOND: 3123298
HD SIZE: 760.00 GB (rpool/ROOT/pve-1)
FSYNCS/SECOND: 2071.76
DNS EXT: 50.04 ms
DNS INT: 35.46 ms
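The ~2072 FSYNCS/SECOND is consistent with the ~2563 sync write IOPS from fio above. This run measured the root dataset; pveperf can also be pointed at the container storage directly (path from my layout):
pveperf /rpool/data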
pveversion -v
proxmox-ve: 6.0-2 (running kernel: 5.0.21-1-pve)
pve-manager: 6.0-6 (running version: 6.0-6/c71f879f)
pve-kernel-5.0: 6.0-7
pve-kernel-helper: 6.0-7
pve-kernel-5.0.21-1-pve: 5.0.21-2
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph-fuse: 12.2.11+dfsg1-2.1
corosync: 3.0.2-pve2
criu: 3.11-3
glusterfs-client: 5.5-3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.11-pve1
libpve-access-control: 6.0-2
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-4
libpve-guest-common-perl: 3.0-1
libpve-http-server-perl: 3.0-2
libpve-storage-perl: 6.0-7
libqb0: 1.0.5-1
lvm2: 2.03.02-pve3
lxc-pve: 3.1.0-64
lxcfs: 3.0.3-pve60
novnc-pve: 1.0.0-60
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.0-7
pve-cluster: 6.0-7
pve-container: 3.0-5
pve-docs: 6.0-4
pve-edk2-firmware: 2.20190614-1
pve-firewall: 4.0-7
pve-firmware: 3.0-2
pve-ha-manager: 3.0-2
pve-i18n: 2.0-2
pve-qemu-kvm: 4.0.0-5
pve-xtermjs: 3.13.2-1
qemu-server: 6.0-7
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.1-pve2
For comparison, I tested an identical machine with root on ext4 and LVM on mdadm RAID1: vzdump of the same container runs at ~220 MB/s. Another machine with root on ZFS, but with consumer hardware (i7-3770, 8 GB RAM, 2x 256 GB Samsung 850 Pro), shows the same fio read results, and vzdump of the 22 GB container runs at ~110 MB/s. Another 2-node cluster with storage replication and root on ext4, with containers on a separate ZFS mirror, reaches ~170 MB/s.
I can't understand that.
My theory: wrong ashift, or the hardware is too bad. I have no ideas anymore.
What's wrong?