Hello,
I currently have a Proxmox server with two SAS SSDs (Samsung PM1643, each rated at up to 400k IOPS read - see here).
Unfortunately, inside Proxmox I see quite poor read performance: only about 30k IOPS behind the HW RAID (HPE P420) on the host, and only around 4k IOPS inside the VMs.
So there are probably two separate problems here, as I cannot explain why "up to 400k IOPS" per disk ends up as only 30k IOPS behind a RAID 1, since reads should be able to use both disks - in theory a mirror of two such disks could serve up to roughly 2 x 400k = 800k read IOPS. That writes are slower with RAID 1 is obvious, since the slower of the two disks defines the write speed/IOPS.
But aside from the HW RAID problem, I still only get about 4k IOPS when testing with fio inside the VM, while the host system reaches around 27-30k IOPS. Since my "sdb" is already the LVM device, the loss must happen somewhere between the Proxmox host and the way the disk is attached to the VM. I have checked the configuration, and as far as I can see the disk is attached correctly (SCSI bus + VirtIO SCSI controller, which is usually named as the fastest option here in the forums).
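For reference, this is roughly how the disk is attached in the VM config (the VM ID, storage name and size below are placeholders, not my exact values):
Code:
# excerpt from the VM config - illustrative values only
scsihw: virtio-scsi-pci
scsi0: local-lvm:vm-100-disk-0,size=100G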
Below are the fio reports from the host and the VM:
host:
Code:
fio --ioengine=libaio --direct=1 --sync=1 --rw=read --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --name seq_read --filename=/dev/sdb
seq_read: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.12
Starting 1 process
Jobs: 1 (f=1): [R(1)][100.0%][r=114MiB/s][r=29.1k IOPS][eta 00m:00s]
seq_read: (groupid=0, jobs=1): err= 0: pid=10845: Sat Aug 27 09:12:51 2022
read: IOPS=26.7k, BW=104MiB/s (109MB/s)(6257MiB/60001msec)
slat (usec): min=4, max=864, avg=10.75, stdev= 7.14
clat (nsec): min=1188, max=4054.7k, avg=24274.13, stdev=16316.63
lat (usec): min=22, max=4067, avg=35.33, stdev=19.38
clat percentiles (usec):
| 1.00th=[ 19], 5.00th=[ 20], 10.00th=[ 20], 20.00th=[ 20],
| 30.00th=[ 20], 40.00th=[ 21], 50.00th=[ 22], 60.00th=[ 24],
| 70.00th=[ 26], 80.00th=[ 27], 90.00th=[ 33], 95.00th=[ 38],
| 99.00th=[ 48], 99.50th=[ 57], 99.90th=[ 90], 99.95th=[ 157],
| 99.99th=[ 457]
bw ( KiB/s): min=72464, max=134680, per=99.90%, avg=106667.93, stdev=19396.27, samples=119
iops : min=18116, max=33670, avg=26666.97, stdev=4849.06, samples=119
lat (usec) : 2=0.02%, 4=0.05%, 10=0.03%, 20=33.90%, 50=65.19%
lat (usec) : 100=0.72%, 250=0.04%, 500=0.03%, 750=0.01%, 1000=0.01%
lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%
cpu : usr=17.03%, sys=40.15%, ctx=1598389, majf=0, minf=79
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=1601689,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=104MiB/s (109MB/s), 104MiB/s-104MiB/s (109MB/s-109MB/s), io=6257MiB (6561MB), run=60001-60001msec
Disk stats (read/write):
sdb: ios=1598574/3380, merge=0/938, ticks=39986/389, in_queue=8, util=99.90%
vm:
Code:
fio --ioengine=libaio --direct=1 --sync=1 --rw=read --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --name seq_read --filename=/dev/sda
seq_read: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.12
Starting 1 process
Jobs: 1 (f=1): [R(1)][100.0%][r=13.4MiB/s][r=3431 IOPS][eta 00m:00s]
seq_read: (groupid=0, jobs=1): err= 0: pid=1026: Sat Aug 27 09:16:18 2022
read: IOPS=4365, BW=17.1MiB/s (17.9MB/s)(1023MiB/60001msec)
slat (usec): min=15, max=2782, avg=39.11, stdev=21.07
clat (usec): min=4, max=13912, avg=176.09, stdev=118.11
lat (usec): min=68, max=13933, avg=218.22, stdev=127.67
clat percentiles (usec):
| 1.00th=[ 78], 5.00th=[ 84], 10.00th=[ 92], 20.00th=[ 106],
| 30.00th=[ 120], 40.00th=[ 141], 50.00th=[ 167], 60.00th=[ 190],
| 70.00th=[ 210], 80.00th=[ 233], 90.00th=[ 265], 95.00th=[ 289],
| 99.00th=[ 412], 99.50th=[ 515], 99.90th=[ 1057], 99.95th=[ 1827],
| 99.99th=[ 4490]
bw ( KiB/s): min=12104, max=30016, per=100.00%, avg=17463.15, stdev=5151.57, samples=120
iops : min= 3026, max= 7504, avg=4365.78, stdev=1287.89, samples=120
lat (usec) : 10=0.01%, 20=0.01%, 50=0.01%, 100=16.06%, 250=70.26%
lat (usec) : 500=13.12%, 750=0.37%, 1000=0.07%
lat (msec) : 2=0.06%, 4=0.03%, 10=0.01%, 20=0.01%
cpu : usr=11.63%, sys=25.10%, ctx=262254, majf=0, minf=13
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=261963,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=17.1MiB/s (17.9MB/s), 17.1MiB/s-17.1MiB/s (17.9MB/s-17.9MB/s), io=1023MiB (1073MB), run=60001-60001msec
Disk stats (read/write):
sda: ios=261282/439, merge=0/95, ticks=47048/255, in_queue=60020, util=99.82%
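What strikes me when comparing the two reports: with iodepth=1, IOPS is basically just the inverse of the average latency per request, and the numbers above are consistent with that (my own back-of-the-envelope math, rounded):
Code:
host: avg lat ~35 us  -> 1 / 35 us  ≈ 28k IOPS
vm:   avg lat ~218 us -> 1 / 218 us ≈ 4.6k IOPS
So the whole gap seems to come down to roughly 180 us of extra latency per request added somewhere between the guest and the host device.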
Does anyone have a good idea what could cause so much performance loss between the host and the VMs? Of course there are multiple VMs running, but the other VMs should not be producing so many reads/writes that this single VM gets this slow.
Kind regards,
Sebastian