Disk performance with HW-RAID and LVM

YellowPhoenix18

Hello,
I currently have a Proxmox server with two SAS SSDs (Samsung PM1643, each rated at up to 400k IOPS read, see here).
Unfortunately, inside Proxmox I see rather poor read performance of only about 30k IOPS behind the HW-RAID (HPE P420), and only around 4k IOPS inside the VMs.

So there are probably two problems here, as I can't explain why "up to 400k IOPS" on two disks results in only 30k IOPS behind a RAID 1; reads should basically be able to use both disks in parallel. That writes are slowed down by a RAID 1 is obvious, as the slower of the two disks defines the speed/IOPS there.

But aside from the HW-RAID problem, I still only get about 4k IOPS when testing with fio inside the VM, while the host system averages around 27k IOPS. Since my "sdb" is already the LVM device, the loss must occur somewhere between the Proxmox host and the VM's disk attachment. I have checked the whole configuration, and as far as I can see the disk is connected to the VM correctly (SCSI bus plus the VirtIO SCSI controller, which has always been named the fastest option here in the forums).
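For completeness, this is roughly how I inspected the attachment from the host shell; VMID 100 and the volume name below are just placeholders, not my exact values:

Code:
# print the VM configuration, including controller type and disk options
qm config 100
# the relevant lines look roughly like this:
#   scsihw: virtio-scsi-pci
#   scsi0: local-lvm:vm-100-disk-0,size=100G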

Below are the fio reports from the host and the VM:

host:
Code:
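# sequential 4k reads, direct I/O, queue depth 1, time-based 60 s run against the raw device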
fio --ioengine=libaio --direct=1 --sync=1 --rw=read --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --name seq_read --filename=/dev/sdb
seq_read: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.12
Starting 1 process
Jobs: 1 (f=1): [R(1)][100.0%][r=114MiB/s][r=29.1k IOPS][eta 00m:00s]
seq_read: (groupid=0, jobs=1): err= 0: pid=10845: Sat Aug 27 09:12:51 2022
  read: IOPS=26.7k, BW=104MiB/s (109MB/s)(6257MiB/60001msec)
    slat (usec): min=4, max=864, avg=10.75, stdev= 7.14
    clat (nsec): min=1188, max=4054.7k, avg=24274.13, stdev=16316.63
     lat (usec): min=22, max=4067, avg=35.33, stdev=19.38
    clat percentiles (usec):
     |  1.00th=[   19],  5.00th=[   20], 10.00th=[   20], 20.00th=[   20],
     | 30.00th=[   20], 40.00th=[   21], 50.00th=[   22], 60.00th=[   24],
     | 70.00th=[   26], 80.00th=[   27], 90.00th=[   33], 95.00th=[   38],
     | 99.00th=[   48], 99.50th=[   57], 99.90th=[   90], 99.95th=[  157],
     | 99.99th=[  457]
   bw (  KiB/s): min=72464, max=134680, per=99.90%, avg=106667.93, stdev=19396.27, samples=119
   iops        : min=18116, max=33670, avg=26666.97, stdev=4849.06, samples=119
  lat (usec)   : 2=0.02%, 4=0.05%, 10=0.03%, 20=33.90%, 50=65.19%
  lat (usec)   : 100=0.72%, 250=0.04%, 500=0.03%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%
  cpu          : usr=17.03%, sys=40.15%, ctx=1598389, majf=0, minf=79
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=1601689,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=104MiB/s (109MB/s), 104MiB/s-104MiB/s (109MB/s-109MB/s), io=6257MiB (6561MB), run=60001-60001msec

Disk stats (read/write):
  sdb: ios=1598574/3380, merge=0/938, ticks=39986/389, in_queue=8, util=99.90%

vm:
Code:
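# same test as on the host, but against the VM's disk /dev/sda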
fio --ioengine=libaio --direct=1 --sync=1 --rw=read --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --name seq_read --filename=/dev/sda
seq_read: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.12
Starting 1 process
Jobs: 1 (f=1): [R(1)][100.0%][r=13.4MiB/s][r=3431 IOPS][eta 00m:00s]
seq_read: (groupid=0, jobs=1): err= 0: pid=1026: Sat Aug 27 09:16:18 2022
  read: IOPS=4365, BW=17.1MiB/s (17.9MB/s)(1023MiB/60001msec)
    slat (usec): min=15, max=2782, avg=39.11, stdev=21.07
    clat (usec): min=4, max=13912, avg=176.09, stdev=118.11
     lat (usec): min=68, max=13933, avg=218.22, stdev=127.67
    clat percentiles (usec):
     |  1.00th=[   78],  5.00th=[   84], 10.00th=[   92], 20.00th=[  106],
     | 30.00th=[  120], 40.00th=[  141], 50.00th=[  167], 60.00th=[  190],
     | 70.00th=[  210], 80.00th=[  233], 90.00th=[  265], 95.00th=[  289],
     | 99.00th=[  412], 99.50th=[  515], 99.90th=[ 1057], 99.95th=[ 1827],
     | 99.99th=[ 4490]
   bw (  KiB/s): min=12104, max=30016, per=100.00%, avg=17463.15, stdev=5151.57, samples=120
   iops        : min= 3026, max= 7504, avg=4365.78, stdev=1287.89, samples=120
  lat (usec)   : 10=0.01%, 20=0.01%, 50=0.01%, 100=16.06%, 250=70.26%
  lat (usec)   : 500=13.12%, 750=0.37%, 1000=0.07%
  lat (msec)   : 2=0.06%, 4=0.03%, 10=0.01%, 20=0.01%
  cpu          : usr=11.63%, sys=25.10%, ctx=262254, majf=0, minf=13
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=261963,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=17.1MiB/s (17.9MB/s), 17.1MiB/s-17.1MiB/s (17.9MB/s-17.9MB/s), io=1023MiB (1073MB), run=60001-60001msec

Disk stats (read/write):
  sda: ios=261282/439, merge=0/95, ticks=47048/255, in_queue=60020, util=99.82%



Does anyone have a good idea what could cause so much performance loss between the host and the VMs? Of course there are multiple VMs running, but there shouldn't be so much read/write activity in the other VMs that a single VM gets this slow.
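To judge whether the other VMs are actually loading the array during the benchmark, the per-device statistics can be watched on the host while the test runs (iostat comes from the sysstat package; sdb is my RAID device here):

Code:
# extended device statistics for sdb, refreshed every second
iostat -x sdb 1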

Kind regards,
Sebastian
 
Hey,

For the host situation I don't really have an answer for you. But for your VMs, check the following (a sketch of how to set these from the CLI follows below):
- cache set to writeback
- advanced disk options: SSD emulation, IO thread, Async IO = threads
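A rough example of how these options can be applied; VMID 100 and the volume name are placeholders, adapt them to your VM and storage:

Code:
# enable writeback cache, SSD emulation, IO thread and threaded async IO on the disk
qm set 100 --scsi0 local-lvm:vm-100-disk-0,cache=writeback,ssd=1,iothread=1,aio=threads
# the IO thread option works with the single-queue VirtIO SCSI controller
qm set 100 --scsihw virtio-scsi-single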


Cordially,