High load with very simple disk operations on single VM

perryflynn

Active Member
Aug 26, 2016
Hello,

for a long time now (personal home lab, machine from 2016; I don't know how long the problem has existed, I just ignored it until now) I have had the problem that a single VM on my system can drive the load above 40 and crash other VMs, just with disk operations.

I used fio to test this:

Bash:
#!/bin/bash
# commands stolen from https://www.thomas-krenn.com/de/wiki/Fio_Grundlagen#Beispiele
cd "$(dirname "$0")"

# test file; it must already exist, otherwise pass an explicit --size to fio
FILE=testfio

# 4k random I/O, 4 jobs, queue depth 32, 60 seconds per test
echo "#-> IOPS write"
fio --rw=randwrite --name=IOPS-write --bs=4k --direct=1 --filename=$FILE --numjobs=4 --ioengine=libaio --iodepth=32 --refill_buffers --group_reporting --runtime=60 --time_based

echo "#-> IOPS read"
fio --rw=randread --name=IOPS-read --bs=4k --direct=1 --filename=$FILE --numjobs=4 --ioengine=libaio --iodepth=32 --refill_buffers --group_reporting --runtime=60 --time_based

# 1M sequential I/O for bandwidth
echo "#-> BW write"
fio --rw=write --name=BW-write --bs=1024k --direct=1 --filename=$FILE --numjobs=4 --ioengine=libaio --iodepth=32 --refill_buffers --group_reporting --runtime=60 --time_based

echo "#-> BW read"
fio --rw=read --name=BW-read --bs=1024k --direct=1 --filename=$FILE --numjobs=4 --ioengine=libaio --iodepth=32 --refill_buffers --group_reporting --runtime=60 --time_based

These are the results inside a KVM guest:

Code:
#-> IOPS write
  write: IOPS=8441, BW=32.0MiB/s (34.6MB/s)(1982MiB/60117msec); 0 zone resets
   bw (  KiB/s): min= 4744, max=66656, per=100.00%, avg=33880.76, stdev=4181.09, samples=477
   iops        : min= 1186, max=16664, avg=8470.16, stdev=1045.28, samples=477

#-> IOPS read
  read: IOPS=47.0k, BW=184MiB/s (193MB/s)(10.8GiB/60006msec)
   bw (  KiB/s): min=14864, max=493763, per=99.33%, avg=186782.31, stdev=37957.10, samples=476
   iops        : min= 3716, max=123440, avg=46695.52, stdev=9489.25, samples=476

#-> BW write
  write: IOPS=258, BW=258MiB/s (271MB/s)(15.4GiB/60966msec); 0 zone resets
   bw (  KiB/s): min= 8177, max=4812813, per=100.00%, avg=461513.17, stdev=258221.06, samples=278
   iops        : min=    5, max= 4699, avg=450.52, stdev=252.13, samples=278

#-> BW read
  read: IOPS=1748, BW=1749MiB/s (1834MB/s)(103GiB/60021msec)
   bw (  MiB/s): min=   32, max= 6756, per=98.80%, avg=1727.89, stdev=563.99, samples=476
   iops        : min=   32, max= 6755, avg=1727.36, stdev=563.81, samples=476

During this test, the load average explodes to 42.

And these are the results on a ZFS filesystem directly on the Proxmox host:

Code:
#-> IOPS write
  write: IOPS=276, BW=1108KiB/s (1135kB/s)(64.0MiB/60034msec); 0 zone resets
   bw (  KiB/s): min=   48, max= 2208, per=100.00%, avg=1109.63, stdev=135.10, samples=473
   iops        : min=   12, max=  552, avg=277.41, stdev=33.78, samples=473

#-> IOPS read
  read: IOPS=904, BW=3620KiB/s (3707kB/s)(212MiB/60008msec)
   bw (  KiB/s): min=  536, max=132408, per=99.78%, avg=3612.91, stdev=2972.37, samples=476
   iops        : min=  134, max=33102, avg=903.23, stdev=743.09, samples=476

#-> BW write
  write: IOPS=677, BW=677MiB/s (710MB/s)(39.7GiB/60007msec); 0 zone resets
   bw (  KiB/s): min=368640, max=6230016, per=100.00%, avg=693682.65, stdev=225160.16, samples=476
   iops        : min=  360, max= 6084, avg=677.40, stdev=219.88, samples=476

#-> BW read
  read: IOPS=8674, BW=8674MiB/s (9095MB/s)(508GiB/60001msec)
   bw (  MiB/s): min=  416, max=14756, per=100.00%, avg=8683.59, stdev=611.46, samples=476
   iops        : min=  416, max=14756, avg=8683.61, stdev=611.48, samples=476

During this test, the load average peaks at just 8.5.
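To see where the load comes from while fio is running, I watch the pool statistics and the kernel's IO pressure information in parallel, roughly like this (assuming PSI is enabled in the kernel, which I think is the case on the PVE kernels):

Bash:
# refresh every 2 seconds: per-vdev pool stats, IO pressure stall info, load average
watch -n 2 'zpool iostat -v; echo; cat /proc/pressure/io; echo; uptime'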

Machine Info:
  • Proxmox 7.3-3
  • Intel Xeon E3-1260 (8 CPU)
  • 64GB memory
  • 2x 6TB mirror + 2x 8TB mirror in one ZFS pool, 8k block size, thin provisioning active
  • All WD Red Pro disks, ashift=12 (checked with the commands below)
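For reference, I checked the pool and zvol properties roughly like this (the pool name "tank" and the volume name are just examples from my setup):

Bash:
# pool layout and ashift
zpool status tank
zpool get ashift tank

# block size and thin provisioning (no refreservation) of one VM disk zvol
zfs get volblocksize,refreservation,compression tank/vm-100-disk-0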

Configuration of the VM that ran fio (I added the disk limits later to work around the issue):

[Screenshot: VM disk configuration showing the bandwidth/IOPS limits]

Cache is set to "Default (No Cache)".
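I believe the same limits can also be set on the CLI instead of the GUI; something like this should be equivalent (VM ID, storage, volume name and numbers are only examples):

Bash:
# cap one VM disk at 100 MB/s and 500 IOPS in each direction
# (VM 101 and the volume name are examples)
qm set 101 --scsi0 local-zfs:vm-101-disk-0,mbps_rd=100,mbps_wr=100,iops_rd=500,iops_wr=500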

Other tasks like vzdump exhaust the IO as well. I reduced the worker count to 1 to "solve" this.
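As far as I know, vzdump can also be throttled globally via /etc/vzdump.conf, roughly like this (the values are just examples, not my exact config):

Code:
# /etc/vzdump.conf (excerpt)
# bwlimit is in KiB/s, ionice lowers the backup's IO priority
bwlimit: 102400
ionice: 7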

The IO scheduler is "mq-deadline" on all disks; the only available alternative is "none".
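I checked it, and would test "none" on a single disk, roughly like this (sda is a placeholder; the echo is not persistent across reboots):

Bash:
# show the active scheduler per disk (the one in [] is active)
grep . /sys/block/sd*/queue/scheduler

# temporarily switch one disk to "none" for testing
echo none > /sys/block/sda/queue/scheduler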

Kernel:
Linux eisenbart 6.2.9-1-pve #1 SMP PREEMPT_DYNAMIC PVE 6.2.9-1 (2023-03-31T10:48Z) x86_64 GNU/Linux

SMART values of all disks are OK.

Most of the VMs are configured with VirtIO SCSI as the controller; two use LSI 53C895A because I forgot to change that at VM creation. These two crash under high load. Does it make sense to switch them to VirtIO SCSI single + iothread?
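If switching makes sense, I assume it would be roughly this on the CLI, with the VM powered off (VM ID and volume name are examples):

Bash:
# change the controller type and enable a dedicated iothread for the disk
# (VM 102 and the volume name are examples; run while the VM is shut down)
qm set 102 --scsihw virtio-scsi-single
qm set 102 --scsi0 local-zfs:vm-102-disk-0,iothread=1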

I would expect the kernel/scheduler to make sure that a single process cannot break the whole system, but that is not the case. For now I have added limits to all VM disks, as in the screenshot.

Is there a better solution?

Or should I think about a hardware upgrade?

Regards
Christian