Best way to identify processes causing IO delay and power consumption?

rakali

Active Member
Jan 2, 2020
Hello,

I realise this is not strictly a Proxmox-specific question, but it concerns my Proxmox machine, and I am hoping to benefit from the experienced users here.

I recently measured that my box draws 300 W at the wall, which is higher than expected, and the Proxmox GUI shows IO delay hovering around 30%.

I would like to identify the primary drivers of these values.

My machine is a Z390 motherboard with an 8700K processor and 128 GB of RAM, plus 2 x HBA and an X520 NIC. There are 30 x SATA HDD and a couple of NVMe drives; 8 of the HDDs run VMs/CTs and the rest are NAS storage.
That hardware should draw about 500 W at maximum load, but my system is mostly idle, as evidenced by the Proxmox GUI reporting CPU usage around 20%. Is my understanding of expected idle power consumption way off?
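For what it's worth, here is my back-of-envelope idle estimate. All per-component wattages below are assumptions pulled from typical spec sheets, not measurements of my hardware:

```shell
# Back-of-envelope idle power estimate -- every figure is an assumption:
#   30 spinning SATA HDD idling at ~5 W each, CPU package ~15 W,
#   board + 128 GB RAM ~30 W, 2 HBAs at ~10 W each, X520 ~10 W.
dc_watts=$(( 30*5 + 15 + 30 + 2*10 + 10 ))   # 225 W at the DC rails
# Assuming ~85% PSU efficiency at this load:
wall_watts=$(( dc_watts * 100 / 85 ))
echo "estimated idle draw at the wall: ${wall_watts} W"   # ~264 W
```

If those assumptions are roughly right, 300 W at the wall would not be crazy for a box whose 30 disks never spin down, which might also tie in with the IO delay.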

As an example of the kind of per-process view I had in mind, macOS Activity Monitor has an Energy tab:

[Screenshot: macOS Activity Monitor, Energy tab]

For the IO delay, I have tried reading various forum threads and googling, but without success. Can anyone advise how to trace what is causing it?
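In case it helps frame an answer: the closest Linux equivalents to a per-process IO view that I could find are iotop, pidstat, and the kernel's own counters. These are standard tools, but I may well be misusing them:

```shell
# Per-process IO accounting -- a sketch of the standard tools:
iotop -oa            # accumulated IO per process, active ones only (needs root)
pidstat -d 5         # per-process read/write kB/s every 5 s (sysstat package)
# The raw kernel counters behind both tools, per PID under /proc/<pid>/io:
cat /proc/self/io    # read_bytes, write_bytes, etc. for this shell
# System-wide IO pressure (how long tasks stalled waiting on IO):
cat /proc/pressure/io
```

I am not sure how any of these numbers relate to the IO-delay percentage the Proxmox GUI shows, which is part of what I am asking.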

I have my VMs and CTs on a pool of 8 x HDD mirrored in 4 pairs and performance seems otherwise normal to me.
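To check whether one mirror pair is doing all the work, I assume something like this would rank the disks by busy time. My reading of the kernel docs is that field 10 of /sys/block/&lt;dev&gt;/stat is io_ticks, milliseconds with IO in flight since boot, so please correct me if that is wrong:

```shell
# Rank disks by io_ticks (ms spent with IO in flight since boot);
# field 10 of /sys/block/<dev>/stat per the kernel's block-stat docs.
for d in /sys/block/sd* /sys/block/nvme*n*; do
  [ -e "$d/stat" ] || continue
  printf '%s %s\n' "$(awk '{print $10}' "$d/stat")" "${d##*/}"
done | sort -rn | head
# On the ZFS side, "zpool iostat -v tank 5" breaks the same load down per vdev.
```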

Thank you in advance for any advice or guidance you can offer!

Code:
root@sapin:~# zpool iostat tank
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank        50.9T  1.71T  1.45K    353   504M  22.9M

Code:
root@sapin:~# fio --name=benchmark --size=1G --filename=/tank/zfs/fiotest --bs=4k --ioengine=libaio --iodepth=4 --rw=randrw --rwmixread=50 --direct=1 --numjobs=1 --runtime=30 --group_reporting
benchmark: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=4
fio-3.33
Starting 1 process
benchmark: Laying out IO file (1 file / 1024MiB)
Jobs: 1 (f=1)
benchmark: (groupid=0, jobs=1): err= 0: pid=2267195: Mon Jan 15 12:49:32 2024
  read: IOPS=102k, BW=399MiB/s (418MB/s)(512MiB/1284msec)
    slat (nsec): min=1443, max=1179.7k, avg=3422.49, stdev=7127.29
    clat (nsec): min=665, max=1205.2k, avg=14924.92, stdev=15356.25
     lat (usec): min=3, max=1208, avg=18.35, stdev=19.00
    clat percentiles (usec):
     |  1.00th=[    9],  5.00th=[    9], 10.00th=[   10], 20.00th=[   11],
     | 30.00th=[   11], 40.00th=[   12], 50.00th=[   13], 60.00th=[   13],
     | 70.00th=[   14], 80.00th=[   15], 90.00th=[   17], 95.00th=[   33],
     | 99.00th=[   85], 99.50th=[  105], 99.90th=[  153], 99.95th=[  176],
     | 99.99th=[  314]
   bw (  KiB/s): min=247168, max=503512, per=91.94%, avg=375340.00, stdev=181262.58, samples=2
   iops        : min=61792, max=125878, avg=93835.00, stdev=45315.65, samples=2
  write: IOPS=102k, BW=399MiB/s (418MB/s)(512MiB/1284msec); 0 zone resets
    slat (usec): min=2, max=679, avg= 5.11, stdev= 6.46
    clat (usec): min=4, max=1205, avg=15.08, stdev=14.95
     lat (usec): min=7, max=1209, avg=20.19, stdev=18.62
    clat percentiles (usec):
     |  1.00th=[    9],  5.00th=[   10], 10.00th=[   10], 20.00th=[   11],
     | 30.00th=[   12], 40.00th=[   12], 50.00th=[   13], 60.00th=[   13],
     | 70.00th=[   14], 80.00th=[   15], 90.00th=[   17], 95.00th=[   33],
     | 99.00th=[   85], 99.50th=[  104], 99.90th=[  149], 99.95th=[  169],
     | 99.99th=[  285]
   bw (  KiB/s): min=247304, max=500936, per=91.60%, avg=374120.00, stdev=179344.91, samples=2
   iops        : min=61826, max=125234, avg=93530.00, stdev=44836.23, samples=2
  lat (nsec)   : 750=0.01%
  lat (usec)   : 10=11.97%, 20=81.55%, 50=3.48%, 100=2.44%, 250=0.55%
  lat (usec)   : 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%
  cpu          : usr=14.81%, sys=83.16%, ctx=295, majf=4, minf=15
  IO depths    : 1=0.1%, 2=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=131040,131104,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=4


Run status group 0 (all jobs):
   READ: bw=399MiB/s (418MB/s), 399MiB/s-399MiB/s (418MB/s-418MB/s), io=512MiB (537MB), run=1284-1284msec
  WRITE: bw=399MiB/s (418MB/s), 399MiB/s-399MiB/s (418MB/s-418MB/s), io=512MiB (537MB), run=1284-1284msec


Code:
root@sapin:~# pveversion
pve-manager/8.1.3/b46aac3b42da5d15 (running kernel: 6.5.11-7-pve)
 
