Different Disk Performance on Identical Hardware

Integratinator

New Member
Oct 17, 2024
I am seeing a huge difference in write performance between my Proxmox hosts in a single cluster.
They are both running the same version of Proxmox (8.2.5) on identical hardware (HP DL380 G8 with 2x Xeon E5-2670, 128GB RAM, and 2x 1TB HDD in RAID1).
I have run fio on both hosts with the same parameters and the results are very different (sync latencies in microseconds on the first host versus milliseconds on the second).
Host 1:
Bash:
/var/lib/vz/dump# fio --rw=write --ioengine=sync --fdatasync=1 --directory=write-test --size=100m --bs=2300 --name=kube-storage-test
Bash:
kube-storage-test: (g=0): rw=write, bs=(R) 2300B-2300B, (W) 2300B-2300B, (T) 2300B-2300B, ioengine=sync, iodepth=1
fio-3.33
Starting 1 process
kube-storage-test: Laying out IO file (1 file / 100MiB)
Jobs: 1 (f=1): [W(1)][100.0%][w=15.9MiB/s][w=7231 IOPS][eta 00m:00s]
kube-storage-test: (groupid=0, jobs=1): err= 0: pid=3066826: Mon Oct 21 12:49:25 2024
  write: IOPS=7570, BW=16.6MiB/s (17.4MB/s)(100.0MiB/6022msec); 0 zone resets
    clat (usec): min=3, max=599, avg= 9.16, stdev= 5.99
     lat (usec): min=4, max=599, avg= 9.52, stdev= 6.13
    clat percentiles (usec):
     |  1.00th=[    4],  5.00th=[    5], 10.00th=[    5], 20.00th=[    5],
     | 30.00th=[    7], 40.00th=[    9], 50.00th=[    9], 60.00th=[   10],
     | 70.00th=[   10], 80.00th=[   13], 90.00th=[   15], 95.00th=[   16],
     | 99.00th=[   22], 99.50th=[   26], 99.90th=[   50], 99.95th=[   61],
     | 99.99th=[  269]
   bw (  KiB/s): min=15722, max=17946, per=100.00%, avg=17018.50, stdev=747.14, samples=12
   iops        : min= 7000, max= 7990, avg=7577.17, stdev=332.64, samples=12
  lat (usec)   : 4=2.74%, 10=68.21%, 20=27.20%, 50=1.76%, 100=0.07%
  lat (usec)   : 250=0.01%, 500=0.01%, 750=0.01%
  fsync/fdatasync/sync_file_range:
    sync (usec): min=39, max=1051, avg=119.50, stdev=65.44
    sync percentiles (usec):
     |  1.00th=[   42],  5.00th=[   42], 10.00th=[   43], 20.00th=[   45],
     | 30.00th=[   53], 40.00th=[   60], 50.00th=[  153], 60.00th=[  159],
     | 70.00th=[  169], 80.00th=[  180], 90.00th=[  192], 95.00th=[  204],
     | 99.00th=[  231], 99.50th=[  241], 99.90th=[  330], 99.95th=[  449],
     | 99.99th=[  570]
  cpu          : usr=5.41%, sys=35.18%, ctx=96172, majf=0, minf=39
  IO depths    : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,45590,0,0 short=45590,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
  WRITE: bw=16.6MiB/s (17.4MB/s), 16.6MiB/s-16.6MiB/s (17.4MB/s-17.4MB/s), io=100.0MiB (105MB), run=6022-6022msec
Disk stats (read/write):
    dm-1: ios=0/144862, merge=0/0, ticks=0/4904, in_queue=4904, util=57.56%, aggrios=4/96983, aggrmerge=0/51267, aggrticks=0/3550, aggrin_queue=3551, aggrutil=58.74%
  sda: ios=4/96983, merge=0/51267, ticks=0/3550, in_queue=3551, util=58.74%
Host 2:
Bash:
/var/lib/vz/dump# fio --rw=write --ioengine=sync --fdatasync=1 --directory=write-test --size=100m --bs=2300 --name=kube-storage-test
Bash:
Starting 1 process
kube-storage-test: Laying out IO file (1 file / 100MiB)
Jobs: 1 (f=1): [W(1)][99.9%][w=69KiB/s][w=31 IOPS][eta 00m:01s] 
kube-storage-test: (groupid=0, jobs=1): err= 0: pid=2769123: Mon Oct 21 13:06:33 2024
  write: IOPS=45, BW=102KiB/s (105kB/s)(100.0MiB/1002846msec); 0 zone resets
    clat (usec): min=8, max=55195, avg=31.18, stdev=314.75
     lat (usec): min=9, max=55195, avg=32.07, stdev=314.76
    clat percentiles (usec):
     |  1.00th=[   12],  5.00th=[   14], 10.00th=[   16], 20.00th=[   19],
     | 30.00th=[   21], 40.00th=[   23], 50.00th=[   25], 60.00th=[   28],
     | 70.00th=[   32], 80.00th=[   37], 90.00th=[   44], 95.00th=[   51],
     | 99.00th=[   65], 99.50th=[   72], 99.90th=[   90], 99.95th=[  277],
     | 99.99th=[ 9896]
   bw (  KiB/s): min=   26, max=  148, per=98.91%, avg=101.72, stdev=17.74, samples=2005
   iops        : min=   12, max=   66, avg=45.47, stdev= 7.88, samples=2005
  lat (usec)   : 10=0.04%, 20=28.79%, 50=65.89%, 100=5.21%, 250=0.02%
  lat (usec)   : 500=0.03%
  lat (msec)   : 10=0.01%, 20=0.01%, 50=0.01%, 100=0.01%
  fsync/fdatasync/sync_file_range:
    sync (msec): min=4, max=299, avg=21.96, stdev=16.54
    sync percentiles (msec):
     |  1.00th=[    7],  5.00th=[    8], 10.00th=[    9], 20.00th=[   10],
     | 30.00th=[   11], 40.00th=[   18], 50.00th=[   21], 60.00th=[   22],
     | 70.00th=[   24], 80.00th=[   29], 90.00th=[   43], 95.00th=[   55],
     | 99.00th=[   79], 99.50th=[   94], 99.90th=[  155], 99.95th=[  182],
     | 99.99th=[  249]
  cpu          : usr=0.08%, sys=0.45%, ctx=96953, majf=0, minf=43
  IO depths    : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,45590,0,0 short=45590,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
  WRITE: bw=102KiB/s (105kB/s), 102KiB/s-102KiB/s (105kB/s-105kB/s), io=100.0MiB (105MB), run=1002846-1002846msec
Disk stats (read/write):
    dm-1: ios=0/158286, merge=0/0, ticks=0/1787031, in_queue=1787031, util=99.49%, aggrios=716/132705, aggrmerge=0/54666, aggrticks=6171/1981263, aggrin_queue=1987434, aggrutil=95.09%
  sda: ios=716/132705, merge=0/54666, ticks=6171/1981263, in_queue=1987434, util=95.09%
Any idea why this might be happening? I have checked the RAID controller and it is in good health.
The VMs run fine on both hosts, but their disks are stored on a Ceph cluster.
I want to use the local disks to store the etcd data of a Kubernetes cluster, preferably on an SSD (not yet installed).
But the difference in write performance makes me doubt whether an SSD would be worth it if the problem is not the disks but something else.
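For reference, the same check can also be done from inside the host; a rough sketch, assuming smartmontools is installed and the disks sit behind the onboard Smart Array controller (the cciss index selects the physical drive and may differ):
Bash:
# SMART health of the physical drives behind the Smart Array controller
smartctl -H -d cciss,0 /dev/sda
smartctl -H -d cciss,1 /dev/sda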
 
I have checked the RAID controller and it is in good health.
Same RAID controller on both hosts? Same firmware?

How do the BIOS versions compare on both hosts? (A sketch of commands to compare firmware, BIOS and controller cache settings is below.)

What is the condition/integrity of the physical HDDs in question? Maybe compare them against the other hosts.
If you want to be really creative, swap the disks between the 2 servers ("same hardware"!) and see what output you get then.

Another thing: were the hosts "busy" with other tasks while testing? I'd only run the test without any other VMs/LXCs running at the time.
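A rough sketch of commands to compare on both hosts, assuming HP's ssacli tool is installed (on older tooling it may be called hpssacli or hpacucli, and the controller slot number may differ):
Bash:
# BIOS version and release date
dmidecode -t bios | grep -E 'Version|Release Date'

# Controller model, firmware, cache and battery/capacitor status
ssacli ctrl slot=0 show detail

# Per-array and per-drive details, including drive firmware
ssacli ctrl slot=0 show config detail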
 
Yes, same version of RAID controller firmware and BIOS. According to the HP iLO, the state and integrity of the disks is healthy.

I did first run the test with VMs running on the hosts. However, for the second host I shut everything down and retested; that is the output in my first post. I also found out that without the fdatasync=1 flag the fio test finishes immediately.
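In case it is useful as a cross-check, a variant of the test that bypasses the page cache entirely could look like this (a sketch only; the block size is changed to 4k because O_DIRECT needs aligned I/O, and the job name is arbitrary):
Bash:
fio --rw=write --ioengine=sync --direct=1 --sync=1 --directory=write-test --size=100m --bs=4k --name=direct-write-test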
 
