I am seeing a huge difference in write performance between two Proxmox hosts in the same cluster.
They are both running the same version of Proxmox (8.2.5) and have identical hardware: HP DL380 G8 with 2x Xeon E5-2670, 128GB RAM, and 2x 1TB HDD in RAID1.
I have run fio on both hosts with the same parameters, and the results are very different: latencies in microseconds on the first host and in milliseconds on the second.
Host 1:
Bash:
/var/lib/vz/dump# fio --rw=write --ioengine=sync --fdatasync=1 --directory=write-test --size=100m --bs=2300 --name=kube-storage-test
Bash:
kube-storage-test: (g=0): rw=write, bs=(R) 2300B-2300B, (W) 2300B-2300B, (T) 2300B-2300B, ioengine=sync, iodepth=1
fio-3.33
Starting 1 process
kube-storage-test: Laying out IO file (1 file / 100MiB)
Jobs: 1 (f=1): [W(1)][100.0%][w=15.9MiB/s][w=7231 IOPS][eta 00m:00s]
kube-storage-test: (groupid=0, jobs=1): err= 0: pid=3066826: Mon Oct 21 12:49:25 2024
write: IOPS=7570, BW=16.6MiB/s (17.4MB/s)(100.0MiB/6022msec); 0 zone resets
clat (usec): min=3, max=599, avg= 9.16, stdev= 5.99
lat (usec): min=4, max=599, avg= 9.52, stdev= 6.13
clat percentiles (usec):
| 1.00th=[ 4], 5.00th=[ 5], 10.00th=[ 5], 20.00th=[ 5],
| 30.00th=[ 7], 40.00th=[ 9], 50.00th=[ 9], 60.00th=[ 10],
| 70.00th=[ 10], 80.00th=[ 13], 90.00th=[ 15], 95.00th=[ 16],
| 99.00th=[ 22], 99.50th=[ 26], 99.90th=[ 50], 99.95th=[ 61],
| 99.99th=[ 269]
bw ( KiB/s): min=15722, max=17946, per=100.00%, avg=17018.50, stdev=747.14, samples=12
iops : min= 7000, max= 7990, avg=7577.17, stdev=332.64, samples=12
lat (usec) : 4=2.74%, 10=68.21%, 20=27.20%, 50=1.76%, 100=0.07%
lat (usec) : 250=0.01%, 500=0.01%, 750=0.01%
fsync/fdatasync/sync_file_range:
sync (usec): min=39, max=1051, avg=119.50, stdev=65.44
sync percentiles (usec):
| 1.00th=[ 42], 5.00th=[ 42], 10.00th=[ 43], 20.00th=[ 45],
| 30.00th=[ 53], 40.00th=[ 60], 50.00th=[ 153], 60.00th=[ 159],
| 70.00th=[ 169], 80.00th=[ 180], 90.00th=[ 192], 95.00th=[ 204],
| 99.00th=[ 231], 99.50th=[ 241], 99.90th=[ 330], 99.95th=[ 449],
| 99.99th=[ 570]
cpu : usr=5.41%, sys=35.18%, ctx=96172, majf=0, minf=39
IO depths : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,45590,0,0 short=45590,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=16.6MiB/s (17.4MB/s), 16.6MiB/s-16.6MiB/s (17.4MB/s-17.4MB/s), io=100.0MiB (105MB), run=6022-6022msec
Disk stats (read/write):
dm-1: ios=0/144862, merge=0/0, ticks=0/4904, in_queue=4904, util=57.56%, aggrios=4/96983, aggrmerge=0/51267, aggrticks=0/3550, aggrin_queue=3551, aggrutil=58.74%
sda: ios=4/96983, merge=0/51267, ticks=0/3550, in_queue=3551, util=58.74%
Host 2:
Bash:
/var/lib/vz/dump# fio --rw=write --ioengine=sync --fdatasync=1 --directory=write-test --size=100m --bs=2300 --name=kube-storage-test
Bash:
Starting 1 process
kube-storage-test: Laying out IO file (1 file / 100MiB)
Jobs: 1 (f=1): [W(1)][99.9%][w=69KiB/s][w=31 IOPS][eta 00m:01s]
kube-storage-test: (groupid=0, jobs=1): err= 0: pid=2769123: Mon Oct 21 13:06:33 2024
write: IOPS=45, BW=102KiB/s (105kB/s)(100.0MiB/1002846msec); 0 zone resets
clat (usec): min=8, max=55195, avg=31.18, stdev=314.75
lat (usec): min=9, max=55195, avg=32.07, stdev=314.76
clat percentiles (usec):
| 1.00th=[ 12], 5.00th=[ 14], 10.00th=[ 16], 20.00th=[ 19],
| 30.00th=[ 21], 40.00th=[ 23], 50.00th=[ 25], 60.00th=[ 28],
| 70.00th=[ 32], 80.00th=[ 37], 90.00th=[ 44], 95.00th=[ 51],
| 99.00th=[ 65], 99.50th=[ 72], 99.90th=[ 90], 99.95th=[ 277],
| 99.99th=[ 9896]
bw ( KiB/s): min= 26, max= 148, per=98.91%, avg=101.72, stdev=17.74, samples=2005
iops : min= 12, max= 66, avg=45.47, stdev= 7.88, samples=2005
lat (usec) : 10=0.04%, 20=28.79%, 50=65.89%, 100=5.21%, 250=0.02%
lat (usec) : 500=0.03%
lat (msec) : 10=0.01%, 20=0.01%, 50=0.01%, 100=0.01%
fsync/fdatasync/sync_file_range:
sync (msec): min=4, max=299, avg=21.96, stdev=16.54
sync percentiles (msec):
| 1.00th=[ 7], 5.00th=[ 8], 10.00th=[ 9], 20.00th=[ 10],
| 30.00th=[ 11], 40.00th=[ 18], 50.00th=[ 21], 60.00th=[ 22],
| 70.00th=[ 24], 80.00th=[ 29], 90.00th=[ 43], 95.00th=[ 55],
| 99.00th=[ 79], 99.50th=[ 94], 99.90th=[ 155], 99.95th=[ 182],
| 99.99th=[ 249]
cpu : usr=0.08%, sys=0.45%, ctx=96953, majf=0, minf=43
IO depths : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,45590,0,0 short=45590,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=102KiB/s (105kB/s), 102KiB/s-102KiB/s (105kB/s-105kB/s), io=100.0MiB (105MB), run=1002846-1002846msec
Disk stats (read/write):
dm-1: ios=0/158286, merge=0/0, ticks=0/1787031, in_queue=1787031, util=99.49%, aggrios=716/132705, aggrmerge=0/54666, aggrticks=6171/1981263, aggrin_queue=1987434, aggrutil=95.09%
sda: ios=716/132705, merge=0/54666, ticks=6171/1981263, in_queue=1987434, util=95.09%
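That is the full run; the average fdatasync latency on host 2 is roughly two orders of magnitude worse than on host 1 (about 22 ms vs about 120 µs). If device-level numbers would help, I can run something like this on both hosts while the fio job is active (iostat from the sysstat package) and compare the write latency and utilisation columns:
Bash:
# run alongside the fio job; compare write latency (w_await) and utilisation (%util) for sda on both hosts
iostat -x 1 sda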
Any idea why this might be happening? I have checked the RAID controller and it is in good health.
The VMs are running fine on both hosts, but their disks are stored on a Ceph cluster, not on these local drives.
I want to use the local disks to store the etcd data of a Kubernetes cluster, preferably on an SSD (not yet installed).
But this difference in write performance makes me doubt whether an SSD would be worth it, if the problem is not the disks but something else.
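For completeness, this is the kind of controller check I can run on both hosts (HPE Smart Array CLI; the package is ssacli, or hpssacli on older tool versions), in case the write cache or battery/capacitor status differs between the two controllers:
Bash:
# overall controller and cache/battery status
ssacli ctrl all show status
# full details, including cache ratio and per-drive write cache settings
ssacli ctrl all show config detail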