Hello,
we are using a single PVE node with a ZFS RAID-Z pool, where all the VMs reside. We have 12 VMs running; RAM and CPU are not stressed at all.
However, we noticed that many VMs report blocked tasks. This is especially true for a gitlab-runner VM running Docker, which repeatedly cannot connect to the Docker engine.
After running the fio benchmark we saw write speeds of less than 400 kB/s. This is sequential sync writes with a 4k block size (see the command below). iostat shows writes of around 40 MB/s, which hopefully is not the upper limit to expect from modern disks. Any hints on where to look would be appreciated.
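For reference, the ZFS-side state and the properties that affect sync-write performance can be pulled with the following commands (the pool name `rpool` is a placeholder; substitute the actual pool):

```shell
# Pool name "rpool" is a placeholder -- substitute the actual pool name.
zpool status -v rpool                             # pool health and vdev layout
zpool iostat -vl rpool 5                          # per-vdev throughput and latency, 5s intervals
zfs get sync,recordsize,compression,atime rpool   # properties that affect sync writes
```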
System Information
Hardware |
Mainboard | Supermicro - X12DPi-NT6
CPU(s) | 2 x Intel(R) Xeon(R) Gold 5318Y CPU @ 2.10GHz
Memory | 8x 32GB RAM - ATP X4B32QB4BNWESO-7-TN1
Disks |
HBA Controller | Broadcom / LSI 9500-8i Tri-Mode HBA
Network |
Graphics | ASPEED Technology, Inc. ASPEED Graphics Family
Software |
Kernel Version | Linux 6.8.8-1-pve (2024-06-10T11:42Z)
Boot Mode | EFI
Manager Version | pve-manager/8.2.4/faa83925c9641325
KSM sharing | 0 B (Off)
IO delay | < 1%
Load average | 3.17, 2.35, 2.25
Benchmark:
Code:
# fio --filename=test --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=300 && rm test
test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=4
fio-3.33
Starting 1 process
test: Laying out IO file (1 file / 10240MiB)
note: both iodepth >= 1 and synchronous I/O engine are selected, queue depth will be capped at 1
Jobs: 1 (f=1): [W(1)][100.0%][w=56KiB/s][w=14 IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=250809: Thu Jun 20 17:11:08 2024
write: IOPS=60, BW=241KiB/s (247kB/s)(70.7MiB/300007msec); 0 zone resets
clat (msec): min=2, max=2232, avg=16.58, stdev=43.58
lat (msec): min=2, max=2232, avg=16.58, stdev=43.58
clat percentiles (msec):
| 1.00th=[ 4], 5.00th=[ 4], 10.00th=[ 4], 20.00th=[ 4],
| 30.00th=[ 4], 40.00th=[ 6], 50.00th=[ 8], 60.00th=[ 11],
| 70.00th=[ 15], 80.00th=[ 22], 90.00th=[ 37], 95.00th=[ 52],
| 99.00th=[ 113], 99.50th=[ 176], 99.90th=[ 542], 99.95th=[ 995],
| 99.99th=[ 1787]
bw ( KiB/s): min= 8, max= 912, per=100.00%, avg=249.30, stdev=195.70, samples=580
iops : min= 2, max= 228, avg=62.32, stdev=48.92, samples=580
lat (msec) : 4=30.94%, 10=28.59%, 20=18.42%, 50=16.74%, 100=4.10%
lat (msec) : 250=0.92%, 500=0.17%, 750=0.03%, 1000=0.03%, 2000=0.04%
lat (msec) : >=2000=0.01%
cpu : usr=0.02%, sys=0.18%, ctx=18518, majf=0, minf=10
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,18089,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=4
Run status group 0 (all jobs):
WRITE: bw=277KiB/s (284kB/s), 277KiB/s-277KiB/s (284kB/s-284kB/s), io=81.2MiB (85.2MB), run=300039-300039msec
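As a sanity check, the reported numbers are internally consistent: with a synchronous ioengine the effective queue depth is capped at 1 (as fio notes above), so bandwidth is bounded by the average commit latency. A rough back-of-the-envelope using the avg clat of 16.58 ms from the output above:

```shell
# Queue depth 1: max IOPS ~ 1 / avg commit latency; bandwidth = IOPS * block size.
awk 'BEGIN {
  clat = 16.58e-3              # avg commit latency in seconds (from fio output above)
  iops = 1 / clat              # upper bound on IOPS at queue depth 1
  printf "%.0f IOPS, %.0f KiB/s\n", iops, iops * 4096 / 1024
}'
# Prints roughly 60 IOPS, 241 KiB/s -- matching the fio-reported figures.
```

So the ~240 KiB/s is simply per-write latency, not a bandwidth ceiling; the question is why each sync write takes ~17 ms on average.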