io_uring on kernel 7.0.6-2-pve (PVE 9.2.3): guest disk I/O errors (EIO) + filesystem XFS shutdown

could be an issue on either end, or in the combination of both. please keep us posted if any new information comes up!
 
By the way, I also reproduced the issue on AlmaLinux 10.1 (kernel 6.12.0-124.52.1.el10_1.x86_64) with this command:
Bash:
fio --name=stress-hdd --ioengine=libaio --iodepth=8 --rw=randwrite \
    --bs=4k --direct=1 --fdatasync=8 --size=20G --numjobs=4 \
    --runtime=1800 --time_based --ramp_time=30 --group_reporting \
    --filename=/tmp/fio-test --output-format=normal,json \
    --output=fio-result.json

However, it seems to take longer and be harder to reproduce on this kernel version.

Is anyone on the Proxmox team able to reproduce the problem with a configuration similar to mine?

EDIT

I would like to add that the affected VMs show one fewer tag on their disk than the unaffected VMs:
1781804954945.png

Healthy VMs (so far):
1781804984402.png

Could format=raw have an impact? I don't know what difference it makes.
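
In case it helps to compare the disk options as text rather than by screenshot, here is a minimal sketch. It assumes the tags shown in the GUI correspond to the comma-separated options on the disk lines of `qm config`; the function name `disk_opts` and the VM IDs 101 and 102 are only placeholders.

```bash
#!/usr/bin/env bash
# disk_opts: read `qm config <vmid>` output on stdin and print one
# "<disk> <option>" pair per line, sorted, so two VMs can be diffed.
# Assumption: disk lines look like "scsi0: <volume>,key=value,key=value".
disk_opts() {
    grep -E '^(scsi|virtio|sata|ide)[0-9]+:' \
        | grep -v 'media=cdrom' \
        | while IFS= read -r line; do
              disk=${line%%:*}      # e.g. "scsi0"
              opts=${line#*: }      # everything after "scsi0: "
              # The first comma-separated field is the volume name, which
              # always differs between VMs, so it is skipped here.
              printf '%s\n' "$opts" | tr ',' '\n' | tail -n +2 | sed "s/^/$disk /"
          done \
        | sort
}

# Usage on the Proxmox host (101 = affected VM, 102 = healthy VM, both hypothetical):
#   diff <(qm config 101 | disk_opts) <(qm config 102 | disk_opts)
```

Any line that shows up in the diff (for example a `format=raw` entry present on only one side) is an option that differs between the two VMs.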
 
Some news:

I was able to reboot one of the two Proxmox hosts on kernel 6.17.13-13-pve and rerun the same stress write test (30 minutes of stress).

I confirm that there is no issue on this kernel version.

So it seems to be a regression in the latest Proxmox kernel series (7.x, or at least 7.0.6-2-pve).

It would be great if someone from the Proxmox staff could try to reproduce the problem or give me more commands or information for debugging.
 
The problem is also triggered on the latest kernel, 7.0.12-1-pve.

Bash:
[root@patchmon ~]# fio --name=stress-hdd --ioengine=libaio --iodepth=8 --rw=randwrite \
    --bs=4k --direct=1 --fdatasync=8 --size=20G --numjobs=8 \
    --runtime=1800 --time_based --ramp_time=30 --group_reporting \
    --filename=/tmp/fio-test --output-format=normal,json \
    --output=fio-result.json
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=3303440384, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=15728369664, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=12987797504, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=4300693504, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=3548835840, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=781250560, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=1649344512, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=4513239040, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=19759464448, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=3274137600, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=17922351104, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=19376713728, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=93745152, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=14806507520, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=19999735808, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=11025801216, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=14361858048, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=16999051264, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=20181307392, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=16825335808, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=15835607040, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=11907469312, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=20892581888, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=13690458112, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=10695925760, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=10553192448, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=20011687936, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=20895113216, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=17017688064, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=13688827904, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=15695372288, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=13781270528, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=20881649664, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=13658992640, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=13343629312, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=14113169408, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=14118154240, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=14968557568, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=13344370688, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=20108697600, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=18358050816, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=10622705664, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=565350400, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=9192083456, buflen=4096
fio: io_u error on file /tmp/fio-test: Input/output error: write offset=18820165632, buflen=4096
[root@patchmon ~]#

On this kernel, it takes more testing time to trigger it.
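
To make runs easier to compare across kernels, the error lines above can be condensed into a count and an offset range. This is only a rough sketch: the function name `fio_eio_summary` and the file `fio-console.log` are made up for illustration, and it assumes the console output was saved in the same format as shown above.

```bash
#!/usr/bin/env bash
# fio_eio_summary: read fio console output on stdin and report how many
# "io_u error ... Input/output error" lines it contains, together with
# the lowest and highest failing write offset (in bytes).
fio_eio_summary() {
    awk '
        /io_u error/ && /Input\/output error/ {
            # Extract the number from "offset=<bytes>".
            if (match($0, /offset=[0-9]+/)) {
                off = substr($0, RSTART + 7, RLENGTH - 7) + 0
                n++
                if (n == 1 || off < min) min = off
                if (n == 1 || off > max) max = off
            }
        }
        END {
            if (n == 0) { print "no EIO lines found"; exit }
            # %.0f avoids integer overflow on offsets beyond 2^31.
            printf "errors=%d min_offset=%.0f max_offset=%.0f\n", n, min, max
        }
    '
}

# Usage (fio-console.log is a hypothetical capture of the terminal output):
#   fio_eio_summary < fio-console.log
```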
 
Answer from AlmaLinux support:

Hello,

Thank you for the additional updates.

Based on the latest test results, this looks more like a regression in the Proxmox 7.x host kernel rather than an AlmaLinux 10.2 guest-only issue.

The key point is that the same VM and the same fio stress-write workload did not reproduce the issue when the Proxmox host was booted with 6.17.13-13-pve, while the issue appears with 7.0.x-pve kernels such as 7.0.6-2-pve and 7.0.12-1-pve.

Also, since the problem can now be reproduced with AlmaLinux 10.1 as well, AlmaLinux 10.2 no longer seems to be the only trigger.

From the guest side, XFS appears to be shutting down as a consequence of receiving EIO / Aborted Command from the virtual block device.

So I think it would be useful to focus the debugging on the Proxmox host kernel / QEMU / virtio-scsi / LVM-thin stack.
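
For anyone collecting guest-side evidence alongside the host debugging, a small filter like the one below might help pull the relevant kernel messages out of the guest log. It is a sketch, not a verified diagnostic: the function name `guest_io_errors` is invented, and the patterns are simply the symptoms named in this thread (I/O errors, Aborted Command, XFS shutdown), so the exact wording in your kernel log may differ.

```bash
#!/usr/bin/env bash
# guest_io_errors: read kernel log text on stdin and keep only lines that
# mention the symptoms discussed in this thread. The patterns are guesses
# based on the thread wording, not an exhaustive list of kernel messages.
guest_io_errors() {
    grep -E -i 'I/O error|Aborted Command|xfs.*shut'
}

# Usage inside the guest while the fio test is running or after it failed:
#   journalctl -k -b --no-pager | guest_io_errors
#   dmesg | guest_io_errors
```

Attaching the timestamps of the first matching lines to the bug report makes it easier to line them up with anything logged on the host at the same moment.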
 
thanks for the additional information! let's continue over in the BZ entry!