io_uring feedback

chrcoluk

what kind of data corruption? Could you provide a few more details about the configuration, i.e. what storage was used, what disk controllers, disk settings? Best to open a separate thread and mention me there with @fiona
@fiona

Hi

The confirmed corruption was on the boot files in a Windows guest. It happened multiple times after upgrading from Proxmox 6.x to 7.x, and I then noticed the new io_uring default; as soon as I changed it to aio=native, the problems stopped and stayed stopped.

I then set up a new Windows VM using io_uring purely to test whether the problem would come back, this time using different physical drives as well. The boot files got corrupted again.

VM configuration:

6 GB RAM, no balloon
1 socket, 4 cores, CPU type EPYC
SeaBIOS
VirtIO GPU, 16 MB
q35 machine
VirtIO SCSI
zvol drive, 50G size, ssd=1, discard on, cache=none, throttled to 30000 write IOPS and 500 MB/s writes; the zvol has a 64k volblocksize. The pool is a ZFS mirror with 2 SSDs. No SMART errors, no scrub errors.

Proxmox 7.1-12, so it needs updating. Out of my 3 Proxmox hosts, this is the one I updated to 7.x first; I can upgrade it to 7.3 and retest with io_uring.
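
For reference, a minimal sketch of what that disk line looks like in /etc/pve/qemu-server/<vmid>.conf after switching to native AIO (the storage and volume names below are placeholders, not copied from my actual config):

scsi0: local-zfs:vm-100-disk-0,aio=native,cache=none,discard=on,iops_wr=30000,mbps_wr=500,size=50G,ssd=1

The same setting is exposed per disk in the GUI as "Async IO".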
 
Found this: I can confirm 'detect-zeroes=unmap' is configured by Proxmox on the VM, and this is an issue affecting even the latest QEMU 7.2 according to those discussions. However, I am not using the VirtIO block device; I am using VirtIO SCSI instead.

https://gitlab.com/qemu-project/qemu/-/issues/1404

According to this, it's only useful for legacy OSes that have no native unmap/trim support, and it can be compute-intensive as well.

https://serverfault.com/a/1022675/588681
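
For anyone wanting to check this on their own setup, the generated QEMU command line can be inspected, e.g. (100 is a placeholder VM ID):

qm showcmd 100 --pretty | grep detect-zeroes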
 
Hi,
The confirmed corruption was on the boot files in a Windows guest; it happened multiple times after upgrading from Proxmox 6.x to 7.x ... I can upgrade it to 7.3 and retest with io_uring.
What kernel were you using at the time the issues appeared?
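It is shown in the first line of pveversion -v output, or can be checked directly on the host, for example:

uname -r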

Found this: I can confirm 'detect-zeroes=unmap' is configured by Proxmox on the VM, and this is an issue affecting even the latest QEMU 7.2 according to those discussions. However, I am not using the VirtIO block device; I am using VirtIO SCSI instead.

https://gitlab.com/qemu-project/qemu/-/issues/1404
We have not released our version of QEMU 7.2 yet (prior versions are not affected by this bug), and the version we release will already include a fix for it, see here.
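
To see which QEMU build a node currently has installed, something like this can be used:

pveversion -v | grep pve-qemu-kvm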
 
It looks like I ran into a similar issue last weekend. I had to restore 2 VMs from my backup in order to get things working again: Windows 2012 and Windows 2019. Windows 2019 went straight into repair mode.

proxmox-ve: 7.3-1 (running kernel: 5.15.83-1-pve)
pve-manager: 7.3-4 (running version: 7.3-4/d69b70d4)
pve-kernel-helper: 7.3-2
pve-kernel-5.15: 7.3-1
pve-kernel-5.4: 6.4-20
pve-kernel-5.15.83-1-pve: 5.15.83-1
pve-kernel-5.4.203-1-pve: 5.4.203-1
pve-kernel-5.4.195-1-pve: 5.4.195-1
pve-kernel-5.4.189-2-pve: 5.4.189-2
pve-kernel-5.4.189-1-pve: 5.4.189-1
pve-kernel-5.4.178-1-pve: 5.4.178-1
pve-kernel-5.4.174-2-pve: 5.4.174-2
pve-kernel-5.4.162-1-pve: 5.4.162-2
pve-kernel-5.4.157-1-pve: 5.4.157-1
pve-kernel-5.4.106-1-pve: 5.4.106-1
ceph: 15.2.17-pve1
ceph-fuse: 15.2.17-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: 0.8.36+pve2
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.3
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.3-1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-1
libpve-guest-common-perl: 4.2-3
libpve-http-server-perl: 4.1-5
libpve-storage-perl: 7.3-1
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.3.2-1
proxmox-backup-file-restore: 2.3.2-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.0-1
proxmox-widget-toolkit: 3.5.3
pve-cluster: 7.3-2
pve-container: 4.4-2
pve-docs: 7.3-1
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-7
pve-firmware: 3.6-2
pve-ha-manager: 3.5.1
pve-i18n: 2.8-1
pve-qemu-kvm: 7.1.0-4
pve-xtermjs: 4.16.0-1
qemu-server: 7.3-2
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+2
vncterm: 1.7-1
zfsutils-linux: 2.1.7-pve3

2019 VM configuration:

BIOS UEFI
Machine: pc-i440fx-5.2
Virtio SCSI Single with 2 disks:
Cache write back
Discard enabled
IO Thread enabled
SSD Emulation enabled
Backup enabled
Async IO: Default (io_uring)
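
As a rough sketch (pool and volume names are placeholders), those settings correspond to drive lines like the following in qm config <ID>; no aio= option on the line means the "Default (io_uring)" setting applies:

scsi0: ceph-pool:vm-101-disk-0,cache=writeback,discard=on,iothread=1,size=100G,ssd=1
scsi1: ceph-pool:vm-101-disk-1,cache=writeback,discard=on,iothread=1,size=500G,ssd=1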

Checked all other VMs; Async IO is set to Default (io_uring).

The cluster has been running Proxmox 7.3 for over 2 weeks. When I look at the time the servers last reported in, my backup was still running.

The syslog comes back with the following:

Jan 28 05:37:56 BENS-NODE02 kernel: [47204.701852] CIFS: VFS: \\IPadress sends on sock 00000000ce9e3f20 stuck for 15 seconds
Jan 28 05:37:56 BENS-NODE02 kernel: [47204.701936] CIFS: VFS: \\IPadress Error -11 sending data on socket to server
Jan 28 05:37:56 BENS-NODE02 pvestatd[2820]: unable to activate storage 'Synology-NAS' - directory '/mnt/pve/Synology-NAS' does not exist or is unreachable
Jan 28 05:37:56 BENS-NODE02 pvestatd[2820]: status update time (54.554 seconds)
Jan 28 05:38:00 BENS-NODE02 pvestatd[2820]: got timeout
Jan 28 05:38:00 BENS-NODE02 pvestatd[2820]: unable to activate storage 'Synology-NAS' - directory '/mnt/pve/Synology-NAS' does not exist or is unreachable
Jan 28 05:38:10 BENS-NODE02 pvestatd[2820]: got timeout
Jan 28 05:38:10 BENS-NODE02 pvestatd[2820]: unable to activate storage 'Synology-NAS' - directory '/mnt/pve/Synology-NAS' does not exist or is unreachable
Jan 28 05:38:19 BENS-NODE02 pvestatd[2820]: got timeout
Jan 28 05:38:19 BENS-NODE02 pvestatd[2820]: unable to activate storage 'Synology-NAS' - directory '/mnt/pve/Synology-NAS' does not exist or is unreachable
Jan 28 05:38:30 BENS-NODE02 pvestatd[2820]: got timeout

Still, it's strange that 2 VMs got corrupted while 7 others did not?
 
Hi,
It looks like I ran into a similar issue last weekend. I had to restore 2 VMs from my backup in order to get things working again: Windows 2012 and Windows 2019. Windows 2019 went straight into repair mode.
do you know what the corruption looked like? Did the VMs stop working right away or during the next boot (lost partition table?)?

Jan 28 05:37:56 BENS-NODE02 kernel: [47204.701852] CIFS: VFS: \\IPadress sends on sock 00000000ce9e3f20 stuck for 15 seconds
Jan 28 05:37:56 BENS-NODE02 kernel: [47204.701936] CIFS: VFS: \\IPadress Error -11 sending data on socket to server
Jan 28 05:37:56 BENS-NODE02 pvestatd[2820]: unable to activate storage 'Synology-NAS' - directory '/mnt/pve/Synology-NAS' does not exist or is unreachable
Might've been a network issue/hang. Is Synology-NAS the CIFS mount that errored out? Do your VMs' disks reside on that storage? That could be the root cause of the corruption in your case.
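
To double-check the state of that storage from the node, something like the following can be used:

pvesm status
ls /mnt/pve/Synology-NAS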

Still, it's strange that 2 VMs got corrupted while 7 others did not?
Maybe they were in a more consistent (file system) state when the storage disconnect happened so it didn't affect them as badly?
 
Good morning Fiona,

My RMM software reported that the 2012 & 2019 servers weren't responding at all. I did a reboot; after that, it went straight into repair mode.
After starting the 2012 server, the Windows logo was briefly shown and then the screen went black.

Is Synology-NAS the CIFS mount that errored out? That's correct.
Do your VMs' disks reside on that storage? No, I use the Synology NAS to store my backups. I did both restores from the Synology NAS.
I am running a 3-node Ceph cluster.
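
For context, the Synology share is just a CIFS backup storage; a sketch of its /etc/pve/storage.cfg entry (server address, share name and username are placeholders):

cifs: Synology-NAS
        path /mnt/pve/Synology-NAS
        server 192.168.1.10
        share proxmox-backups
        content backup
        username backup-user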

Regarding the corruption with Async IO set to Default (io_uring): I am seeing several posts mentioning possible corruption because of this setting?

I am going to test my Synology NAS with my test server running kernel 6.1-2.1. If the "does not exist or is unreachable" error stays away, then I will have to upgrade my cluster servers to a newer kernel.
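
If it helps anyone else, installing the opt-in newer kernel on Proxmox VE 7.x should roughly come down to the following (assuming the pve-kernel-6.1 opt-in package; a reboot is required afterwards):

apt update
apt install pve-kernel-6.1
reboot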
 
The system which had the problem has been updated from Proxmox 7.1 to 7.3, and I have created a snapshot, so I am prepared to try it again on a Windows guest with io_uring and fall back to the snapshot if it breaks.
 
The system which had the problem has been updated from Proxmox 7.1 to 7.3, and I have created a snapshot, so I am prepared to try it again on a Windows guest with io_uring and fall back to the snapshot if it breaks.
If you manage to trigger the issue again, please share the output of pveversion -v and qm config <ID> with the affected VM's ID and the relevant part of the storage configuration (/etc/pve/storage.cfg), i.e. for the storage the VM uses.
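I.e., roughly (with 100 as a placeholder for the affected VM's ID):

pveversion -v
qm config 100
cat /etc/pve/storage.cfg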
 
It's been a while, but that's because this kind of testing really needs time. I haven't yet seen signs of new corruption, and I have now assigned io_uring to multiple drives. So either it's solved or it went away for whatever reason; I will never know, I guess, but I will report back if it becomes a problem again.
 
