I'm trying to debug this issue: I have an old Debian VM with kernel 2.6.26 (i386) that was converted from physical to virtual on a temporary PVE 7.x installation. It worked flawlessly on that PVE until it was moved to a new machine with pve-manager/8.1.4/ec5affc9e41f1d79 (running kernel: 6.5.11-7-pve).
The VM uses 4 GB of RAM, standard vCPU configuration and an LSI 53C895A (again standard) emulated SCSI controller because the VM kernel does not support Virtio-SCSI (it supports the older Virtio controller).
On the new PVE 8.1, the VM hangs with lots of these errors:
BUG: soft lockup - CPU#1 stuck for 96s! [apache2:14807]
BUG: soft lockup - CPU#0 stuck for 96s! [swapper:0]
BUG: soft lockup - CPU#3 stuck for 96s! [rsyslogd:14853]
When it happens, it seems to be the disk I/O that dies. No more I/O AT ALL: the VM cannot read or write. After a power cycle, the VM works fine again.
Since it's something that happens with the virtual disk (at least in my opinion) I have tried switching from LSI 53C895A to Virtio (not virtio-scsi) but the issue persists.
The virtual disk is configured as default: no cache, iothread enabled, Async IO is default (io_uring).
I/O on the host seems to be fine, there are no errors in the host log, and the other VMs (modern Debian) work just fine. The disks are 2 NVMe disks in RAID1 (mdadm) with LVM on top.
The hangs are not clearly related to anything (not to backups, for example). I/O on the whole host is quite low: IO delay is under 5% even at the (rare) peaks, and it's usually under 0.5%.
Any ideas?
Thanks a lot.