I ran PVE on kernel 6.8 for a long time with no issues. I updated to PVE 9.0 back in August and immediately had problems with the VM that uses PCI passthrough. I posted about it here, and fiona responded here, saying kernel changes might be needed. At that time the current kernel was 6.14. I pinned the 6.8 kernel again and have been running fine since then.
I wanted to try the new 6.17 kernel in the hope that it would fix my issue, but unfortunately it did not. The behavior has changed slightly: the VM appears to be running (100% RAM and 100% of a single core, so 25% CPU if I allocate four cores, 50% CPU if I allocate two cores, 100% CPU if I allocate one core), but it never actually boots. I also can't attach the console; it always fails with:
VM 102 qmp command 'set_password' failed - unable to connect to VM 102 qmp socket - timeout after 50 retries
The issue goes away when I hard-stop the VM, remove the PCI passthrough device, and boot the VM again. It also goes away if I pin the 6.8 kernel with
proxmox-boot-tool kernel pin 6.8.12-13-pve
and then reboot.
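For reference, the CPU percentages above are just one fully busy vCPU divided by the number of allocated cores. This assumes the PVE summary averages usage across all of the VM's cores. A minimal sketch of that arithmetic (busy_percent is a made-up helper name, not a PVE tool):

```shell
#!/bin/sh
# busy_percent BUSY CORES
# Hypothetical helper: integer percentage shown when BUSY of CORES vCPUs
# are pegged at 100% and the rest are idle.
busy_percent() {
    busy="$1"
    cores="$2"
    echo $(( busy * 100 / cores ))
}

# One stuck vCPU with 4, 2 and 1 allocated cores.
for cores in 4 2 1; do
    echo "$cores core(s): $(busy_percent 1 "$cores")%"
done
```

So a VM that sits at exactly 100 divided by its core count looks like a single vCPU spinning, which matches what I see.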
I am currently trying 6.17.4-1-pve on the latest version of PVE.
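In case it helps anyone reproducing this, here is a small sketch for confirming which kernel the host actually booted after pinning or upgrading. It only compares the output of uname -r against the version you expect. kernel_status is a hypothetical helper name I made up for illustration, and the version strings are the ones from this post.

```shell
#!/bin/sh
# kernel_status EXPECTED [RUNNING]
# Hypothetical helper: prints "match" when the running kernel equals EXPECTED.
# RUNNING defaults to the output of `uname -r`; passing it explicitly is only
# useful for trying the function without rebooting.
kernel_status() {
    expected="$1"
    running="${2:-$(uname -r)}"
    if [ "$running" = "$expected" ]; then
        echo "match"
    else
        echo "mismatch: running $running, expected $expected"
    fi
}

# Check the kernel under test on this host.
kernel_status "6.17.4-1-pve"
```

After pinning 6.8 and rebooting, the same check with 6.8.12-13-pve as the argument should print match if the pin took effect.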
I've attached the same logs as last time: VM configuration, GDB output, PVE package versions, ps faxl output, the last hour of PVE server log.
This is an Odroid H4+ with a 12th Gen Intel N97 CPU, if that helps.
I'm looking to figure out if this is still an issue with the kernel, or if maybe there's a setting I can apply that will make this work.
edit: There's also a report on the bug tracker:
https://bugzilla.proxmox.com/show_bug.cgi?id=7176