Hello Everybody,
I'm not sure whether this thread (https://forum.proxmox.com/threads/kernel-panic-since-update-to-proxmox-7-1.101164/#post-437435) is related, but since we updated to PVE 7.2 we have had repeated crashes with the kernel message
Code:
INFO: task khugepaged:796 blocked for more than 120 seconds.
followed by multiple
Code:
INFO: task iou-wrk-9815:650094 blocked for more than 120 seconds.
leading to a stuck VM (PID 9815 in this case). We can no longer run
Bash:
ps ax
It gets stuck just before displaying PID 9815.
Once we kill this process hard with kill -9, the behaviour of ps stays the same, and we still have a kworker/36:3+events process running with high load.
Ultimately we have to hard (!!) reset the machine, since a shutdown never manages to kill the blocked process.
Kernel version is 5.15.35-1-pve, PVE version is 7.2-3. The system is a pure KVM compute machine served by Ceph from separate Ceph nodes, but the stuck VM (same as last time) has its storage on a local NVMe RAID (via a Dell PERC H755 NVMe controller), if this matters.
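In case it helps anyone compare logs, here is a minimal sketch for pulling the hung-task reports out of a kernel log. The function name hung_tasks is made up for illustration, and it assumes the messages have exactly the "INFO: task NAME:PID blocked for more than N seconds." format quoted in this post; it only summarizes the log and does nothing about the hang itself.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of Proxmox or any existing tool):
# reads kernel log text on stdin and prints "task-name pid seconds"
# for every hung-task report. Lines that do not match are dropped.
hung_tasks() {
    sed -n -E 's/.*INFO: task (.+):([0-9]+) blocked for more than ([0-9]+) seconds\..*/\1 \2 \3/p'
}

# Example usage on a live system (reading the kernel log may need root):
#   dmesg | hung_tasks | sort | uniq -c
```

Piping the output through sort and uniq -c, as in the commented usage line, gives a count per task, which should make it easier to see whether the same iou-wrk worker is reported every time.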