Kernel 6.8.4-2 causes random server freezing

My experience so far: I pinned the kernel to 6.5.
Then I updated one machine, out of four, to 6.8.4-3-pve. The machine has an AMD Opteron(tm) Processor 6380 (2 sockets / 32 cores). It has now been running for 37 days without a crash. But this machine crashed only twice with 6.8.4-3-pve, within a week or so. The other machines (same hardware) stalled much more often.

This might help somebody.
 
Yes, upgrade to at least proxmox-kernel 6.8.8-1.
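For anyone checking several hosts against that minimum, here is a minimal Python sketch that compares a kernel release string (as printed by `uname -r`) with 6.8.8-1. The helper names are made up for this example, and it assumes the usual `X.Y.Z-N-pve` naming seen in this thread; it is not an official Proxmox tool.

```python
import platform
import re

# Matches release strings such as "6.8.12-4-pve" or "6.8.8-1".
# Assumption: Proxmox kernels follow the X.Y.Z-N[-pve] pattern seen in this thread.
_RELEASE_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)-(\d+)")


def parse_pve_kernel(release: str) -> tuple:
    """Return (major, minor, patch, build) as integers."""
    match = _RELEASE_RE.match(release.strip())
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def is_at_least(release: str, minimum: str = "6.8.8-1") -> bool:
    """True if `release` is the same as or newer than `minimum`."""
    # Tuples compare element by element, so 6.8.12 sorts after 6.8.8.
    return parse_pve_kernel(release) >= parse_pve_kernel(minimum)


def running_kernel_is_at_least(minimum: str = "6.8.8-1") -> bool:
    """Check the kernel this script is running on (same string as `uname -r`)."""
    return is_at_least(platform.release(), minimum)
```

On a Proxmox host, `running_kernel_is_at_least()` would tell you whether the booted kernel meets the suggested minimum. Comparing the parts as integers matters here: a plain string comparison would sort 6.8.12 before 6.8.8.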
The latest PVE ISO (fully updated) causes those freezes on my Ryzen 3900X with the X570 chipset. Using the 8.1 version stopped the issue, so I will wait for the next release and see.
 
The latest kernel, Linux 6.8.8-2-pve, seems to be OK.
One experimental server with this kernel has been running for a week and did not freeze,
so I will upgrade the rest of them.
 
antonin.chadima, did you have any issues with 6.8.8-2 so far?
It is still OK. Some very old VMs, like Debian 4, 5, 6, etc., and similarly old Ubuntu releases (yes, we have to use them), randomly have problems (with I/O?) and sometimes freeze with a kernel panic. Not often, though.

But it is ok!
 
Same problem on all of my Genoa hosts. I am going to downgrade now, as other possible causes have been eliminated.
The hosts just die without any system or kernel buffer logs, and it is not related to high load or anything like that.

The OS doesn't respond to shutdown requests or keyboard input.
The affected hosts are running on 6.8.12-1-pve.

I notice buffer/cache occupancy dropping while iowait increases, and then the host is gone without logging anything.
I initially suspected an fstrim-related issue, since this occurs every week, which reminded me of the standard systemd timer configuration that runs fstrim once a week, but this doesn't seem to be related.
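Since the hosts die without logging anything, one way to capture that buffer/cache and iowait trend is a small sampler that forces every line to disk. A rough Python sketch, assuming the standard Linux `/proc/stat` and `/proc/meminfo` layouts; the log path and function names are made up for this example, and if the storage itself is what stalls, the `fsync` call may block as well.

```python
import os
import time


def parse_meminfo(text: str) -> dict:
    """Map /proc/meminfo field names to their numeric values (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def parse_cpu_times(stat_text: str) -> list:
    """Return the counters of the aggregate "cpu" line in /proc/stat."""
    for line in stat_text.splitlines():
        if line.startswith("cpu "):
            return [int(value) for value in line.split()[1:]]
    raise ValueError("no aggregate cpu line found")


def iowait_fraction(before: list, after: list) -> float:
    """Share of CPU time spent in iowait between two samples."""
    # First eight counters: user nice system idle iowait irq softirq steal.
    deltas = [new - old for old, new in zip(before[:8], after[:8])]
    total = sum(deltas)
    return deltas[4] / total if total > 0 else 0.0


def _read(path: str) -> str:
    with open(path) as handle:
        return handle.read()


def watch(log_path: str = "/var/log/freeze-watch.log", interval: float = 5.0) -> None:
    """Append one sample per interval; runs until interrupted."""
    previous = parse_cpu_times(_read("/proc/stat"))
    with open(log_path, "a") as log:
        while True:
            time.sleep(interval)
            current = parse_cpu_times(_read("/proc/stat"))
            mem = parse_meminfo(_read("/proc/meminfo"))
            log.write(
                f"{time.strftime('%Y-%m-%d %H:%M:%S')} "
                f"iowait={iowait_fraction(previous, current):.3f} "
                f"buffers_kb={mem.get('Buffers', 0)} cached_kb={mem.get('Cached', 0)}\n"
            )
            log.flush()
            os.fsync(log.fileno())  # push the line to disk before the next sleep
            previous = current
```

Calling `watch()` as root in a tmux session or a simple service leaves the last samples before a freeze in the log file, which can then be compared with the monitoring graphs.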

My next step would have been to swap the SATA boot drives for M.2 NVMe, but after I saw this thread I will downgrade first.
Thanks to the community for participating so actively.
 
I have a single-node server with a Ryzen 9 5950X on an X570D4U motherboard and 2 NVMe drives (ZFS mirror), and I have also experienced random freezes ever since the 6.8 kernel (right now on 6.8.12-4-pve).
It turned out that the freezes happen when the server is idle, even without any VMs running.
The server hangs within 5-30 minutes if I cold-start it with autostart disabled for all VMs and no other load, just pinging it from my machine.
It doesn't hang for weeks if I run any continuous load (e.g. at least one core always busy), and it doesn't matter whether the load is on the host or in a VM. But it will freeze as soon as there is no load for 5-30 minutes.
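For anyone who wants to reproduce the "one core always busy" condition on their own box, a trivial Python sketch follows. The function name is made up for this example; it only keeps a single core spinning for diagnosis, wastes power while it runs, and is not a fix.

```python
import time


def keep_busy(duration_s: float) -> int:
    """Spin on one core for `duration_s` seconds and return the loop count."""
    deadline = time.monotonic() + duration_s
    iterations = 0
    while time.monotonic() < deadline:
        iterations += 1  # pure busy-wait: no sleep, so the core never goes idle
    return iterations
```

Running `keep_busy(float("inf"))` on the host or inside a VM keeps the loop going until it is interrupted. If the freezes stop while it runs and return once it is stopped, that matches the idle-related behaviour described above.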

Update:
6.5.13-6-pve - freezes on idle
6.8.12-4-pve - freezes on idle
6.11.0-1-pve - freezes on idle
 