Kernel Panic, whole server crashes about every day

Can you please open a new thread for that? The current one is about io_uring and a bug we actually managed to reproduce after a lot of testing and then fixed together with upstream kernel devs.

pvesr itself does not use io_uring at all currently, so this seems unrelated to this thread and is possibly an issue with ZFS on your system.

It would also be great if you could add some details about the system (CPU/motherboard) and the ZFS setup in the new thread, thanks!

Moved to https://forum.proxmox.com/threads/p...ion-triggers-kernel-crash-on-proxmox-7.95075/
 
@t.lamprecht, as I understand it the bug is fixed, right? In which kernel version was the fix implemented?
Yes, the io_uring related one, which this thread was initially about, is fixed. See:
FYI, there's a newer kernel available as package pve-kernel-5.11.22-3-pve, version 5.11.22-6, which solves an issue with some unexpected EAGAINs that the io_uring kernel code got from some subsystems' softirq code paths.
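For anyone who wants to check a node against the kernel named above, here is a minimal sketch, not an official tool. It assumes the running release string looks like `5.11.22-3-pve` (what `uname -r` prints); the helper names `parse_pve_release` and `is_at_least` are made up for this example. Note that the release string only carries the ABI part, so the package version (5.11.22-6) still has to be confirmed separately, for example with `pveversion -v`.

```python
import platform
import re

# ABI name of the kernel package mentioned in the post above.
FIXED_RELEASE = "5.11.22-3-pve"


def parse_pve_release(release):
    """Turn '5.11.22-3-pve' into (5, 11, 22, 3); a missing ABI number counts as 0."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)(?:-(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def is_at_least(release, minimum=FIXED_RELEASE):
    """True if `release` is the same as or newer than `minimum`."""
    return parse_pve_release(release) >= parse_pve_release(minimum)


if __name__ == "__main__":
    running = platform.release()  # same string as `uname -r`
    try:
        verdict = "at or above" if is_at_least(running) else "older than"
        print(f"{running} is {verdict} {FIXED_RELEASE}")
    except ValueError as err:
        print(err)
```

A passing check only means the ABI is new enough; it does not prove the fixed package build is the one installed.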
 
Hi all, for me it's not resolved. I'm again getting a crash every 2-3 days since the kernel update.
I'm going to try to set up a stack trace listener so I can post the trace here.
 
It ended up needing a new SSD for rpool/boot.
The old one (bought new in 12/2020) showed SMART errors at boot and only allowed me 2 backups before dying for good.
 
Hey, not my achievement, I just took my time to search the webz.
If you can afford a few more crashes for the sake of testing, please reset your BIOS to defaults first and then _only_ set the following options (or the equivalents for your BIOS):
SVM = enable (virtualization, aka VT-x in the Intel world)
IOMMU = enable (default = auto on most X570 boards)
Power idle control = Typical current idle (C-state 6 disabled on some X570 boards)

Do not change anything else (no need to disable C-states completely, or "AMD Cool&Quiet" or anything like that; also no need to set B2 stepping). Just leave everything else at its default.
Is it stable? I would appreciate your info. Thanks.
Still applicable now. I had been struggling with this problem (even with the latest edge kernel). You saved my day, thank you.

Base Board Information
Manufacturer: Micro-Star International Co., Ltd
Product Name: B450 TOMAHAWK MAX II (MS-7C02)

CPU(s): 12 x AMD Ryzen 5 2600 Six-Core Processor (1 Socket)
Kernel Version: Linux 5.17.4-edge #1 SMP PREEMPT PVE Edge 5.17.4-1 (2022-04-20)
PVE Manager Version: pve-manager/7.1-12/b3c09de3
 
My nodes with Ryzen 5800X, 5900X and 5950X crashed all the time (at least once per day, with or without load, even without a single VM/CT on them) with PVE 6.4 on both kernels 5.4 and 5.11, and with PVE 7.0. After disabling the C6 state, all of them have now been running rock solid for weeks.
The troubleshooting was hard. Since it's "desktop" hardware, I first thought the non-ECC RAM was the issue, so I bought unbuffered ECC: still crashes. Then research led to something about kernel + AMD hardware, so I updated from 5.4 to 5.11: still crashes. PVE 7.0 had just come out, so I updated: still crashes. Then I sat there and tested tons of BIOS settings, but it took a lot of time because the crashes were sporadic. Then I found a post from last year on a Linux forum where the C6 state was mentioned with Buster + Ryzen 3600... tried it out, and bam!
Try it out, I would be very interested in your results.
I experienced the same this week on v7.4.
Turning off C6 worked like a charm... after 2 days of trying basically everything else.
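
If you want to see which idle states the kernel itself exposes before or after changing the BIOS setting, the following is a rough sketch that only reads the standard Linux cpuidle sysfs files for cpu0. The function name `read_idle_states` is made up for this example, and it is an assumption that the names shown there (often C1, C2, C3 with the ACPI idle driver) tell you anything about the firmware "C6" option; they may not map one-to-one, so treat the output as a hint only.

```python
from pathlib import Path

# Standard cpuidle sysfs location for the first CPU; other CPUs have the same layout.
CPUIDLE_ROOT = Path("/sys/devices/system/cpu/cpu0/cpuidle")


def read_idle_states(root=CPUIDLE_ROOT):
    """Return a list of (index, name, disabled) for each stateN directory under root."""
    state_dirs = sorted(
        Path(root).glob("state[0-9]*"),
        key=lambda p: int(p.name[len("state"):]),
    )
    states = []
    for state_dir in state_dirs:
        index = int(state_dir.name[len("state"):])
        name = (state_dir / "name").read_text().strip()
        # The 'disable' file holds "0" when the state may be used.
        disabled = (state_dir / "disable").read_text().strip() != "0"
        states.append((index, name, disabled))
    return states


if __name__ == "__main__":
    found = read_idle_states()
    if not found:
        print("no cpuidle states found (no cpuidle driver loaded?)")
    for index, name, disabled in found:
        print(f"state{index}: {name} ({'disabled' if disabled else 'enabled'})")
```

This is read-only and changes nothing on the system; the actual fix discussed in this thread is still the BIOS setting.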
 
