2 identical nodes: 1 stable, 1 kernel panics every day

lightmaster

New Member
Feb 8, 2023
I have two identical nodes running Proxmox (with a third device acting as a QDevice). The only difference between them is that one has more memory than the other. One has been rock stable since I first installed Proxmox 7 on it; the other can't go more than 24 hours without a kernel panic, and sometimes crashes as little as an hour after boot. I have tried swapping in a known-good set of RAM sticks and running the computer's built-in diagnostic suite (RAM, CPU, SSD tests). I also installed Pop!_OS and ran it for roughly a week without a crash. After each crash I looked through syslog and messages and didn't see anything that stands out before the crash. I do have a picture of what little information is shown on the screen while the machine sits in the kernel panic state, waiting to be powered off.

Is this picture enough to figure out what's causing the constant crashing? If not, what can I do to narrow down the cause?


20230217-060034.jpg
 
Could be a clock/timer issue or a powerstate/idle problem (or something else completely). Are both systems on the same/latest BIOS version? Do they have identical BIOS settings (C-states etc.)?
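
If it helps with the comparison, here is a rough Python sketch (not an official Proxmox tool) that collects the BIOS version and the cpuidle C-state list from the standard Linux sysfs locations, so the reports from the two nodes can be diffed. The function names `collect` and `diff_reports` are made up for this example, and the `root` parameter exists only so the code can be pointed at a copied directory tree. Note that sysfs only shows what the kernel exposes; BIOS setup options themselves still have to be compared by hand.

```python
import json
from pathlib import Path


def read_bios_info(root="/"):
    """Return BIOS vendor/version/date from the DMI sysfs entries (None if missing)."""
    dmi = Path(root) / "sys/class/dmi/id"
    info = {}
    for key in ("bios_vendor", "bios_version", "bios_date"):
        entry = dmi / key
        info[key] = entry.read_text().strip() if entry.is_file() else None
    return info


def read_cstates(root="/", cpu=0):
    """Return {state_name: disable_flag} for one CPU's cpuidle states."""
    base = Path(root) / f"sys/devices/system/cpu/cpu{cpu}/cpuidle"
    states = {}
    if not base.is_dir():
        return states  # cpuidle not exposed on this system
    for state_dir in sorted(base.glob("state*")):
        name_file = state_dir / "name"
        if not name_file.is_file():
            continue
        disable_file = state_dir / "disable"
        flag = disable_file.read_text().strip() if disable_file.is_file() else None
        states[name_file.read_text().strip()] = flag
    return states


def collect(root="/"):
    """Gather the values worth comparing between the two nodes."""
    return {"bios": read_bios_info(root), "cstates": read_cstates(root)}


def diff_reports(a, b):
    """Return (section, key, value_a, value_b) tuples for every differing entry."""
    diffs = []
    for section in sorted(set(a) | set(b)):
        sec_a, sec_b = a.get(section, {}), b.get(section, {})
        for key in sorted(set(sec_a) | set(sec_b)):
            if sec_a.get(key) != sec_b.get(key):
                diffs.append((section, key, sec_a.get(key), sec_b.get(key)))
    return diffs


if __name__ == "__main__":
    # Run on each node, save the output, then compare the two files.
    print(json.dumps(collect(), indent=2, sort_keys=True))
```

Run it on both nodes and compare the JSON output (or load both files and pass them to `diff_reports`). A different `bios_version`, or a C-state that appears on one node but not the other, would be a good place to start.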
 
