Proxmox freezes randomly.

rutrap

New Member
Nov 26, 2024
Hi, I am a teacher and I work at a school. In the computer lab, I set up a system-restoration server using FOG and a file server using OpenMediaVault. Until now, they ran on separate machines, so I decided to virtualize them with Proxmox. The host has an MSI X370 Gaming Pro Carbon motherboard, an AMD Ryzen 7 1700X processor, and 48GB of RAM (2x16GB + 2x8GB). I started two virtual machines, FOG and OMV (16GB RAM, 2 cores), configured them, and everything seemed to work, but from the very beginning Proxmox has occasionally "frozen." Nothing can be done: the web GUI and SSH do not respond, and I have to reset the machine. The monitor connected to the Proxmox host shows the following errors:

root@pve25: # [11218.538901] igb 0000:21:00.0 eno1: NETDEV WATCHDOG: CPU: 11: transmit queue 1 timed out 5455 ms
[11260.804462] watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:153]
[11256.290168] nmi_backtrace_stall_check: CPU 0: NMIs are not reaching exc_nmi() handler, last activity: 941852 jiffies ago.
[11267.168444] watchdog: Watchdog detected hard LOCKUP on cpu 11
[11266.294210] nmi_backtrace_stall_check: CPU 4: NMIs are not reaching exc_nmi() handler, last activity: 1834999 jiffies ago.
[11286.302450] nmi_backtrace_stall_check: CPU 8: NMIs are not reaching
[11276.298415] nmi_backtrace_stall_check: CPU 6: NMIs are not reaching exc_nmi() handler, last activity: 936681 jiffies ago.

The two virtual machines do not even use half of the CPU and memory resources. I updated the motherboard BIOS and disabled the Ballooning Device option for memory, but it didn't help. I removed the 2x8GB RAM, and that made no difference either. Recently, another error appeared, after which I again couldn't do anything but reset:

root@pve25: # [ 9734.619739] nmi_backtrace_stall_check: CPU 0: NMIs are not reaching exc_nmi() handler, last activity: 1557393 jiffies ago.
[10304.630894] nmi_backtrace_stall_check: CPU 0: NMIs are not reaching exc_nmi() handler, last activity: 2127422 jiffies ago.

I tried almost all the solutions provided on this forum for similar cases, but none of them helped.

What else can I do? Could it be hardware issues?
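For anyone comparing their own console output with the messages quoted above, here is a minimal Python sketch that tallies the four message types seen in this post and the CPUs they name. The function name `summarize` and the category labels are made up for illustration, and the patterns are only matched against the exact wording quoted here; other kernel versions may phrase these messages differently.

```python
import re
from collections import Counter

# Patterns for the message types that appear in the quoted console output.
# The labels on the left are illustrative names, not kernel terminology.
PATTERNS = {
    "netdev_watchdog": re.compile(r"NETDEV WATCHDOG"),
    "soft_lockup": re.compile(r"soft lockup - CPU#(\d+)"),
    "hard_lockup": re.compile(r"hard LOCKUP on cpu (\d+)"),
    "nmi_stall": re.compile(r"nmi_backtrace_stall_check: CPU (\d+)"),
}


def summarize(lines):
    """Count each message type and collect the CPU numbers it mentions."""
    counts = Counter()
    cpus = {}
    for line in lines:
        for label, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match is None:
                continue
            counts[label] += 1
            # Only patterns with a capture group carry a CPU number.
            if match.groups():
                cpus.setdefault(label, set()).add(int(match.group(1)))
    return counts, cpus
```

The lines could come from a saved copy of the kernel log, for example the output of `journalctl -k -b -1` if the journal is persistent. After a hard lockup the last messages may never reach disk, so a photo of the screen may remain the only record.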
 
NMI errors and freezes do sound like a hardware issue. Proxmox ran/runs fine with a 2700X on an X470 motherboard with 4 DIMMs.
The original Zen1 CPUs had issues with C-states; maybe disable those in the BIOS? Also update to the latest (or most stable) BIOS version for your motherboard.
The original 300 series motherboards were also not the highest quality, as Ryzen was still in a value-oriented phase. Make sure not to stress or overclock it, and maybe try with PBO disabled to put less stress on the VRMs?
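After changing the C-state setting in the BIOS, one way to see what the host kernel still exposes is to read the Linux cpuidle entries under `/sys/devices/system/cpu`. The Python sketch below does that for one CPU; the function name `list_idle_states` is hypothetical, and which states remain after the BIOS change depends on the firmware, so treat the output as a hint rather than proof.

```python
from pathlib import Path


def list_idle_states(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Return (state directory, state name, usage count) for one CPU.

    Reads the standard Linux cpuidle sysfs files. An empty list means the
    kernel exposes no cpuidle states for that CPU (or the path is missing).
    """
    base = Path(sysfs_root) / f"cpu{cpu}" / "cpuidle"
    # Sort numerically so state10 would come after state2.
    state_dirs = sorted(base.glob("state[0-9]*"), key=lambda p: int(p.name[5:]))
    states = []
    for state_dir in state_dirs:
        name = (state_dir / "name").read_text().strip()
        usage = int((state_dir / "usage").read_text().strip())
        states.append((state_dir.name, name, usage))
    return states


if __name__ == "__main__":
    for state, name, usage in list_idle_states(0):
        print(f"{state}: {name} (entered {usage} times)")
```

If deeper states are still listed with growing usage counts after the BIOS change, the setting may not have been applied the way you expected.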
 
Ok, I disabled C-State and PBO in the BIOS and will test it. I'll let you know if it helped. Thanks for now.
 
I had these issues with a first and, to a lesser degree, second gen Ryzen. Fiddling with C-states and different kernels for the guest OS eventually solved it. Most of the issues I had were caused by a VM running with the host CPU option. Certain Linux kernels caused this problem, and using different ones for the guest OS eventually solved the issue. Technically the failing kernels were newer and should have supported all Ryzen C-state features, and I never quite figured out whether the issue was related to a bad implementation in the kernel or some kind of conflict between the Proxmox kernel and the VM's kernel.

TLDR: Disable C-states in BIOS and don't set the CPU to host unless you have to. Be aware that C-state issues can be caused directly by the kernel of the guest OS if you use CPU type host.
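To find out which VMs are set to CPU type host, a rough Python sketch like the one below can scan the VM config files. It assumes the configs are in `/etc/pve/qemu-server/` as `<vmid>.conf` files with a line such as `cpu: host`; the function name `vms_using_host_cpu` is made up for this example. It only reads the current config and stops at the first `[...]` section, since snapshot sections carry their own copies of the settings.

```python
from pathlib import Path


def vms_using_host_cpu(conf_dir="/etc/pve/qemu-server"):
    """Return the VM IDs whose current config sets the CPU type to host."""
    hits = []
    for conf in sorted(Path(conf_dir).glob("*.conf")):
        for line in conf.read_text().splitlines():
            if line.startswith("["):
                break  # snapshot sections follow; ignore them
            key, _, value = line.partition(":")
            if key.strip() != "cpu":
                continue
            # The value may look like "host", "host,flags=..." or
            # "cputype=host"; only the first field is the CPU type.
            cpu_type = value.strip().split(",")[0]
            if cpu_type.startswith("cputype="):
                cpu_type = cpu_type[len("cputype="):]
            if cpu_type == "host":
                hits.append(conf.stem)
            break  # only one cpu line is expected per section
    return hits


if __name__ == "__main__":
    print(vms_using_host_cpu())
```

A VM with no `cpu:` line at all uses the default CPU type and is not reported.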
 
Then please try kernel 6.11, and if that does not improve the situation, 6.5.
 
So far, I have disabled C-State and PBO in the BIOS as leesteken suggested, and it seems to be working. I left the server running overnight, and when I returned to work, the server with the virtual machine was still running and continues to work. Thank you all for your help.
 
