Proxmox freezes when CPU under low load conditions

simon098

Jan 8, 2025
Hi,

I've been having this issue for a while now. When the CPU is under very low load (say, less than 1-2%), Proxmox freezes randomly. The journalctl output doesn't seem to show anything meaningful either.

I can run this for days without issue if I keep the CPU load a bit higher, say >3%. OPNsense is usually running at all times, although it doesn't use enough CPU to bring the load up.


I don't think it's a hardware issue like RAM or PSU, as that would show up more under higher loads.

CPU: Ryzen 9 6900HX
Memory: 32GB DDR5
Storage: SSD (Proxmox) + NVMe (VM Storage)

Proxmox Version: 8.3.2
Kernel Version: Linux 6.8.12-5-pve

Here are the latest journalctl logs from around the crash:

Code:
Jan 08 01:17:01 pve CRON[32326]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Jan 08 01:17:01 pve CRON[32327]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Jan 08 01:17:01 pve CRON[32326]: pam_unix(cron:session): session closed for user root
Jan 08 01:25:15 pve pvedaemon[1350]: <root@pam> successful auth for user 'simon@pve'
Jan 08 01:27:55 pve pveproxy[23068]: worker exit
Jan 08 01:27:55 pve pveproxy[1357]: worker 23068 finished
Jan 08 01:27:55 pve pveproxy[1357]: starting 1 worker(s)
Jan 08 01:27:55 pve pveproxy[1357]: worker 35146 started
Jan 08 01:40:15 pve pvedaemon[1349]: <root@pam> successful auth for user 'simon@pve'
Jan 08 01:49:44 pve smartd[924]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 71 to 72
Jan 08 01:50:11 pve pveproxy[21244]: worker exit
Jan 08 01:50:11 pve pveproxy[1357]: worker 21244 finished
Jan 08 01:50:11 pve pveproxy[1357]: starting 1 worker(s)
Jan 08 01:50:11 pve pveproxy[1357]: worker 40956 started
Jan 08 01:55:16 pve pvedaemon[1350]: <root@pam> successful auth for user 'simon@pve'
Jan 08 02:02:21 pve systemd[1]: Starting pve-daily-update.service - Daily PVE download activities...
Jan 08 02:02:22 pve pveupdate[44121]: <root@pam> starting task UPID:pve:0000AC5E:000EE6B9:677DDCAE:aptupdate::root@pam:
Jan 08 02:02:23 pve pveupdate[44126]: update new package list: /var/lib/pve-manager/pkgupdates
Jan 08 02:02:24 pve pveupdate[44121]: <root@pam> end task UPID:pve:0000AC5E:000EE6B9:677DDCAE:aptupdate::root@pam: OK
Jan 08 02:02:24 pve systemd[1]: pve-daily-update.service: Deactivated successfully.
Jan 08 02:02:24 pve systemd[1]: Finished pve-daily-update.service - Daily PVE download activities.
Jan 08 02:02:24 pve systemd[1]: pve-daily-update.service: Consumed 2.261s CPU time.
Jan 08 02:10:47 pve pvedaemon[1351]: <root@pam> successful auth for user 'simon@pve'
Jan 08 02:13:26 pve pveproxy[25464]: worker exit
Jan 08 02:13:26 pve pveproxy[1357]: worker 25464 finished
Jan 08 02:13:26 pve pveproxy[1357]: starting 1 worker(s)
Jan 08 02:13:26 pve pveproxy[1357]: worker 47405 started
Jan 08 02:17:01 pve CRON[48323]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Jan 08 02:17:01 pve CRON[48324]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Jan 08 02:17:01 pve CRON[48323]: pam_unix(cron:session): session closed for user root
-- Boot 2e8e62781a5d4069a5670f524c44ad2d --
Jan 08 09:32:18 pve kernel: Linux version 6.8.12-5-pve (build@proxmox) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC PMX 6.8.12-5 (2024-12-03T10:26Z) ()
Jan 08 09:32:18 pve kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.8.12-5-pve root=/dev/mapper/pve-root ro quiet
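
The log above simply stops at 02:17:01 and resumes with the next boot at 09:32:18, with no error in between. As a rough way to spot such silent gaps in a longer export, here is a minimal Python sketch that pairs the last entry before each `-- Boot` marker with the first entry after it. It assumes the default short journalctl timestamp format shown above; the function name and the `year` argument are illustrative (the short format carries no year, so 2025 is assumed from the post date).

```python
from datetime import datetime

BOOT_MARKER = "-- Boot "  # separator journalctl prints between boots


def reboot_gaps(lines, year):
    """Return (last_entry_before_boot, first_entry_after_boot) timestamp pairs."""
    gaps = []
    prev_ts = None   # timestamp of the most recent parsed entry
    pending = None   # last timestamp seen before a boot marker
    for line in lines:
        if line.startswith(BOOT_MARKER):
            pending = prev_ts
            continue
        try:
            # Short format starts with e.g. "Jan 08 02:17:01" (15 characters).
            ts = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if pending is not None:
            gaps.append((pending, ts))
            pending = None
        prev_ts = ts
    return gaps


if __name__ == "__main__":
    sample = [
        "Jan 08 02:17:01 pve CRON[48323]: pam_unix(cron:session): session closed for user root",
        "-- Boot 2e8e62781a5d4069a5670f524c44ad2d --",
        "Jan 08 09:32:18 pve kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.8.12-5-pve root=/dev/mapper/pve-root ro quiet",
    ]
    for before, after in reboot_gaps(sample, year=2025):
        print(f"last entry {before}, next boot {after}, gap {after - before}")
```

The gap only bounds when the freeze happened: the host may have hung at any point after the last entry, and the manual reset time determines when the next boot starts.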


Does anyone have any ideas? I was thinking it might have something to do with C-states, but I can't see any setting for them in the BIOS, although the BIOS on this machine exposes very few options...
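
When the BIOS shows nothing about C-states, the running kernel can still be asked which idle states it exposes through the cpuidle sysfs interface. The following is a minimal Python sketch, assuming the standard `/sys/devices/system/cpu/cpu0/cpuidle` layout and that cpu0 is representative of the other cores; the function name is illustrative.

```python
from pathlib import Path


def list_cstates(cpuidle_dir="/sys/devices/system/cpu/cpu0/cpuidle"):
    """Return (state, name, usage, disabled) for each idle state the kernel exposes."""
    base = Path(cpuidle_dir)
    states = []
    if not base.is_dir():
        return states  # no cpuidle driver loaded, or not a Linux host
    # Sort numerically so that state10 comes after state9.
    for state_dir in sorted(base.glob("state[0-9]*"), key=lambda p: int(p.name[5:])):
        name = (state_dir / "name").read_text().strip()
        usage = int((state_dir / "usage").read_text().strip())  # times entered
        # "disable" is the kernel's per-state software switch, not the BIOS setting.
        disabled = (state_dir / "disable").read_text().strip() == "1"
        states.append((state_dir.name, name, usage, disabled))
    return states


if __name__ == "__main__":
    for state, name, usage, disabled in list_cstates():
        flag = "disabled" if disabled else "enabled"
        print(f"{state}: {name:<10} entered {usage} times ({flag})")
```

If deeper states such as C2 or C3 appear with a growing usage count, the CPU is entering them while idle, even if the BIOS menu has no visible option for them.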
 
So far so good. Normally it would have frozen within 4-5 hours. It has now been running for a bit more than 24 hours without freezing.

I did find an option to disable Global C-states in the BIOS, and I disabled that too. I will leave it for a few days and see if it crashes; if it doesn't, I'll re-enable the option in the BIOS and see whether the GRUB kernel parameter processor.max_cstate=1 works. :cool:
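
For the GRUB test mentioned above, the kernel parameter is spelled `processor.max_cstate` in the kernel documentation, and on a GRUB-booted install it is normally added to `GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub`. The following is a minimal Python sketch of that edit as a dry run on the file's text. The helper name is illustrative, and the sample line assumes the stock `"quiet"` default, which matches the `ro quiet` command line in the log above.

```python
import re

# Matches the GRUB_CMDLINE_LINUX_DEFAULT="..." line of /etc/default/grub.
_CMDLINE_RE = re.compile(r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE)


def add_kernel_param(grub_text, param):
    """Return grub_text with param set in GRUB_CMDLINE_LINUX_DEFAULT."""
    key = param.split("=", 1)[0]

    def repl(match):
        # Drop any existing value for the same key, then append the new one.
        params = [p for p in match.group(2).split() if p.split("=", 1)[0] != key]
        params.append(param)
        return match.group(1) + " ".join(params) + match.group(3)

    new_text, count = _CMDLINE_RE.subn(repl, grub_text, count=1)
    if count == 0:
        raise ValueError("GRUB_CMDLINE_LINUX_DEFAULT line not found")
    return new_text


if __name__ == "__main__":
    sample = 'GRUB_CMDLINE_LINUX_DEFAULT="quiet"\n'
    print(add_kernel_param(sample, "processor.max_cstate=1"), end="")
    # prints: GRUB_CMDLINE_LINUX_DEFAULT="quiet processor.max_cstate=1"
```

After the real file is changed, `update-grub` and a reboot are needed for the parameter to take effect, and `cat /proc/cmdline` shows whether it was applied. Installs that boot through systemd-boot keep the command line in `/etc/kernel/cmdline` instead, so this sketch would not apply there.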

P.S. Also, interestingly... since disabling the C-States, not only does it seem more reliable, it also seems to be consuming less power!
 
C-States on AMD hardware should be a sticky. They are flat-out broken on Linux, and AMD and their board manufacturers refuse to fix the problem properly and point fingers instead. They are broken on Windows as well, although in many cases Microsoft has forced AMD's hand to ship a fix through Windows Update.

There may be some fixes, but they require coordinated CPU and motherboard firmware updates, which are not always available. AMD just doesn’t care about Linux. On servers, yeah, but on desktop, no.
 