Proxmox Keeps Freezing every 2-3 days

Flock6659

New Member
Jan 15, 2026
Hi everyone,

I am running Proxmox VE 9.1.1 on a Beelink EQ14 (Intel Twin Lake N150, 16GB RAM).

I am experiencing random hard freezes every 2-3 days. The system becomes completely unresponsive and I'm looking for advice on how to debug this further, as the logs provide no clues.

**The Symptoms:**
- The system runs stable for about 48-72 hours, then hangs completely.
- Network is dead (cannot ping the host or any VMs).
- The physical network port LEDs show a blinking orange light, but the green light is off.
- HDMI output is completely black (no signal/console).
- Physical power button is unresponsive (short and long press does nothing).
- The only way to recover is to pull the power plug and reconnect.

**Diagnostics attempted:**
- Checked Syslog/Journalctl: There are NO errors or kernel panics prior to the freeze. The logs simply cut off at a random timestamp and resume after the hard reboot.
- Ran Memtest86+ and disk checks: All passed without errors.
- Downgraded the kernel to 6.14: same issue.

**Hardware:**
- Model: Beelink EQ14
- CPU: Intel N150 (Twin Lake)
- Kernel: Default PVE 9.1 kernel
- Storage: NVMe SSD

Has anyone experienced similar stability issues with the N150/EQ14 platform on PVE 9?

Any suggestions are welcome!
 
Thanks!

Morning! I have 2 VMs configured to use about 13 GB of the 16 GB available, so roughly 3 GB is left for Proxmox. CPU utilization is around 5% on average.
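For reference, a quick way to double-check that allocation from the host shell (the VMIDs 100/101 are just placeholders for my two VMs):
Code:
# Host memory headroom as seen by Proxmox
free -h
# Configured memory / ballooning per VM (replace 100/101 with your VMIDs)
qm config 100 | grep -E "^(memory|balloon)"
qm config 101 | grep -E "^(memory|balloon)"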
 
Additionally, you mention the logs do not show anything. Which logs are you referencing?

If you have not already, could you check `journalctl -p err..emerg` for critical errors? You can also add time-range options, e.g. journalctl --since "2015-01-10 17:15:00".
We could also look in /var/log/syslog and grep for keywords like power, ACPI, kernel panic, error, failed, critical.
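Putting those together, roughly (the date/time values are just placeholders to adjust):
Code:
# Critical-and-above messages from the previous boot (the one that froze)
journalctl -b -1 -p err..emerg --no-pager
# Same idea, narrowed to a time window shortly before the freeze
journalctl --since "2015-01-10 17:00:00" --until "2015-01-10 18:00:00" -p warning --no-pager
# Keyword sweep over syslog, if that file exists on your install
grep -iE "power|acpi|panic|error|failed|critical" /var/log/syslog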
 
Any chance you can try leaving the system running for that time cycle with one or both of the VMs off? I'm wondering if there is a hardware or power issue with the system.
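If it comes to that, shutting the guests down cleanly from the host shell is enough for the test (the VMIDs below are placeholders):
Code:
# See which VMs exist and their VMIDs
qm list
# Clean ACPI shutdown of each guest; use 'qm stop <vmid>' only if a guest refuses to power off
qm shutdown 100
qm shutdown 101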
At first I had only one VM running and was already facing this issue; I recently added a second VM. I can try to stop all VMs, but I would need to migrate to another system first, as my home automation depends on it :) and I can't bring it down for that long.
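For reference, a rough sketch of how I could move it via backup/restore rather than a live migration (the VMID and storage names are placeholders):
Code:
# On the current host: compressed backup while the VM keeps running
vzdump 100 --mode snapshot --compress zstd --dumpdir /var/lib/vz/dump
# Copy the resulting .vma.zst to the other host, then restore it there
qmrestore /var/lib/vz/dump/vzdump-qemu-100-<timestamp>.vma.zst 100 --storage local-lvm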

Additionally, you mention the logs do not show anything. Which logs are you referencing?

If you have not already, could you check `journalctl -p err..emerg` for critical errors? You can also add time-range options, e.g. journalctl --since "2015-01-10 17:15:00".
We could also look in /var/log/syslog and grep for keywords like power, ACPI, kernel panic, error, failed, critical.
I did
Code:
journalctl -b -1 --no-pager | tail -n 300

and

Code:
journalctl -b -1 --no-pager | egrep -i "oom|out of memory|hung|watchdog|nvme|i/o error|timeout|reset|pcie|igc|e1000|r8169|link down|link up|call trace|BUG" | tail -n 200

It didn't show any failure or error. The last log entries before the hang are different every time; e.g., last time it was a log rotation.

My system doesn't seem to have a /var/log/syslog file at all.
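That is apparently expected on newer PVE installs, which don't ship rsyslog by default and log only to the systemd journal. A quick check that the journal is persisted to disk, which `journalctl -b -1` needs in order to show anything from the boot that froze:
Code:
# Persistent journals live here; if only /run/log/journal exists, logs are lost on a hard reset
ls /var/log/journal
# List the boots the journal knows about; the frozen boot should show up as -1
journalctl --list-boots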
 
It sounds like it is locking up due to aggressive power management / a C-state issue. I would check for a new BIOS for that machine and I would update the CPU microcode. To do that you can enable the non-free-firmware component in the Debian repos (deb http://deb.debian.org/debian trixie main contrib non-free-firmware) and run the following (a fuller sketch, including adding the repo line, comes after):

Code:
apt update
apt install intel-microcode
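For reference, a minimal sketch of the whole sequence, assuming the repo line goes into a new one-line .list file under /etc/apt/sources.list.d/ (the file name is just an example; a deb822 .sources entry works too):
Code:
# Add the non-free-firmware component
echo "deb http://deb.debian.org/debian trixie main contrib non-free-firmware" > /etc/apt/sources.list.d/non-free-firmware.list
apt update
apt install intel-microcode
reboot
# After the reboot, confirm the new microcode was loaded early
journalctl -k | grep -i microcode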

If those don't help, you can go into the BIOS and disable C-states / deep sleep states, the package C-state limit, and ASPM (if present). Likewise, if the BIOS doesn't expose them, you can turn off C-states and ASPM via GRUB: edit `/etc/default/grub` and add "intel_idle.max_cstate=1 pcie_aspm=off" to the existing GRUB_CMDLINE_LINUX_DEFAULT line (a sketch of the edited line is shown after the commands below), save and exit, then apply

Code:
update-grub
reboot
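For reference, the edited line in /etc/default/grub could look like this (assuming the stock "quiet" default; keep whatever other options you already have):
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_idle.max_cstate=1 pcie_aspm=off"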

Then run for 72+ hours. If it stays stable, re-enable those options one at a time to isolate the culprit.
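Once it is back up, you can confirm the parameters actually took effect before starting the 72-hour clock (the second path assumes the intel_idle driver is in use):
Code:
# Kernel command line actually in use
cat /proc/cmdline
# Deepest C-state the intel_idle driver will use (should report 1)
cat /sys/module/intel_idle/parameters/max_cstate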