[SOLVED] Strange Proxmox crash/freeze

Zwickor

New Member
Dec 17, 2024
Hi, I am running Proxmox on an HP Elitedesk G5 and sadly I have some issues. After a few hours of use, at random, I can no longer access the web GUI. Sadly, SSH access won't work either! So it seems Proxmox has crashed. The Proxmox host is still pingable, and running VMs are still accessible and pingable. A few minutes later the Windows 11 VM "crashed" as well, while the other LXC containers kept working.

The logs don't show anything, because once Proxmox becomes inaccessible, nothing more is written to them.

I don't know what to do. I was running Proxmox on an M720q before without any problems, and of course I set everything up from scratch on the new machine. There are no errors in the logs before the crashes or on boot. Any ideas?
 
Hi,
sounds like your HP Elitedesk G5 is having an issue. Refurbished hardware?

Sadly, SSH access won't work either!
What is your network setup?
cat /etc/network/interfaces

I don't know what to do.

Connect a local keyboard and screen and log in manually.
Check why sshd is no longer available.
Check your network stack.
Restart your network stack.
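
A minimal sketch of those checks, run from the local console. It assumes a standard Proxmox/Debian install where the SSH unit is named ssh and ifupdown2 provides ifreload. The function name console_checks is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: basic SSH and network checks from the local console.
console_checks() {
    echo "== sshd =="
    # Is the SSH daemon running, and is anything listening on port 22?
    if command -v systemctl >/dev/null 2>&1; then
        systemctl status ssh --no-pager 2>&1 | head -n 5 || true
    fi
    if command -v ss >/dev/null 2>&1; then
        ss -tln 2>/dev/null | grep ':22 ' || echo "nothing listening on port 22"
    fi

    echo "== network =="
    # Link state and addresses of the NIC (eno1) and the bridge (vmbr0).
    if command -v ip >/dev/null 2>&1; then
        ip -br link || true
        ip -br addr || true
    fi
}

console_checks

# To reload the network configuration afterwards (assumes ifupdown2):
#   ifreload -a
```

If the host still answers pings but sshd shows as failed or hung, the network stack itself is probably fine and the problem lies elsewhere, for example in storage.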
 
Code:
auto lo
iface lo inet loopback

iface eno1 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.6.15/24
        gateway 192.168.6.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0


source /etc/network/interfaces.d/*

Checking with a local screen is a good idea; I will try that at the next "freeze". Something that concerns me is that nothing is written to the log files.

And yes, it is refurbished hardware.
 
Something that concerns me is that nothing is written to the log files.
Can you watch journalctl -f -p warning before and during the freeze? Use a separate window (or a screen/tmux session via SSH) for this.
 
After further investigation, I discovered that the system goes into read-only mode. This also explains why no more logs are written and why the VMs crash one after another. Upon further research, I came across THIS thread. Setting ASPM=off solves the problem but greatly increases the idle power consumption. I have now replaced the NVMe drive, and everything works as desired.
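
For anyone hitting the same symptom, here is a rough sketch for confirming it. It assumes that "read-only mode" means the root filesystem was remounted read-only. The function name root_mount_mode is made up for this example.

```shell
#!/bin/sh
# Print "ro" or "rw" for the root filesystem from a mounts table in
# /proc/mounts format: device mountpoint fstype options dump pass.
# root_mount_mode is a hypothetical helper name.
root_mount_mode() {
    awk '$2 == "/" {
             n = split($4, opts, ",")
             for (i = 1; i <= n; i++)
                 if (opts[i] == "ro" || opts[i] == "rw") mode = opts[i]
         }
         END { if (mode != "") print mode }' "${1:-/proc/mounts}"
}

if [ -r /proc/mounts ]; then
    echo "root filesystem is mounted: $(root_mount_mode)"
fi

# Kernel messages around the remount may show which device failed.
# dmesg reads the in-memory ring buffer, so it works on a read-only disk.
#   dmesg | grep -i -E 'nvme|remount|read-only'

# Current PCIe ASPM policy, if the kernel exposes it.
if [ -r /sys/module/pcie_aspm/parameters/policy ]; then
    cat /sys/module/pcie_aspm/parameters/policy
fi
```

The post does not say how ASPM was switched off. One common way, given here only as an assumption, is the kernel parameter pcie_aspm=off, added to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub and applied with update-grub on GRUB-booted hosts.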

There are probably occasional issues with specific combinations of Proxmox, motherboard/BIOS, and SSD. The SSD itself is not defective: in the previous Proxmox system it ran without any problems.
 