Help Troubleshooting Frozen/Crashed VMs

jarodmerle

New Member
Aug 16, 2022
I am still new to Proxmox, but have been trying it out on my home network for my fairly simple needs. For the most part, it has worked very well for running the following three VMs:

  1. OPNSense (FreeBSD) for all my routing/firewall needs
  2. An "internal" VM server running Ubuntu 22.04
  3. An "external" VM server also running Ubuntu 22.04

Over the past week, however, I have seen my OPNSense VM freeze/hang entirely twice (in the middle of the day, while working from home), and my "internal" Linux VM freeze once (in the middle of the night, when there was no activity whatsoever). In all cases, Proxmox itself was still up and running with no issues, but I could not shut down or reboot the VMs, and their consoles were unresponsive. My only recourse seems to be rebooting the Proxmox server itself. Things I've done thus far:

  • I saw some threads (like this one) where people were experiencing some similar things when migrating VMs from one node to another, but that's definitely not what I'm doing. Still, I followed the recommendation on this bug to roll back to the 5.13 kernel, but the problem has persisted.
  • I have also seen some mentions about not over-allocating CPU cores, which I was originally doing (passing all 4 cores through as kvm64 to all three VMs), but CPU usage doesn't seem to correlate with when it's failing, and never really gets much beyond 30% for the entire node the VMs sit on. I did go ahead and try backing my OPNSense VM down to two cores, and the other two down to just 1 core each to see if that makes any difference over the next few days.
  • I checked the Syslog on the node, and there are no entries at all around the time my OPNSense VM failed today.
  • No logs around the time of failure within the OPNSense VM itself either.
  • The screenshot attached is what my OPNSense VM's console showed at the time it failed today. I couldn't scroll up or really do anything to see more than this, and none of what it says means anything to me, but maybe it will to someone.
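Since the Syslog view showed nothing around the failure, one option is to query the host journal directly for a window around the freeze. The following is only a sketch: the helper name `journal_window_cmd` is made up for illustration, the timestamps are placeholders based on the screenshot's date, and the function just prints the command so it can be reviewed before running it on the node.

```shell
# Hypothetical helper: print (do not run) a journalctl command that
# covers a time window around a suspected VM freeze.
journal_window_cmd() {
    from="$1"   # window start, e.g. "2022-08-15 11:30" (placeholder)
    to="$2"     # window end,   e.g. "2022-08-15 12:30" (placeholder)
    printf 'journalctl --since "%s" --until "%s" --no-pager\n' "$from" "$to"
}

# Example: one hour around the time shown in the attached screenshot's name.
journal_window_cmd "2022-08-15 11:30" "2022-08-15 12:30"
```

Adding `-k` limits the output to kernel messages, and `-b -1` selects the previous boot, which may help after the host has been rebooted (assuming the journal is stored persistently on the node).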
I guess what I'm looking for more than anything is just advice on ways to further troubleshoot the issue (assuming backing down the cores assigned doesn't fix it). I'm enjoying tinkering with Proxmox a lot, and really hoped it would be stable enough to host a (mostly) full-time router/firewall, but maybe it just fundamentally disagrees with the very basic hardware I have it running on too (this micro firewall appliance with a Celeron N5105, with 32 GB of no-name RAM, and a decently nice 500 GB NVMe SSD)?
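Before rebooting the whole host, it may be worth checking a hung guest from the node's shell with the `qm` tool. This is a rough sketch, not a verified fix for this freeze: the helper name `hung_vm_steps` and the VM ID 100 are assumptions for illustration, and the function only prints the commands rather than executing them.

```shell
# Hypothetical dry-run helper: list commands one might try on a hung VM.
# It only prints them, so nothing is stopped by accident.
hung_vm_steps() {
    vmid="$1"   # numeric VM ID as shown in the Proxmox GUI (100 is a placeholder)
    echo "qm status $vmid --verbose"   # what Proxmox/QEMU reports for the guest
    echo "qm stop $vmid"               # hard stop, like pulling the power plug
}

hung_vm_steps 100
```

If `qm stop` also hangs, that in itself is useful information to include in a report, since it suggests the QEMU process is stuck rather than just the guest OS.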

PS: Since this seems to be a common request, here's the output of pveversion -v:

Bash:
root@home:~# pveversion -v
proxmox-ve: 7.2-1 (running kernel: 5.13.19-6-pve)
pve-manager: 7.2-7 (running version: 7.2-7/d0dd0e85)
pve-kernel-5.15: 7.2-8
pve-kernel-helper: 7.2-8
pve-kernel-5.13: 7.1-9
pve-kernel-5.15.39-3-pve: 5.15.39-3
pve-kernel-5.15.39-2-pve: 5.15.39-2
pve-kernel-5.15.30-2-pve: 5.15.30-3
pve-kernel-5.13.19-6-pve: 5.13.19-15
ceph-fuse: 15.2.16-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-2
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-3
libpve-storage-perl: 7.2-7
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.5-1
proxmox-backup-file-restore: 2.2.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-2
pve-container: 4.2-2
pve-docs: 7.2-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.5-1
pve-ha-manager: 3.4.0
pve-i18n: 2.7-2
pve-qemu-kvm: 6.2.0-11
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-3
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.5-pve1
 

Attachments

  • 2022-08-15 12_04_59-home - Proxmox Virtual Environment.png
Well, at least I'm not the only one then :). As many different searches as I've done, I can't believe I never stumbled on it, but maybe it's new enough that Google searches aren't picking it up either. Thank you for pointing this out!

This does remind me that the first time I saw it happen, the console for my OPNSense VM did show a seemingly endless stream of output that seemed to indicate a kernel panic. I guess I will hang tight for a while and hope some solution gets proposed.
 
I have also seen some mentions about not over-allocating CPU cores, which I was originally doing (passing all 4 cores through as kvm64 to all three VMs), but CPU usage doesn't seem to correlate with when it's failing, and never really gets much beyond 30% for the entire node the VMs sit on. I did go ahead and try backing my OPNSense VM down to two cores, and the other two down to just 1 core each to see if that makes any difference over the next few days.
CPU overprovisioning isn't a problem as long as the guests are idling most of the time and you don't go too far with it. For example, 90+ vCPUs are working fine here, usually with just 7% CPU utilization on a 16-core CPU.
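To put numbers on that, the overcommit ratio is simply the total vCPUs assigned divided by the physical cores. A small sketch using the figures from this thread (the function name `overcommit_ratio` is just for illustration, and 90 is taken as the lower bound of "90+"):

```shell
# Hypothetical helper: vCPU overcommit ratio = assigned vCPUs / physical cores.
overcommit_ratio() {
    # $1 = total vCPUs assigned across all guests, $2 = physical cores on the host
    awk -v v="$1" -v c="$2" 'BEGIN { printf "%.3f\n", v / c }'
}

overcommit_ratio 12 4    # original setup: 3 VMs x 4 vCPUs on 4 cores -> 3.000
overcommit_ratio 4 4     # after reducing to 2 + 1 + 1 vCPUs          -> 1.000
overcommit_ratio 90 16   # the 16-core host mentioned above           -> 5.625
```

So the original 3:1 ratio was well below the roughly 5.6:1 that runs fine on the other host, which fits with the observation that CPU usage did not correlate with the freezes.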
I'm enjoying tinkering with Proxmox a lot, and really hoped it would be stable enough to host a (mostly) full-time router/firewall, but maybe it just fundamentally disagrees with the very basic hardware I have it running on too (this micro firewall appliance with a Celeron N5105, with 32 GB of no-name RAM, and a decently nice 500 GB NVMe SSD)?
Over the last few days, a lot of people have been complaining about crashing VMs. They think it's a problem with the N5105 CPU. See for example here: https://forum.proxmox.com/threads/vm-freezes-irregularly.111494/
 