I am still new to Proxmox, but have been trying it out on my home network for my very simplistic needs. For the most part, it has worked very well for running the following three VMs:
- OPNSense (FreeBSD) for all my routing/firewall needs
- An "internal" VM server running Ubuntu 22.04
- An "external" VM server also running Ubuntu 22.04
Over the past week, however, my OPNSense VM has frozen/hung entirely twice (in the middle of the day, while I was working from home), and my "internal" Linux VM has frozen once (in the middle of the night, when there was no activity whatsoever). In every case, Proxmox itself remained up and running with no issues, but I could not shut down or reboot the affected VM, and its console was unresponsive. My only recourse so far has been rebooting the Proxmox server itself. Things I've tried thus far:
- I saw some threads (like this one) where people experienced similar symptoms when migrating VMs from one node to another, but that's definitely not what I'm doing. Still, I followed the recommendation in this bug to roll back to the 5.13 kernel, and the problem has persisted.
- I have also seen mentions of not over-allocating CPU cores, which I was originally doing (passing all 4 cores through as kvm64 to all three VMs). However, CPU usage doesn't seem to correlate with the failures, and never gets much beyond 30% for the entire node the VMs sit on. I have gone ahead and backed my OPNSense VM down to two cores, and the other two VMs down to just 1 core each, to see whether that makes any difference over the next few days.
- I checked the Syslog on the node, and there are no entries at all around the time my OPNSense VM failed today.
- No logs around the time of failure within the OPNSense VM itself either.
- The attached screenshot shows what my OPNSense VM's console displayed when it failed today. I couldn't scroll up or do anything else to see more than this, and none of what it says means anything to me, but maybe it will to someone.
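In case it helps with suggestions, this is roughly what I'm planning to run the next time a VM hangs, to capture state and recover without rebooting the whole host (assuming VMID 100 for the OPNSense VM; adjust to your own VMID):

```shell
# Ask Proxmox what it thinks the VM is doing; --verbose includes
# the QMP/guest run state, not just "running"
qm status 100 --verbose

# Check host-side logs around the failure window for kvm/qemu errors
journalctl --since "1 hour ago" | grep -iE 'kvm|qemu|oom'

# Open the QEMU monitor for the VM and type "info status" to see
# whether QEMU considers the guest running, paused, or in I/O error
qm monitor 100

# If "qm stop 100" hangs too, kill just this VM's KVM process as a
# last resort instead of rebooting the whole node (the PID file path
# is the standard Proxmox VE location)
kill -9 "$(cat /var/run/qemu-server/100.pid)"
```

These are standard Proxmox VE / systemd commands as far as I can tell, but I'd welcome corrections if there's a better way to pull diagnostics off a hung guest.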
PS: Since this seems to be a common request, here's the output of pveversion -v:
Bash:
root@home:~# pveversion -v
proxmox-ve: 7.2-1 (running kernel: 5.13.19-6-pve)
pve-manager: 7.2-7 (running version: 7.2-7/d0dd0e85)
pve-kernel-5.15: 7.2-8
pve-kernel-helper: 7.2-8
pve-kernel-5.13: 7.1-9
pve-kernel-5.15.39-3-pve: 5.15.39-3
pve-kernel-5.15.39-2-pve: 5.15.39-2
pve-kernel-5.15.30-2-pve: 5.15.30-3
pve-kernel-5.13.19-6-pve: 5.13.19-15
ceph-fuse: 15.2.16-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-2
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-3
libpve-storage-perl: 7.2-7
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.5-1
proxmox-backup-file-restore: 2.2.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-2
pve-container: 4.2-2
pve-docs: 7.2-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.5-1
pve-ha-manager: 3.4.0
pve-i18n: 2.7-2
pve-qemu-kvm: 6.2.0-11
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-3
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.5-pve1