VMs freezing or crashing

oculos

Jun 12, 2024
Over the last three months of using Proxmox (my cluster has four nodes), some of my VMs have been crashing at random. There doesn't seem to be a pattern to which VM crashes: sometimes it's my router (OPNsense), sometimes a Fedora machine, sometimes an Ubuntu machine.

What can I do about this? I obviously need to set up some type of watchdog, but is this kind of behavior expected?

Sometimes there is an actual crash (I had a kernel panic with OPNsense), but most of the time the VM simply stops responding. I try to restart it and can't because of a lock, so I delete the conf file, and then I can start it again. This has happened with my OPNsense VM as well as with other VMs.

I am running versions 8.2.2 and 8.2.7, and it seems to happen across my nodes.
 
Hi,
do you see anything inside the host's system log around the time of the crashes? What kind of lock do you mean exactly? Do the crashes happen after some operation like backup/snapshot?
 
Hi @fiona
I'll try to check next time.
The lock I am referring to is the .conf file that prevents me from restarting the VM. The GUI says the VM is locked because of that file.
I'll give more details when it happens again.
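
One way to narrow down what kind of lock this is: a VM can carry a "lock:" entry in its configuration, and "qm config <ID>" would show it. The thread has not established that this is the lock in question, so the following is only a sketch under that assumption. The function names are made up for illustration; "qm config" and "qm unlock" are the real commands.

```shell
#!/bin/sh
# Sketch only: check whether a VM's configuration carries a "lock:" entry.
# Assumption: the lock reported by the GUI is such a config entry.

# Print the value of a "lock:" line from VM config text on stdin, if any.
lock_from_config() {
    sed -n 's/^lock:[[:space:]]*//p'
}

# Report the lock state of one VM (needs to run on a Proxmox VE node).
inspect_vm_lock() {
    vmid="$1"
    state=$(qm config "$vmid" | lock_from_config)
    if [ -n "$state" ]; then
        echo "VM $vmid has a config lock: $state"
        echo "It can be cleared with: qm unlock $vmid"
    else
        echo "VM $vmid has no lock entry in its configuration"
    fi
}

# Only act when a VM ID is passed as the single argument.
if [ "$#" -eq 1 ]; then
    inspect_vm_lock "$1"
fi
```

If a lock entry does show up, its value (for example the name of the operation that set it) is worth posting here, since it hints at which operation was interrupted.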
 
@fiona
Another VM crashed out of the blue.
I can't find anything in the logs; the VM just froze.
Do you have any idea what I could do to monitor this, or which log I should check?
 
I think I would migrate my VMs to another host. This sounds like impending hardware failure, which is usually preceded by odd freezes and crashes. If migrating isn't possible, you might start by replacing the system memory modules.
 
Please share the system logs/journal from around the time of the issue, as well as the VM configuration (qm config <ID>). What does the command qm status <ID> --verbose say after the freeze?
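
The three pieces of information requested above can be gathered in one go right after a freeze. This is a minimal sketch, not an official tool: the function names and the DRY_RUN switch are made up for illustration, and the VM ID and time window are placeholders you have to fill in yourself. It only uses journalctl with --since/--until plus the two qm commands already mentioned.

```shell
#!/bin/sh
# Sketch only: collect host journal, VM config and VM status after a freeze.
# Set DRY_RUN=1 to print the commands instead of executing them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        echo "### $*"
        "$@"
    fi
}

collect_vm_diagnostics() {
    vmid="$1"      # VM ID, e.g. the <ID> shown in the GUI
    from_ts="$2"   # start of the time window, e.g. "2024-06-12 03:00"
    to_ts="$3"     # end of the time window

    # Host journal around the time of the freeze.
    run journalctl --since "$from_ts" --until "$to_ts"
    # VM configuration as Proxmox VE sees it.
    run qm config "$vmid"
    # Detailed runtime status of the (frozen) VM.
    run qm status "$vmid" --verbose
}

# Only act when called with exactly: <ID> <from> <to>
if [ "$#" -eq 3 ]; then
    collect_vm_diagnostics "$1" "$2" "$3"
fi
```

Saved under a name of your choice and run on the affected node with the VM ID and a time window as arguments, the output can be redirected to a file and attached to a reply. Running it before restarting the VM matters, because the status output describes the frozen state.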