VM in error state - how to troubleshoot

vernimmen

Member
Apr 15, 2020
Hi,

I'm running a Proxmox VE 5.4 cluster with 18 nodes.
On multiple occasions I've had VMs show up in 'HA error' state. To fix this I set the VMs to ignored state, waited a minute for the error to clear, and then started them. They then start fine, without any problems.
The VM is a member of an HA group with these settings:
- restricted: yes
- selected nodes: pve-04
- priority: 1
- all other nodes are not selected and do not have a priority.
This is intentional because the VM is using local storage and should not move or be moved when the hypervisor goes down or is rebooted.
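
For reference, a group like the one described above could probably be created from the CLI with `ha-manager groupadd`. The small Python sketch below only builds and prints the command, it does not run it. The group name `local-pve-04` is hypothetical (the post does not name the group), so adjust it to your setup.

```python
import shlex
from typing import Dict, List


def groupadd_command(group: str, node_priorities: Dict[str, int],
                     restricted: bool = True) -> List[str]:
    """Build the argv for 'ha-manager groupadd' (the command is not executed)."""
    # Nodes are passed as a comma-separated list of node:priority pairs.
    nodes = ",".join(f"{node}:{prio}" for node, prio in node_priorities.items())
    return ["ha-manager", "groupadd", group,
            "--nodes", nodes,
            "--restricted", "1" if restricted else "0"]


# Hypothetical group name; only pve-04 is selected, with priority 1.
cmd = groupadd_command("local-pve-04", {"pve-04": 1})
print(shlex.join(cmd))
```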

This error state has happened a few times now and I wish to figure out why this is happening and how to fix it. But I can't find any information about the VM entering the HA error state in any logs. Could someone please point me to the log in which I can see that the VM entered an error state? And possibly a log that shows why it went into that state?

Thank you for any help you can provide,


Max
 
Hi,

The error state occurs when a service cannot be migrated, so the HA manager cannot start it.
A slow reboot or a shutdown of the node will trigger this, because you have restricted the service to that node.
To fix this, you can set the ha setting in datacenter.cfg [1] to "freeze".
Note that this setting is cluster-wide.

[1] https://pve.proxmox.com/wiki/Manual:_datacenter.cfg
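
As a rough illustration of the setting mentioned above: in recent versions of the datacenter.cfg manual the relevant line is `ha: shutdown_policy=freeze`. That key name is an assumption for this thread, since the reply does not spell it out, and it may not be available on 5.4, so check the manual for your version first. The sketch below only edits a config string in memory. The helper name is hypothetical, and on a real cluster you would edit /etc/pve/datacenter.cfg by hand or through the GUI.

```python
def set_ha_shutdown_policy(cfg_text: str, policy: str = "freeze") -> str:
    """Return cfg_text with its 'ha:' line set to the given shutdown policy.

    Assumes the 'ha:' line carries only the shutdown_policy property.
    """
    new_line = f"ha: shutdown_policy={policy}"
    out = []
    replaced = False
    for line in cfg_text.splitlines():
        if line.startswith("ha:"):
            # Overwrite the existing ha line instead of adding a second one.
            out.append(new_line)
            replaced = True
        else:
            out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"


# Example with a minimal, made-up datacenter.cfg content.
print(set_ha_shutdown_policy("keyboard: en-us\n"), end="")
```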
 
Hi Wolfgang,

Thank you for your insights. If I understand you correctly, Proxmox will put VMs into error state if it cannot migrate them, even if the HA policy says the VM should not be migrated. Is that correct?

Besides the freeze setting, is there a different way to tell proxmox that this specific VM does not need to be migrated, and can be started once the hypervisor comes back online?
Otherwise, perhaps we can work around this by first shutting down the VM before shutting down the hypervisor?

Thank you!
 
"Even if the HA policy says the VM should not be migrated. Is that correct?"
Correct.

Yes, you can shut down the VM before shutting down the node, and that also avoids the problem.
But you must then start it again yourself after the reboot.
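
A minimal sketch of that workaround, assuming the VM is managed by HA and its requested state is changed with `ha-manager set`. The VMID 100 is hypothetical, and the code only prints the commands rather than running them.

```python
from typing import List

# Requested states accepted here; 'ignored' is the one used earlier in this thread.
ALLOWED_STATES = {"started", "stopped", "disabled", "ignored"}


def ha_set_state(vmid: int, state: str) -> List[str]:
    """Build the argv for 'ha-manager set vm:<vmid> --state <state>'."""
    if state not in ALLOWED_STATES:
        raise ValueError(f"unsupported HA state: {state}")
    return ["ha-manager", "set", f"vm:{vmid}", "--state", state]


# Before rebooting the node: ask HA to stop the VM.
print(" ".join(ha_set_state(100, "stopped")))
# After the node is back: ask HA to start it again.
print(" ".join(ha_set_state(100, "started")))
```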
 
Thank you Wolfgang, I understand now.
Would Proxmox consider adding a feature that makes it possible to mark a VM as 'local' (no need to migrate it when the hypervisor goes down), so that it does not show up in error state after a hypervisor reboot?
 