HA-Status in error state after backup

Knuuut · Feb 12, 2019

Hello Community,

After backing up about 120 VMs in a 4-node cluster to NFS (using the Proxmox backup function), a few VMs (~20) show HA in an error state. The affected VMs are still running fine, and there was no trouble at all while the backup ran. All VMs are managed by HA.

Here are the syslog messages for one of the affected VMs, as an example:

Code:

Feb 11 22:28:23 pmve-1 pve-ha-lrm[2989951]: VM 252 qmp command failed - VM 252 qmp command 'query-status' failed - got timeout
Feb 11 22:28:23 pmve-1 pve-ha-lrm[2989951]: VM 252 qmp command 'query-status' failed - got timeout
Feb 11 22:28:39 pmve-1 pve-ha-lrm[2990105]: service vm:252 is in an error state and needs manual intervention. Look up 'ERROR RECOVERY' in the documentation.
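
With ~20 affected VMs, it can help to pull the VM IDs out of syslog rather than reading it by hand. The following is only a minimal sketch: it assumes the `pve-ha-lrm` messages look exactly like the three lines quoted above, and the function name `summarize` and the `SAMPLE` list are made up for illustration. It is not part of any Proxmox tooling.

```python
import re

# Patterns derived from the pve-ha-lrm lines quoted in this post (assumption:
# other PVE versions may word these messages differently).
ERROR_RE = re.compile(
    r"pve-ha-lrm\[\d+\]: service vm:(\d+) is in an error state"
)
TIMEOUT_RE = re.compile(
    r"pve-ha-lrm\[\d+\]: .*VM (\d+) qmp command '([^']+)' failed - got timeout"
)


def summarize(lines):
    """Return (VM IDs in HA error state, VM IDs with qmp timeouts), both sorted."""
    errors = set()
    timeouts = set()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            errors.add(int(match.group(1)))
            continue
        match = TIMEOUT_RE.search(line)
        if match:
            timeouts.add(int(match.group(1)))
    return sorted(errors), sorted(timeouts)


# The three syslog lines from the post, used as sample input.
SAMPLE = [
    "Feb 11 22:28:23 pmve-1 pve-ha-lrm[2989951]: VM 252 qmp command failed - "
    "VM 252 qmp command 'query-status' failed - got timeout",
    "Feb 11 22:28:23 pmve-1 pve-ha-lrm[2989951]: VM 252 qmp command "
    "'query-status' failed - got timeout",
    "Feb 11 22:28:39 pmve-1 pve-ha-lrm[2990105]: service vm:252 is in an error "
    "state and needs manual intervention. Look up 'ERROR RECOVERY' in the "
    "documentation.",
]

if __name__ == "__main__":
    in_error, timed_out = summarize(SAMPLE)
    print("HA error state:", in_error)
    print("qmp timeouts:  ", timed_out)
```

To run it against a real log, pass the lines of the syslog file to `summarize` instead of `SAMPLE`. Comparing the two lists shows whether every VM in error state also had a `query-status` timeout during the backup window.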

After removing the VM from the HA group and re-adding it, the error was gone.

All VMs have got a running qemu-agent.

PMVE-Version 5.1-36

How can we avoid getting into the HA error state?

Cheers Knuuut

Knuuut · Feb 13, 2019

No suggestions? Even a hint where to start may be useful.
Today we had almost the same situation, but with fewer VMs in error state.

There is also a thread about the same error from someone else in the German forum:

https://forum.proxmox.com/threads/backup-ha-error.51543

Cheers Knuuut

tom · Feb 13, 2019

Knuuut said:
No suggestions? Even a hint where to start may be useful.

Test/Use latest version only.

https://pve.proxmox.com/wiki/Downlo...Proxmox_Virtual_Environment_5.x_to_latest_5.3

Knuuut · Feb 13, 2019

tom said:
Test/Use latest version only.

I don't think this is useful.

Another user, same problem, latest version:

jms1000 said:
I have a cluster with 4 Proxmox servers, each on the current and identical version

Is there a possibility to adjust timeout values for HA?

tom · Feb 13, 2019

Knuuut said:
I don't think this is useful.

We only support the current version, as it makes no sense to hunt for issues that are already fixed.

So it is useful to use the current version, especially since the HA stack got many improvements recently.
