Hi All,
Today, when our three-node cluster restarted cold, I received the following email from ait3, one of the nodes:
Code:
The node 'ait3' failed and needs manual intervention.
The PVE HA manager tries to fence it and recover the
configured HA resources to a healthy node if possible.
Current fence status: FENCE
Try to fence node 'ait3'
All of the VMs on ait3 were in the fenced state.
The HA view in the GUI under "Datacenter" looked like the one in this thread: https://forum.proxmox.com/threads/master-old-timestamp-dead.26489/ with ait being the node marked "old timestamp dead?".
To resolve this, I made sure that the cluster was up and that quorum was present, and then I removed all of the VMs from HA. Once this was done, ait3 seemed to "un-fence" itself. I could then add all of the VMs back to the HA list, as well as start and stop them, something that I couldn't do before.
I did this after a lot of searching on the forums.
My question is: why did this happen? Also, is there a better way to resolve the issue? I suspect that I could have removed only the VMs marked "fence" from HA. Is that right?
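To make the narrower approach concrete, here is a minimal sketch that picks out only the services in the "fence" state and prints the matching `ha-manager remove` commands without running them. It assumes that `ha-manager status` prints service lines shaped like `service vm:100 (ait3, fence)`. The sample text, the VM IDs, and the node names ait1 and ait2 are made up for illustration. Whether removing only these services would have been enough is exactly what I am asking.

```python
import re

# Assumed shape of a service line in `ha-manager status` output:
#   service vm:100 (ait3, fence)
SERVICE_RE = re.compile(r"^service\s+(\S+)\s+\(([^,]+),\s*([^)]+)\)\s*$")


def fenced_services(status_text, node=None):
    """Return the service IDs whose state is 'fence', optionally for one node only."""
    found = []
    for line in status_text.splitlines():
        match = SERVICE_RE.match(line.strip())
        if not match:
            continue  # quorum/master/lrm lines and anything unexpected
        sid = match.group(1)
        owner = match.group(2).strip()
        state = match.group(3).strip()
        if state == "fence" and (node is None or owner == node):
            found.append(sid)
    return found


def removal_commands(sids):
    """Build (but do not run) one `ha-manager remove` command per service ID."""
    return [f"ha-manager remove {sid}" for sid in sids]


# Hypothetical sample, not real output from our cluster.
SAMPLE = """\
quorum OK
master ait1 (active, Mon Jan  1 10:00:00 2018)
lrm ait3 (old timestamp - dead?, Mon Jan  1 09:00:00 2018)
service vm:100 (ait3, fence)
service vm:101 (ait3, fence)
service vm:102 (ait2, started)
"""

if __name__ == "__main__":
    for command in removal_commands(fenced_services(SAMPLE, node="ait3")):
        print(command)
```

On a real node the text would come from running `ha-manager status` after confirming quorum with `pvecm status`, and the removed services could later be re-added with `ha-manager add <sid>`.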
best,
James