Both servers in HA cluster rebooted for no apparent reason

cdukes

Hello,
I tried to migrate a VM this morning and BOTH of our Proxmox servers in a two-node HA cluster rebooted without warning or any apparent reason.
What can I check to find out why this happened?
Server 1:
pve-manager/4.3-9/f7c6f0cd (running kernel: 4.4.21-1-pve)
Server 2:
pve-manager/4.3-9/f7c6f0cd (running kernel: 4.4.21-1-pve)
 
The 4.x versions require 3 Proxmox nodes for HA, so I would start by checking the documentation...
 
It's documented that it needs it, but there is also documentation on how to simply set the votes to 2 on one of the servers.
That said, doing a 2-node HA should *definitely never* cause both servers to randomly reboot. That is a serious flaw.
 
If you disable quorum (as you can't have a quorum with just 2 nodes), each node could fence the other.

node1 detects node2 as down and, to be sure that it won't come back up, tries to power it off.
node2 does the same with node1.

The result is that both nodes are powered down due to missing quorum.
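
To make the vote arithmetic concrete, here is a rough sketch assuming the usual majority rule (quorum = more than half of all expected votes). The function names and the node1/node2 vote tables are made up for illustration and this is not Proxmox or corosync code; it only shows why two nodes with one vote each cannot keep quorum alone, and why giving one node 2 votes changes that for that node only.

```python
def quorum_threshold(total_votes: int) -> int:
    """Smallest vote count that is a strict majority of total_votes."""
    return total_votes // 2 + 1


def partition_has_quorum(votes: dict, reachable: list) -> bool:
    """True if the nodes in `reachable` together hold a majority of all votes."""
    total = sum(votes.values())
    seen = sum(votes[name] for name in reachable)
    return seen >= quorum_threshold(total)


# Hypothetical two-node cluster, one vote per node.
equal_votes = {"node1": 1, "node2": 1}

# Same cluster with the "set the votes to 2 on one server" workaround.
weighted_votes = {"node1": 2, "node2": 1}

if __name__ == "__main__":
    for label, votes in (("equal", equal_votes), ("weighted", weighted_votes)):
        for alone in ("node1", "node2"):
            ok = partition_has_quorum(votes, [alone])
            print(f"{label}: {alone} alone has quorum = {ok}")
```

With equal votes, neither node has quorum once it loses sight of the other. With the weighted setup, node1 keeps quorum on its own but node2 never does.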
 

That's not the problem. NEITHER node was down. All I did was migrate a vm (or attempt to) and they both rebooted.