How to force quorum if the 3rd monitor is down

lifeboy

I have a situation where a node failed (due to the boot drive failing) and then another node failed (due to RAM failure). There are 7 nodes in the cluster, so things kept running, but eventually there were many writes that could not be redundantly stored and the whole thing ground to a halt.

After sourcing some drives and RAM, I installed them and restarted the cluster. However, one of the nodes (the one that hosts the 3rd monitor) won't join the cluster. There are 3 monitors that are available, but no quorum.

mon.s5 is down because node s5 is down.

In practice, any ceph command times out because there is no quorum, and thus the cluster is down.
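For reference, the arithmetic behind "no quorum" can be sketched as below. This assumes a plain strict-majority rule (more than half of the voters must be reachable), which is how both corosync and the Ceph monitors decide quorum by default. The function name `quorum_needed` is made up for illustration and is not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: votes needed for a strict majority of N voters,
# i.e. floor(N / 2) + 1. Assumes every voter carries exactly one vote.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}
```

With 3 monitors, `quorum_needed 3` prints 2, so two monitors have to be up and able to reach each other. With 7 cluster nodes, `quorum_needed 7` prints 4.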

I have done:
Code:
pvecm expected 2
pvecm --votes 2
and then edited /etc/pve/corosync.conf with vim to change the number of votes for the mon.2 monitor to 2, but I still can't get a quorum.

Any advice on how to fix this would be great!
 
Hey,

Did you try stopping your cluster and rebooting 3 nodes first?
See if quorum comes back. If it does, continue booting your nodes one by one, with a small delay between each node boot.
After each node boots, check that quorum stays alive.
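A rough sketch of that per-node check, in case it helps. It assumes that `pvecm status` prints a `Quorate: Yes` line when the Proxmox (corosync) cluster has quorum. The function names and the `STATUS_CMD` override are hypothetical, added only so the loop can be tried without a real cluster. Note that this only looks at corosync quorum. Ceph monitor quorum is separate and would be checked with something like `ceph quorum_status` once ceph commands respond again.

```shell
#!/bin/sh
# Return 0 if the given status text reports the cluster as quorate.
# Assumes the "Quorate: Yes" line format printed by `pvecm status`.
is_quorate() {
    printf '%s\n' "$1" | grep -Eq '^Quorate:[[:space:]]+Yes'
}

# Poll every 5 seconds until the cluster is quorate or the timeout
# (in seconds, default 120) runs out. STATUS_CMD is a hypothetical
# override for testing; by default `pvecm status` is run.
wait_for_quorum() {
    timeout="${1:-120}"
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        if is_quorate "$(${STATUS_CMD:-pvecm status} 2>/dev/null)"; then
            return 0
        fi
        sleep 5
        waited=$((waited + 5))
    done
    return 1
}
```

After powering on each node, you could run `wait_for_quorum 300 && echo "still quorate"` on one of the already running nodes before moving on to the next one.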