One node in my cluster failed (boot drive failure), and then a second node failed (RAM failure). The cluster has 7 nodes, so things kept running for a while, but eventually many writes could not be stored redundantly and the whole thing ground to a halt.
After sourcing replacement drives and RAM, I installed them and restarted the cluster. However, one of the nodes (the one that hosts the 3rd monitor) won't rejoin the cluster. There are 3 monitors that are available, but no quorum.
mon.s5 is down because node s5 is down.
In practice, every ceph command times out because there is no quorum, so the cluster is effectively down.
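For reference, here is the quorum arithmetic as I understand it. This is only a minimal sketch, assuming the usual strict-majority rule for monitor quorum; `majority` is a hypothetical helper name I made up, not a Ceph or Proxmox command.

```shell
# Hypothetical helper (not part of Ceph or Proxmox):
# prints the smallest strict majority of N voters.
majority() {
    echo $(( $1 / 2 + 1 ))
}

# Monitors that must be up and reachable when 3 are configured.
majority 3
```

With 3 monitors this prints 2, meaning two monitors must be up and able to reach each other.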
I have done:
Code:
pvecm expected 2
pvecm --votes 2
and then, with
Code:
vim /etc/pve/corosync.conf
I changed the number of votes for the mon.2 monitor to 2, but I still can't get a quorum.
Any advice on how to fix this would be great!