Cluster stops running after one of the members dies.

nwongrat

Member
Feb 16, 2023
I just tried clustering without HA: I added 2 servers to the same basic cluster and did no other configuration. Then one of them died. The other server kept running as usual until I restarted it.

After I restarted the remaining server, it could not start. It showed an error about looking for the dead server; I cannot remember the exact error. In brief, I could not bring the other server up anymore and had to reinstall Proxmox on both servers. Since then, I have been rather afraid of clustering.

Basic question: if a member of the cluster dies, is there a way to prevent the problem I mentioned above?

Thanks for your help.
 
You need a minimum of 3 members to keep quorum when one of them fails.

With 2 it works until it does not.
That was what I was afraid of. Is there a way to remove the remaining server from the cluster without reinstalling Proxmox like I did?

Thank you for your reply.
 
Correct me if I am wrong. In my case I have 4 nodes in total. If 2 of them go down, the other 2 can still function. However, if 3 of them go down and just 1 remains, it will work until I shut it down or restart it. Then it will not wake up again. Am I correct?

If I am correct, can I assume that I should try my best to have at least 2 nodes alive?
 
You need a strict majority (anything over 50%) of the nodes remaining, otherwise the cluster stops.
2/3 = 0.67
2/4 = 0.5 X (this is exactly 50% and NOT over 50%, so the same as the 1/2 you had)
2/5 = 0.4 X
3/4 = 0.75
3/5 = 0.6
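The majority rule above can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming every node has exactly one vote; `has_quorum` is a hypothetical helper name, not a Proxmox function.

```python
def has_quorum(alive: int, total: int) -> bool:
    """Return True when the alive nodes form a strict majority (more than 50%)."""
    if total <= 0 or not 0 <= alive <= total:
        raise ValueError("need total > 0 and 0 <= alive <= total")
    # Integer comparison avoids floating-point rounding at exactly 50%.
    return 2 * alive > total


# Same cases as in the list above (alive nodes, total nodes).
for alive, total in [(2, 3), (2, 4), (2, 5), (3, 4), (3, 5)]:
    status = "quorum" if has_quorum(alive, total) else "no quorum"
    print(f"{alive}/{total} = {alive / total:.2f} -> {status}")
```

The comparison is strictly greater than, which is why 2 of 4 (and 1 of 2) does not count as a majority.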

Then it will not wake up again. Am I correct?
It will wake up again once you have over 50%, regardless of the order in which you reboot the nodes. So yes, if you have 4 nodes and only 2 remain, it stops. It will start working again as soon as any third node comes back online.
 
I guess the best approach is to wait until I have 6 nodes.

Thanks a lot for your help.
 
try my best to have at least 2 nodes alive.
This only works with 3 nodes -> 2/3
With more nodes, 2 alive is never over 50%, for example 2/6 = 0.33 X

Thanks a lot for your help.
;)

Think about it the other way round: how many nodes should be allowed to die?
One dead node needs 3 in total
Two dead nodes need 5 in total
Three dead nodes need 7 in total
... and so on: 9, 11, 13
In all these examples the cluster keeps running!
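The pattern in that list is "total = 2 x dead nodes + 1". A small Python sketch of it, again assuming one vote per node; both function names are made up for illustration:

```python
def min_nodes_for_failures(failures: int) -> int:
    """Smallest cluster size that keeps a strict majority after `failures` nodes die."""
    if failures < 0:
        raise ValueError("failures must not be negative")
    return 2 * failures + 1


def max_failures_tolerated(total: int) -> int:
    """Largest number of nodes that may die while the rest are still over 50%."""
    if total <= 0:
        raise ValueError("total must be positive")
    return (total - 1) // 2


for dead in (1, 2, 3):
    print(f"{dead} dead node(s) need {min_nodes_for_failures(dead)} in total")

for total in (2, 3, 4, 5, 6):
    print(f"{total} nodes tolerate {max_failures_tolerated(total)} failure(s)")
```

Note that an even node count tolerates no more failures than the odd count just below it: 4 nodes tolerate 1 failure like 3 nodes do, and 6 nodes tolerate 2 like 5 do.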
 
What if I install Proxmox and the VMs on separate disks? If I have to reinstall Proxmox, will the VMs come back without reinstalling them, in case I did not make a backup?
 
With 3 nodes you can set up Ceph on top, and with HA you can then configure that if a node goes down, its VMs are booted on another node.
 
I understand that; however, I would not dare to touch HA and Ceph with my limited knowledge for now. Basically, I just need one console for all 4 nodes, which is why I tried clustering. If a node dies and I need to reinstall Proxmox, it would be OK for now if I only had to reinstall Proxmox (not the VMs). Otherwise, I will stay with single nodes until I know more.

Since the day I had to reinstall everything, I have been rather afraid of trying something new without enough knowledge. I had just finished setting everything up after migrating from ESXi. Then... it happened... :(

Thank you.
 
Shutting down the remaining nodes when they have no majority is just a safety feature to avoid VMs running twice in case the cluster is broken. If you know that this is not the case, you can simply lower the number of expected votes with "pvecm expected 1", and even a single node will stay alive.
Another option would be to disable the HA services (pve-ha-lrm, pve-ha-crm) to avoid the fencing.
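To illustrate why lowering the expected value helps, here is a rough Python model. It assumes the quorum threshold is the usual majority rule, floor(expected / 2) + 1, with one vote per node; `quorum_threshold` is a hypothetical name and this does not call any Proxmox or corosync API.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Votes needed for quorum under a plain majority rule (assumed here)."""
    if expected_votes < 1:
        raise ValueError("expected_votes must be at least 1")
    return expected_votes // 2 + 1


alive_votes = 1  # a single surviving node with one vote

for expected in (2, 1):
    needed = quorum_threshold(expected)
    state = "quorate" if alive_votes >= needed else "not quorate"
    print(f"expected={expected}: need {needed} vote(s), have {alive_votes} -> {state}")
```

With 2 expected votes the lone node needs 2 votes and has only 1; with the expected value lowered to 1, its single vote is enough.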
 
Thanks, I will try it.
 
