I have a working cluster that uses broadcast for cluster communication. This method, together with the default cluster conf and the number of nodes (7), is probably causing many problems. Nodes get fenced for no apparent reason, and HA VMs get restarted frequently and at random. The problems look like the ones in these 2 posts:
http://forum.proxmox.com/threads/9743-corosync-364738-TOTEM-Retransmit-List-ca8-ca9-caa-cab
http://forum.proxmox.com/threads/13656-KVM-VMs-are-restarted-erratically
I don't have hardware or network problems.
Over the past week I tried some conf changes and also tried staying at no more than 5 nodes, with no result.
These are the changes I made to the default conf:
Code:
<cman broadcast="yes" keyfile="/var/lib/pve-cluster/corosync.authkey"/>
<totem token="54000" window_size="150"/>
<rm status_child_max="20" status_poll_interval="20">
I spoke with my provider and we found a way to use multicast for my cluster communication.
So now I want to check whether the problems are solved with multicast.
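For reference, this is a rough sketch of what I assume the cman line would look like for multicast: the broadcast="yes" attribute is dropped and a multicast address is set explicitly. The address 239.192.0.1 is only a placeholder, not the one from my provider, and I have not tested this. As far as I understand, config_version in the <cluster> tag also has to be increased with any change.
Code:
<cman keyfile="/var/lib/pve-cluster/corosync.authkey">
  <multicast addr="239.192.0.1"/>
</cman>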
This is a major change, and I think I must restart the whole cluster to get the new conf onto all machines. I guess that if I reboot just one node after the change, it will never find quorum, right? The other cluster nodes will still communicate via broadcast. So do I have to stop all machines and boot them one by one? I fear that I will never get quorum when they boot again.
Is this the proper way to make this change?
Also, should I now remove my token conf changes?