gfoster
Guest
Hi Folks,
I've run into a few problems with CMAN/Corosync lately where the cluster will lose quorum and/or go offline. All nodes except the master show red in the GUI, although sometimes many nodes are registered via corosync (and show as such in /var/log/cluster/corosync.log). We've switched from multicast to unicast, and are still seeing issues sometimes when changes are made (such as adding a node), where the cluster will go offline.
Currently I have 26 nodes in the cluster, and I understand that 16 is the recommended limit. Are there any recommended tuning parameters for a cluster of this size?
In the event that things become confused (and yes, I agree they shouldn't), what is the recommended process for bringing all PVE-related processes down, or restarting them on nodes which are currently having problems?
Side note: Sometimes when executing commands via SSH from the master, I'm still prompted for a password. Is this due to keys being stored in /etc/pve/priv, which may not be available when pve-cluster is offline?
Thanks in advance,
Greg.