Hello
We have a cluster of more than 30 servers. Initially it did NOT have a dedicated Corosync network, as we felt the host network (2 x 10GbE interfaces per host) would be sufficient. However, we started having Corosync issues once we got to about 30 servers, so we decided to put in a dedicated Corosync network. We added 2 x 48-port 1GbE switches, cabled each host to both switches, and set up Corosync to use both interfaces. In other words, we now have a dedicated Corosync network with 2 x 1GbE interfaces per host. This, however, doesn't seem to have made a difference.
We still end up with hosts that seem to "fall out" of the cluster. In the web GUI they show up as red, and if you log into one of those hosts directly, it looks as if they have established themselves in their own cluster.
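To make the "own cluster" symptom easier to describe, here is a minimal sketch of how the per-host views could be compared. It assumes the member list each host reports (for example, as shown by `pvecm status` on that host) has been collected by hand into a dictionary. The function name, the host names and the data are all made up for illustration and are not taken from our cluster.

```python
from collections import defaultdict


def group_by_membership(views):
    """Group hosts that report exactly the same set of cluster members.

    `views` is a hypothetical input: it maps each host name to the set of
    node names that host currently lists as members.
    """
    partitions = defaultdict(list)
    for host, members in views.items():
        # Hosts with an identical member set agree on who is in the cluster.
        partitions[frozenset(members)].append(host)
    # Largest group first, so the main cluster is listed before the strays.
    return sorted(
        (sorted(hosts) for hosts in partitions.values()),
        key=lambda hosts: (-len(hosts), hosts),
    )


# Made-up example: pve03 only sees itself.
views = {
    "pve01": {"pve01", "pve02"},
    "pve02": {"pve01", "pve02"},
    "pve03": {"pve03"},
}

for hosts in group_by_membership(views):
    print(len(hosts), hosts)
```

Any group after the first would be a set of hosts that has split off and agrees only with itself.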
I tried stopping the pve-cluster and corosync services on all hosts, then slowly starting them up again one host at a time. As I start them up, it all looks good, with hosts appearing in the cluster as expected, until we get to between 20-30 hosts running. Then it starts happening again.
Running Proxmox 7.4 with latest patches.
Any advice?