Cluster Network
---------------
The cluster network is the core of a cluster. All messages sent over it have to
be delivered reliably to all nodes in their respective order. In {pve} this
part is done by corosync, an implementation of a high-performance, low-overhead,
high-availability development toolkit. It serves our decentralized
configuration file system (`pmxcfs`).
Network Requirements
~~~~~~~~~~~~~~~~~~~~
Corosync needs a reliable network with latencies under 2 milliseconds (LAN
performance) to work properly. While corosync can also use unicast for
communication between nodes, it is **highly recommended** to have a
multicast-capable network. The network should not be used heavily by other
members; ideally, corosync runs on its own network.
*Never* share it with a network that also carries storage traffic.
Before setting up a cluster, it is good practice to check whether the network
is fit for that purpose.
* Ensure that all nodes are in the same subnet. This only needs to be true for
the network interfaces used for cluster communication (corosync).
* Ensure all nodes can reach each other over those interfaces. Using `ping` is
enough for a basic test.
* Ensure that multicast works in general and at high packet rates. This can be
done with the `omping` tool. The final "%loss" number should be < 1%.
omping -c 10000 -i 0.001 -F -q NODE1-IP NODE2-IP ...
* Ensure that multicast communication works over an extended period of time.
This uncovers problems where IGMP snooping is activated on the network but
no multicast querier is active. This test has a duration of around 10
minutes.
omping -c 600 -i 1 -q NODE1-IP NODE2-IP ...
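
The "%loss" check from the list above can be scripted. The following is a
minimal sketch, not part of {pve}: `check_omping_loss` is a hypothetical helper
name, and it assumes `omping` prints summary lines of the form
`ADDRESS : multicast, xmt/rcv/%loss = SENT/RECEIVED/LOSS%, ...`. It recomputes
the loss from the sent and received counters and fails if any node shows 1% or
more, or if no multicast summary line is found at all.

```shell
# Hypothetical helper: read omping output on stdin and exit non-zero if
# multicast loss is 1% or more for any node (or if no summary was seen).
# Assumed summary line format:
#   192.0.2.12 : multicast, xmt/rcv/%loss = 10000/9995/0%, min/avg/max/std-dev = ...
check_omping_loss() {
    awk '
        / multicast, xmt\/rcv\/%loss = / {
            seen = 1
            split($0, halves, " = ")       # halves[2] starts with "SENT/RECEIVED/LOSS%"
            split(halves[2], counts, "/")  # counts[1] = sent, counts[2] = received
            if (counts[1] == 0 || 100 * (counts[1] - counts[2]) / counts[1] >= 1) {
                print "FAIL: " $1 " multicast loss is 1% or more"
                bad = 1
            }
        }
        END { if (bad || !seen) exit 1 }
    '
}
```

It could be used by piping one of the tests into it, for example
`omping -c 10000 -i 0.001 -F -q NODE1-IP NODE2-IP ... | check_omping_loss`.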
Your network is not ready for clustering if any of these tests fail. Recheck
your network configuration. Switches in particular are notorious for having
multicast disabled by default or IGMP snooping enabled with no IGMP querier
active.
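
When rechecking the configuration, the same-subnet requirement can be verified
without touching the network. The sketch below is an illustration only:
`ip_to_int` and `same_subnet` are hypothetical helper names, it handles IPv4
only, and the prefix length and addresses must be supplied by you. It masks
each address with the prefix and compares the resulting network addresses.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Usage: same_subnet PREFIX_LEN IP [IP ...]
# Returns 0 if all addresses share the network of the first one.
# Note: plain sh has no local variables, so these names are global.
same_subnet() {
    prefix=$1
    shift
    mask=$(( (4294967295 << (32 - prefix)) & 4294967295 ))
    first_net=$(( $(ip_to_int "$1") & mask ))
    for ip in "$@"; do
        if [ $(( $(ip_to_int "$ip") & mask )) -ne "$first_net" ]; then
            echo "$ip is not in the same /$prefix network as $1"
            return 1
        fi
    done
    return 0
}
```

For example, `same_subnet 24 192.168.1.10 192.168.1.11 192.168.1.12` succeeds,
while adding `192.168.2.13` to the list makes it fail.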
In smaller clusters, it is also an option to use unicast if you really cannot
get multicast to work.
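
As a rough illustration of what the unicast option looks like, corosync 2.x
selects UDP unicast through the `transport` key in the `totem` section of
`corosync.conf`. The fragment below is a sketch under that assumption. It
shows only the relevant key and leaves out the other settings your existing
`totem` section already contains. Check the `corosync.conf` manual page for
your installed version before changing anything.

```
totem {
  # keep your existing totem settings here
  transport: udpu
}
```

Unicast relies on the node addresses listed in the `nodelist` section, so that
section must be complete for every cluster member.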