3-node cluster, dual ring, with a mixed 10 GbE / 2.5 GbE / 1 GbE mesh and no switch for the second, higher-speed ring.
To set up a mixed network:
You'll need 3 NICs (one per node) for ring 1, the internet/browsing ring that has the gateway out.
Then you need an additional 6 NICs (two per node) for the second ring (ring 0), with no gateway, used only for high-speed inter-node communication and file transfer.
Plug in all NICs to be used on each node, then write down the NIC names on paper by looking carefully at the output of:
ip a
It is easy to make naming mistakes, so pay attention to the details.
Start by finding the maxmtu for all NICs. An older 1 GbE NIC will sometimes not support an MTU of 9000. In my case the limit was 4088 on an older USB-based 1 GbE NIC, which I need to keep as part of the mesh for now until the budget increases.
ip -d link list
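To pull out just the interface name and its maxmtu, a quick sketch like the following can help (the maxmtu field only appears when the driver reports it, so output varies by NIC):
ip -d -o link show | awk '{gsub(":","",$2); for (i=1; i<=NF; i++) if ($i=="maxmtu") print $2, $(i+1)}'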
Always create two corosync rings when possible.
The first ring can be on the initial 192.168.1.x network, with vmbr0 and with the gateway to the internet active.
cat /etc/network/interfaces
auto lo
iface lo inet loopback

auto eno1
iface eno1 inet manual

auto vmbr0
iface vmbr0 inet static
    address 192.168.1.9/24
    gateway 192.168.1.254
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
So three NICs are used for the first ring, connected to the home router with its switch. Typically 1 GbE.
Then connect up the second ring using the six additional NICs. See the Full_Mesh_Network_for_Ceph_Server#Routed_Setup_(Simple) link below.
Example from one node, with the two additional NICs added in.
Note that since this 192.168.3.x network is separate from the one above, it can have a larger MTU set, whereas the one above remains at the typical 1500. This second network does not need a gateway or a vmbr; it does need routes to be added.
cat /etc/network/interfaces
auto enp9s0
iface enp9s0 inet static
    mtu 4088
    address 192.168.3.1
    # direct link to the node holding 192.168.3.2
    up ip route add 192.168.3.2/32 dev enp9s0
    down ip route del 192.168.3.2/32
    # enable Wake-on-LAN (magic packet) on this NIC
    post-up /sbin/ethtool -s enp9s0 wol g

auto enp4s0f1
iface enp4s0f1 inet static
    mtu 4088
    address 192.168.3.6
    # direct link to the node holding 192.168.3.5
    up ip route add 192.168.3.5/32 dev enp4s0f1
    down ip route del 192.168.3.5/32
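For the mesh to work, the node at the other end of each cable needs the mirror-image stanza. A minimal sketch for the peer holding 192.168.3.2 (the interface name enp5s0 is an assumption; substitute that node's actual NIC):
auto enp5s0
iface enp5s0 inet static
    mtu 4088
    address 192.168.3.2
    # direct link back to the node holding 192.168.3.1 (interface name assumed)
    up ip route add 192.168.3.1/32 dev enp5s0
    down ip route del 192.168.3.1/32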
Observe that the static addresses above carry no /24 suffix, unlike the vmbr0 address. A /24 here would add an on-link route for the whole subnet on each mesh NIC and mess up the /32 routes later, so leave it off.
Check the routes on each node with ping and with the command
ip route
root@pve:~# ip route
default via 192.168.1.254 dev vmbr0 proto kernel onlink
192.168.1.0/24 dev vmbr0 proto kernel scope link src 192.168.1.9
192.168.3.2 dev enp9s0 scope link
192.168.3.5 dev enp4s0f1 scope link
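To confirm the larger MTU actually survives the link, send a don't-fragment ping sized to the MTU minus 28 bytes of IP and ICMP headers (4088 - 28 = 4060 in this example):
ping -M do -s 4060 192.168.3.2
If either end's MTU is lower, the ping fails rather than silently fragmenting.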
On each node, after changing /etc/network/interfaces, restart networking with
systemctl restart networking
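Alternatively, if ifupdown2 is installed (the default on recent Proxmox VE releases), the changes can be applied without a full restart:
ifreload -a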
Edit /etc/pve/corosync.conf. Remember to increment config_version in the totem section whenever you edit this file, or the change will not be applied.
It is recommended to add an extra vote to the one server that is always on, or least likely to be rebooting or off. Use
quorum_votes: 2
instead of
quorum_votes: 1
for such a machine.
Add both ring IPs for each node; here ring 0 is the high-speed mesh and ring 1 is the internet-facing network:
ring0_addr: 192.168.3.1
ring1_addr: 192.168.1.9
Give the higher-speed network a higher knet_link_priority; with the default passive link mode, corosync uses the available link with the highest priority:
interface {
  knet_link_priority: 255
  linknumber: 0
}
interface {
  knet_link_priority: 128
  linknumber: 1
}
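Put together, the relevant parts of /etc/pve/corosync.conf then look roughly like this; cluster_name, the node name, nodeid, and config_version are placeholders, and only one node block is shown:
nodelist {
  node {
    name: pve
    nodeid: 1
    quorum_votes: 2
    ring0_addr: 192.168.3.1
    ring1_addr: 192.168.1.9
  }
}

totem {
  cluster_name: mycluster
  config_version: 5
  interface {
    knet_link_priority: 255
    linknumber: 0
  }
  interface {
    knet_link_priority: 128
    linknumber: 1
  }
  ip_version: ipv4
  version: 2
}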
After saving and rebooting all nodes, check the cluster networking with the following.
For each NIC (example NIC: eno1):
ethtool eno1
ping (the addresses reachable over that NIC)
For each node:
ip a
ip route
pvecm status
systemctl status corosync
dmesg
journalctl
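The status of both knet links can also be read directly from corosync itself (output format varies by version):
corosync-cfgtool -s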
More info:
https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server#Routed_Setup_(Simple)
https://pve.proxmox.com/wiki/Cluster_Manager#_preparing_nodes
Comment, 2 nodes only:
This is how I started, and the quorum_votes: 2 setting is really important in that case.
Such a setup only requires 2 NICs per node, 4 in total, and is a great starting point.
It does not even need the added routing:
up ip route add 192.168.3.5/32 dev enp4s0f1
down ip route del 192.168.3.5/32
that the above 3-node design needs.
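A minimal sketch of the direct link on one of the two nodes, reusing the example addressing above (interface name assumed):
auto enp9s0
iface enp9s0 inet static
    mtu 4088
    # with a single mesh NIC per node, the /24 is fine here and is
    # exactly what makes the manual /32 routes unnecessary
    address 192.168.3.1/24
The other node gets 192.168.3.2/24 on its directly cabled NIC, and the kernel's connected route covers the peer.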