[SOLVED] Making best use of multiple NICs per Node

maxim.webster

Dear all,

I am upgrading the hardware of my 3-node homelab cluster. Currently, every node has a single NIC (GbE).
The new hardware has 4 NICs per node (2.5GbE) and I want to make the best use of them.

How should I set up the network for my cluster, given
* Corosync
* multi-VLAN traffic from and to the network using a VLAN-aware bridge (vmbr0)
* no distributed storage (Ceph), but ZFS mirrors on each node with HA and replication

Thanks in advance

Maxim
 
Hi Maxim,

with 4x 2.5GbE per node and your setup using Corosync, HA and ZFS replication without Ceph, I would separate the traffic types as much as possible to keep the cluster stable and easier to troubleshoot.

I would use one dedicated NIC only for Corosync traffic with static IPs and without bonding. This keeps cluster communication isolated and avoids issues during high network load.
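
A minimal sketch of such a dedicated Corosync link in /etc/network/interfaces. The NIC name enp5s0 and the 10.10.10.0/24 subnet are placeholders, not values from this setup. The address sits directly on the NIC, with no bond, bridge or gateway on top:

```text
# Dedicated Corosync link (NIC name and subnet are assumptions)
auto enp5s0
iface enp5s0 inet static
        address 10.10.10.1/24
```

Each node would get its own address in that subnet (for example .1, .2 and .3).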

For the normal VM, management and VLAN traffic, I would bond two NICs and use the bond as the port of the VLAN-aware bridge (vmbr0). Active/passive (active-backup) bonding is completely sufficient for a homelab and gives good redundancy. If your switch supports LACP (802.3ad), you could use that instead.
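
Roughly, an active-backup bond under a VLAN-aware bridge could look like the following. This is only a sketch: the NIC names enp6s0 and enp7s0, the address and the gateway are assumptions and need to be adapted per node.

```text
# Two physical ports, no address of their own (names are assumptions)
iface enp6s0 inet manual

iface enp7s0 inet manual

# Active-backup bond: needs no special switch configuration
auto bond0
iface bond0 inet manual
        bond-slaves enp6s0 enp7s0
        bond-miimon 100
        bond-mode active-backup
        bond-primary enp6s0

# VLAN-aware bridge on top of the bond (address and gateway are placeholders)
auto vmbr0
iface vmbr0 inet static
        address 192.168.10.4/24
        gateway 192.168.10.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
```

For LACP, bond-mode would be 802.3ad instead, and the switch ports have to be configured as a matching link aggregation group.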

The remaining NIC would then be used exclusively for ZFS replication traffic. Having a dedicated replication network keeps large replication jobs away from Corosync and VM traffic and improves overall cluster stability during failover or sync operations.

If you do not need LACP or other switch-based features, you could instead connect the Proxmox nodes directly to each other and use two ports per node for an isolated internal replication network. That keeps the replication traffic completely separate from the main network and avoids unnecessary load on the switches.

That is actually the setup I use in my own homelab and it works very reliably for ZFS replication.

Regards
Jonas
 
Hardware has arrived and is working well. So far, only the first NIC (enp5s0) is in use for all kinds of traffic.

Since my current network infrastructure is limited to GbE (and 10GbE SFP+), is it possible to inter-connect the 3 nodes directly to each other (a to b, a to c and b to c) to build a (ZFS) migration network?
 
is it possible to inter-connect the 3 nodes directly to each other
Sure. That approach is called a mesh: https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server#Example

I do not use it anywhere. You cannot expand it easily: to add a fourth member you would need a third mesh NIC on each node.

After setup you can test it; a simple "ping" from each node to every other node is sufficient. It should still work if one cable is removed.

Then you go to "Datacenter --> Options --> Migration Settings" and select the appropriate network to be used.
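
For reference, that GUI setting corresponds to a "migration" line in /etc/pve/datacenter.cfg. A sketch, where 10.15.15.0/24 is a placeholder for whatever subnet the mesh uses (not a value from this thread):

```text
# /etc/pve/datacenter.cfg
# Use the dedicated mesh network for migration traffic (subnet is an assumption)
migration: secure,network=10.15.15.0/24
```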
 
The referenced documentation is quite extensive, with a lot of options. I assume that for migration/replication only, the "Routed Setup (Simple)" with no Fabric involved is sufficient?

I'd still recommend going for the fabric, as it provides fault tolerance/failover and you get status reporting via the Web UI. It is relatively straightforward to set up and is our currently recommended approach for a full mesh.
 
May I ask one last question: since each board has 4 NICs, one (enp6s0) is still unused:

  • enp7s0 and enp8s0 are in use for the "full mesh network" using Fabrics and assigned for replication and migration
  • enp5s0 is for everything else: WebUI, Corosync, VM and LXC guests via the VLAN-aware bridge vmbr0

So, how do I need to adjust my /etc/network/interfaces to "move" vmbr0 to enp6s0 on every host, so that enp5s0 is used exclusively for Proxmox administration and Corosync?

My current VLANs are
  1. "CLUSTER" (192.168.10.0/24) for Proxmox Nodes
  2. "HOME" (192.168.20.0/24) for VMs and LXC for internal use
  3. "DMZ" (192.168.40.0/24) for VMs and LXC with external access

Current content of /etc/network/interfaces

Code:
auto lo
iface lo inet loopback

iface enp5s0 inet manual

iface enp6s0 inet manual

iface enp7s0 inet manual

iface enp8s0 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.10.4/24
        gateway 192.168.10.1
        bridge-ports enp5s0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094

source /etc/network/interfaces.d/*
 
I'm not sure if I understood your question correctly.
From my point of view, you should assign the static IP address 192.168.10.X/24 directly to enp5s0.
Then you would attach vmbr0 to enp6s0, so that guest traffic runs through vmbr0 while Corosync and management traffic stay on enp5s0.
What I don't understand is the distinction you're making between Proxmox administration and the web UI.
Could you clarify what you mean by that?


Anyway, the configuration would look something like this:


Code:
auto lo
iface lo inet loopback

auto enp5s0
iface enp5s0 inet static
        address 192.168.10.4/24
        gateway 192.168.10.1

iface enp6s0 inet manual

iface enp7s0 inet manual

iface enp8s0 inet manual

auto vmbr0
iface vmbr0 inet manual
        bridge-ports enp6s0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094

source /etc/network/interfaces.d/*
 
What I don't understand is the distinction you're making between Proxmox administration and the web UI.

Your version of the interfaces file is exactly what I wanted: to separate guest traffic (VM, LXC) from "management" traffic, i.e. Corosync and accessing the hosts via the Proxmox WebUI or SSH.

Thanks (again).