Proxmox cluster using an Ethernet switch

schererpi

New Member
Sep 9, 2021
Hello guys, how are you? I hope you're fine.

I'm running three PVE nodes in a cluster, all connected through an Ethernet switch (the switch runs RSTP by default).

Each PVE node uses more than one Ethernet cable in a bond (LACP, 802.3ad). The bond interface sits under the bridge vmbr0, and the bridge is configured with "bridge-stp off". So I have:

bond-pve01 --> +-----------------+
               |                 |
bond-pve02 --> | ethernet switch |
               |                 |
bond-pve03 --> +-----------------+
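
For reference, a bond-under-bridge setup like the one described would typically look roughly like this in /etc/network/interfaces on each node. This is only a sketch: the NIC names (eno1, eno2), the addresses, and every bond option other than the 802.3ad mode are placeholders for illustration, not values taken from the actual nodes.

```
# /etc/network/interfaces (sketch; names and addresses are assumed)

auto bond0
iface bond0 inet manual
    # physical NICs in the bond (hypothetical names)
    bond-slaves eno1 eno2
    bond-miimon 100
    # LACP, as described in the post
    bond-mode 802.3ad
    bond-xmit-hash-policy layer2+3

auto vmbr0
iface vmbr0 inet static
    # example address from a documentation range
    address 192.0.2.11/24
    gateway 192.0.2.1
    # the bond is the only bridge port
    bridge-ports bond0
    # STP disabled on the bridge, as described in the post
    bridge-stp off
    bridge-fd 0
```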

Whenever something happens to a cable or an interface, the whole cluster has problems. The PVE logs show STP port state changes (to discarding, listening and forwarding, for example), and the whole cluster becomes a mess because all the servers reboot. After a while, everything is fine again.

My guess is that something is going on with STP on the bridge and it is disrupting communication between the nodes. Do you have any tips on using STP with PVE? Or, if you have already run into this, I'd be glad if you could share a fix.

Thank you!