corosync0 broken on Mellanox ConnectX-5?

Krish90

Dec 10, 2024
Hi all,

Bit of a weird issue. I have a 3-node cluster; I'm not sure how this happened, but two of the nodes are running 8.2.7 and one is running 8.3.2. They all have Mellanox ConnectX-5 100G cards in them for corosync. The two nodes running 8.2.7 work just fine. The node running 8.3.2 can't ping the corosync0 IPs of the other two nodes when port 0 of the card is set as the bridge port for corosync0. With port 1 as the bridge port, this works just fine. It *does* work on port 0 when the port is in promiscuous mode, or if I comment out the "bridge-vlan-aware yes" line for the corosync0 interface in /etc/network/interfaces.

I can try reflashing the card, but I'm very confused as to what could be happening, because the configurations are identical except for the pve-manager and kernel versions. The cards are also identical, and they all use the same DACs. Any advice would be greatly appreciated. If there's a way to install from an 8.2 ISO and upgrade to 8.2.7 on my bad node, I can try that as well, since all my containers are on the two good nodes now.

Thank you!
 
Please post your /etc/network/interfaces files for all three nodes. You have not provided enough information for someone to help you.
 
I have changed a couple of IPs, but enp0100p0 was the port used for corosync, and pinging the other nodes, which have the same config (with different IPs), wasn't working with bridge-vlan-aware turned on. The port configuration on the switch is identical for all three nodes, and only nodes 1 and 2 work (this is node 3). At this point I'm thinking it's either a switch issue or a bad card.


auto lo
iface lo inet loopback

iface eno1 inet manual

iface enp0100p0 inet manual

iface enp0100p1 inet manual
    mtu 9000

auto vmbr0
iface vmbr0 inet static
    address 30.0.2.35/24
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
    #gateway 30.0.0.1

auto corosync0
iface corosync0 inet static
    address 30.0.1.35/24
    bridge-ports enp0100p0
    bridge-stp off
    bridge-fd 0
    #bridge-vlan-aware yes
    #bridge-vids 1000
    #bridge-pvid 1
    mtu 9000
 
Thanks for posting this. Please post the other two /etc/network/interfaces files (using the code formatting).
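
Once all three files are available, a small script can make the comparison less error-prone than reading them side by side. The following is only a rough sketch, not a full ifupdown parser: `parse_interfaces` and `diff_stanza` are hypothetical helper names, the `NODE3` text is the corosync0 stanza posted above, and the `OTHER` text is an assumed stanza for a working node (its address is invented, and the three VLAN lines are assumed to be enabled as described in the first post).

```python
def parse_interfaces(text):
    """Map each iface name to its option lines, skipping comments and blanks."""
    stanzas = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # commented-out options are treated as absent
        words = line.split()
        if words[0] == "iface":
            current = stanzas.setdefault(words[1], {})
        elif words[0] in ("auto", "allow-hotplug", "source", "source-directory"):
            current = None  # these lines do not belong to an iface stanza
        elif current is not None:
            current[words[0]] = " ".join(words[1:])
    return stanzas


def diff_stanza(a, b, ignore=("address",)):
    """Return {option: (value_in_a, value_in_b)} for options that differ."""
    keys = (set(a) | set(b)) - set(ignore)
    return {k: (a.get(k), b.get(k)) for k in sorted(keys) if a.get(k) != b.get(k)}


# Stanza as posted for node 3 (VLAN lines commented out).
NODE3 = """\
auto corosync0
iface corosync0 inet static
    address 30.0.1.35/24
    bridge-ports enp0100p0
    bridge-stp off
    bridge-fd 0
    #bridge-vlan-aware yes
    #bridge-vids 1000
    #bridge-pvid 1
    mtu 9000
"""

# HYPOTHETICAL stanza for a working node: address invented, VLAN lines assumed enabled.
OTHER = """\
auto corosync0
iface corosync0 inet static
    address 30.0.1.33/24
    bridge-ports enp0100p0
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 1000
    bridge-pvid 1
    mtu 9000
"""

a = parse_interfaces(NODE3)["corosync0"]
b = parse_interfaces(OTHER)["corosync0"]
for key, (mine, theirs) in diff_stanza(a, b).items():
    print(f"{key}: node3={mine!r} other={theirs!r}")
```

The `address` option is ignored by default because it is expected to differ per node; anything else that shows up in the diff is worth a second look.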

I assume you changed the IP addresses for posting here, and you do not work for the Department of Defense in the US. 30.0.0.0/8 is owned by the DoD Network Information Center (DNIC). You should use the private ranges from RFC 1918 instead (10.0.0.0/8, 172.16.0.0/12 or 192.168.0.0/16). :-)
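
If in doubt, Python's standard `ipaddress` module can check this quickly. This is a minimal sketch: `is_private` is just a hypothetical wrapper name, the first two addresses are the ones from the posted config, and the 10.x address is an invented example of a private replacement.

```python
import ipaddress


def is_private(addr):
    """Return True if addr (with or without a /prefix) lies in a private range."""
    return ipaddress.ip_interface(addr).ip.is_private


# The last address is a hypothetical replacement, not taken from the thread.
for addr in ("30.0.2.35/24", "30.0.1.35/24", "10.0.1.35/24"):
    print(addr, "private" if is_private(addr) else "public")
```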
 