3-node Proxmox cluster: link failure reboots node even when second link is up

TafkaMax

Hi

I have an interesting question regarding a small proxmox cluster.

1. Background

I have a 3-node Proxmox cluster. Each node has its internet/switch-facing NIC in a bond, plus a second (dual-port) NIC that provides simple peer-to-peer links to the other two nodes.

E.g. prox-01 <-> prox-02 <-> prox-03 <-> prox-01

This is a simple, cost-effective way to connect all hosts. The network configuration on each node looks like this:


Code:
auto enp67s0f0np0
iface enp67s0f0np0 inet manual
  mtu 9000

auto enp67s0f1np1
iface enp67s0f1np1 inet manual
  mtu 9000

auto bond1
iface bond1 inet static
  address REDACTED/25
  netmask 255.255.255.128
  bond_slaves enp67s0f0np0 enp67s0f1np1
  bond-mode broadcast
As you can see, I am using broadcast mode, so every packet is sent out on both links.
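You can double-check the mode and slave state from the bonding driver itself (the grep fields below are the standard ones exposed under /proc/net/bonding):

Code:
cat /proc/net/bonding/bond1 | grep -E 'Bonding Mode|Slave Interface|MII Status'
# expect "Bonding Mode: fault-tolerance (broadcast)" plus a
# "Slave Interface" / "MII Status: up" pair for each port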

2. Problem/Issue

Recently I had a link failure between prox-01 and prox-02:

2026-04-17T13:37:12.276667+03:00 prox-01 kernel: [3966775.451514] mlx5_core 0000:43:00.0 enp67s0f0np0: Link down

Then corosync stepped in:

2026-04-17T13:37:13.311676+03:00 proxmox-01 corosync[1984]: [KNET ] link: host: 2 link: 0 is down
2026-04-17T13:37:13.311983+03:00 proxmox-01 corosync[1984]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
2026-04-17T13:37:13.312012+03:00 proxmox-01 corosync[1984]: [KNET ] host: host: 2 has no active links
2026-04-17T13:37:14.944110+03:00 proxmox-01 corosync[1984]: [TOTEM ] Token has not been received in 2737 ms
2026-04-17T13:37:15.926696+03:00 proxmox-01 kernel: [3966779.101643] mlx5_core 0000:43:00.0 enp67s0f0np0: Link up
2026-04-17T13:37:15.856919+03:00 proxmox-01 corosync[1984]: [TOTEM ] A processor failed, forming new configuration: token timed out (3650ms), waiting 4380ms for consensus.
2026-04-17T13:37:18.312509+03:00 proxmox-01 corosync[1984]: [KNET ] rx: host: 2 link: 0 is up
2026-04-17T13:37:18.312623+03:00 proxmox-01 corosync[1984]: [KNET ] link: Resetting MTU for link 0 because host 2 joined
2026-04-17T13:37:18.312684+03:00 proxmox-01 corosync[1984]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
2026-04-17T13:37:18.389680+03:00 proxmox-01 corosync[1984]: [KNET ] pmtud: Global data MTU changed to: 1397
2026-04-17T13:37:18.408929+03:00 proxmox-01 corosync[1984]: [QUORUM] Sync members[3]: 1 2 3
2026-04-17T13:37:18.408985+03:00 proxmox-01 corosync[1984]: [TOTEM ] A new membership (1.a3b) was formed. Members
2026-04-17T13:37:18.411066+03:00 proxmox-01 corosync[1984]: [QUORUM] Members[3]: 1 2 3
2026-04-17T13:37:18.411107+03:00 proxmox-01 corosync[1984]: [MAIN ] Completed service synchronization, ready to provide service.

It recovered, but over the next few minutes it happened a few more times, until it didn't recover fast enough and prox-01 and prox-02 rebooted themselves.

I found out that corosync can use multiple links, so I added a secondary link over the switch/internet-facing NIC (the other NIC) and made it a lower-priority link.
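For reference, the relevant part of /etc/pve/corosync.conf now looks roughly like this. The names and addresses below are examples rather than my real ones, and config_version has to be bumped whenever the file is edited:

Code:
nodelist {
  node {
    name: prox-01
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.3.16.1    # direct peer-to-peer network (example address)
    ring1_addr: 192.0.2.1    # switch/internet-facing network (example address)
  }
  # ... prox-02 and prox-03 entries look the same, with their own addresses ...
}

totem {
  # ...
  interface {
    linknumber: 0
    knet_link_priority: 10   # preferred link
  }
  interface {
    linknumber: 1
    knet_link_priority: 5    # fallback link, lower priority
  }
}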

Is there any information on how I can redirect the traffic so it flows through the second path (prox-01 <-> prox-03 <-> prox-02) in a way that corosync understands?
 
You can see the status of both "rings" this way:
Code:
~# corosync-cfgtool  -s
Local node ID 6, transport knet
LINK ID 0 udp
        addr    = 10.3.16.7
        status:
                nodeid:          1:     disconnected
                nodeid:          2:     connected
                nodeid:          4:     connected
...
LINK ID 1 udp
        addr    = 10.11.16.7
        status:
                nodeid:          1:     disconnected
                nodeid:          2:     connected
                nodeid:          4:     connected
Code:
~# corosync-cfgtool  -n
Local node ID 6, transport knet
nodeid: 2 reachable
   LINK: 0 udp (10.3.16.7->10.3.16.9) enabled connected mtu: 1397
   LINK: 1 udp (10.11.16.7->10.11.16.9) enabled connected mtu: 1397

nodeid: 4 reachable
   LINK: 0 udp (10.3.16.7->10.3.16.10) enabled connected mtu: 1397
   LINK: 1 udp (10.11.16.7->10.11.16.10) enabled connected mtu: 1397
  
...
(Important: my "nodeid 1" is disconnected on purpose. Yours should be "connected", of course.)

As long as everything is "connected", one ring may be lost without losing quorum. See also man corosync-cfgtool.
 
This has nothing to do with corosync at all. If one of the links fails, you lose IP connectivity between those two nodes. The third node simply isn't relaying ANY traffic between the two, and there's nothing corosync can do about it.
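You can see this for yourself while that cable is unplugged (the address below is just an example for prox-02's bond1 IP):

Code:
# run on prox-01 while the prox-01 <-> prox-02 cable is down
ping -c 3 10.3.16.9       # fails even though the other two cables are fine:
                          # prox-03's bond is neither a bridge nor a router, so it drops those frames
ip route get 10.3.16.9    # the kernel still sends the packet out of bond1; nothing reroutes it via prox-03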

I suggest you use a different scheme for the direct links. There are a few suggestions on the wiki and others are possible too. Even STP should work better.
 
I meant https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server. The page sets the network up for Ceph, but simply ignore that part; the network doesn't need Ceph to work.
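For example, the "Routed Setup (Simple)" variant from that page boils down to something like this on prox-01 (your interface names, but the 10.15.15.x addresses are only examples; each node gets its own address plus one /32 route per neighbour out of the port that neighbour is cabled to):

Code:
auto enp67s0f0np0
iface enp67s0f0np0 inet static
  address 10.15.15.1/24
  mtu 9000
  # this port is cabled to prox-02 (10.15.15.2 in this example)
  up ip route add 10.15.15.2/32 dev enp67s0f0np0
  down ip route del 10.15.15.2/32

auto enp67s0f1np1
iface enp67s0f1np1 inet static
  address 10.15.15.1/24
  mtu 9000
  # this port is cabled to prox-03 (10.15.15.3 in this example)
  up ip route add 10.15.15.3/32 dev enp67s0f1np1
  down ip route del 10.15.15.3/32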

For (R)STP you would do away with the bond and use both interfaces as ports on an STP-enabled bridge. The IP would indeed go onto the bridge.
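A rough sketch of that, reusing your interface names (vmbr1 is just an example bridge name; keep the address that currently sits on bond1). Note that the in-kernel bridge only does classic STP; for actual RSTP you would additionally need something like mstpd:

Code:
auto enp67s0f0np0
iface enp67s0f0np0 inet manual
  mtu 9000

auto enp67s0f1np1
iface enp67s0f1np1 inet manual
  mtu 9000

auto vmbr1
iface vmbr1 inet static
  address REDACTED/25                       # the address that used to be on bond1
  bridge-ports enp67s0f0np0 enp67s0f1np1
  bridge-stp on
  bridge-fd 15
  mtu 9000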

I think newer PVE versions have support for some of these methods under SDN/Fabric, but I haven't looked into it.

I suggest you check whether any of that works for you. If not, I can probably find some more options.
 