bond health check

Jun 9, 2025
Hi,
OK, doing more work on my Proxmox build: 8 hosts across two fiber-connected server rooms, each with its own shared storage (currently NFS, but adding a SAN soon). For the shared storage, I'm working on network HA/failover, and on my test host I have configured a bond using two 10 Gb/s NICs in "active-backup" mode. Each NIC in the bond connects to a different switch, and those switches uplink to the shared storage in each server room... pretty normal stuff for failover.

Failover works... mostly. Disconnecting the host from either switch (including rebooting the switch) causes the host to immediately switch over to the other link in the bond, and my guests do not appear to miss a beat. The issue I'm currently trying to work out is what happens when the switch connected to the "bond-primary" NIC loses its uplink to the shared storage while its link to the host remains active. In this scenario, the bond does not switch over, since it still sees the primary link as up.
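
To make the symptom concrete, here is a minimal Python sketch that parses the Linux bonding driver's status file (normally /proc/net/bonding/<bond>) and reports the active slave and each slave's link state. The sample text and the interface names eno1/eno2 are illustrative assumptions, not output from my hosts, and parse_bond_status is just a hypothetical helper name. The point is that link-level (MII) status can read "up" on both slaves even when the path beyond the switch is broken.

```python
# Illustrative sample only (interface names are hypothetical), shaped like
# the Linux bonding driver's /proc/net/bonding/<bond> status file.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: eno1 (primary_reselect always)
Currently Active Slave: eno1
MII Status: up
MII Polling Interval (ms): 100

Slave Interface: eno1
MII Status: up

Slave Interface: eno2
MII Status: up
"""


def parse_bond_status(text):
    """Return (active_slave, {slave_name: mii_status}) from bonding status text."""
    active = None
    slaves = {}
    current = None  # slave section currently being read, if any
    for line in text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Currently Active Slave":
            active = value
        elif key == "Slave Interface":
            current = value
        elif key == "MII Status" and current is not None:
            # Only record per-slave status; the bond-level "MII Status"
            # line appears before any "Slave Interface" line and is skipped.
            slaves[current] = value
    return active, slaves


if __name__ == "__main__":
    active, slaves = parse_bond_status(SAMPLE)
    print("active slave:", active)
    for name, status in slaves.items():
        print(name, "link:", status)
```

On a real host the text would come from reading the status file (for example /proc/net/bonding/bond0, assuming the bond is named bond0) instead of SAMPLE. Link state alone says nothing about whether the storage is reachable through that slave, which is exactly the gap I'm asking about.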

Is there a way to do a "bond health" check in this scenario? Under normal circumstances, my biggest issue would be when I need to update firmware, etc. on a switch in server room 2 that supports storage used by guests in server room 1 (or vice versa).

Thanks