VM connectivity lost after bond failover

apollo13

Hi there,

I am seeing a loss of connectivity after failover of an active-backup bond whose two slaves go to two different switches. My setup is as follows:

Code:
+-------+                                +-------+
|switch1|----(inter switch lacp trunk)---|switch2|----client
+-------+                                +-------+
    |                                        |
    | primary      +-------+          backup |
    ---------------|proxmox|------------------
                   +-------+

The LACP inter-switch connection is working fine, and connectivity also works with only a single bond slave attached.

My bond configuration is basically this:
Code:
auto enp65s0f0
iface enp65s0f0 inet manual

auto enp65s0f1
iface enp65s0f1 inet manual

auto bond2
iface bond2 inet manual
    bond-slaves enp65s0f0 enp65s0f1
    bond-miimon 100
    bond-mode active-backup
    bond-primary enp65s0f0
#VM-01

auto vmbr0
iface vmbr0 inet manual
    bridge-ports bond2
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094
#VM-BRIDGE-01

Here is how I test the failover and what I observe:
* The client pings a VM on the proxmox server
* I pull the primary cable to test the failover
* /proc/net/bonding/bond2 confirms the bond fails over properly
* Ping no longer gets answers
* Looking into the MAC address table on switch2 I see that the VM's MAC is still bound to the inter-switch trunk
* Reattaching the primary cable makes the ping get answers again

This kind of makes sense, since the MAC address table on the switch has had no reason to update yet. So what to do? One thing I can imagine is using QEMU's monitor to send an announce-self for all running VMs when the bond fails over (I think QEMU does this after live migration as well). That would broadcast gratuitous ARPs (GARP) and force all switches in the broadcast domain to update their tables.

* Are those assumptions correct?
* Assuming they are correct are there any good ways to monitor bond failover (short of polling /proc/net/bonding/bond2) so I can script this myself?
* Is this something which would make sense to add to Proxmox itself? The setup on its own doesn't seem to be that uncommon, so I am wondering why no one else is seeing this (or they are and I just couldn't find it in the forum).

Thanks,
Florian
 
I have managed to work around this problem with the following simple Python script:
Python:
import socket
import pathlib
import time
import sys
import pyroute2
import logging

QEMU_SERVER_PATH = "/var/run/qemu-server/"


def announce_self(qmp_path):
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.1)
        sock.connect(str(qmp_path))
        sock.recv(1024)  # read initial message
        sock.sendall(
            b'{"execute":"qmp_capabilities"}\n{"execute":"announce-self","arguments":{"initial":50,"max":550,"rounds":4,"step":50}}'
        )
        sock.recv(1024)  # read first command
        sock.recv(1024)  # read second command


def announce_all():
    for p in pathlib.Path(QEMU_SERVER_PATH).iterdir():
        if p.name.endswith(".qmp"):
            try:
                announce_self(p)
            except Exception as e:
                logging.error(f"Failure announcing {p}: {e}")


if __name__ == "__main__":
    logging.basicConfig(
        format="%(asctime)s [%(levelname)s] %(message)s", level=logging.INFO
    )
    if len(sys.argv) < 2:
        print("Usage: bond_watch.py BOND", file=sys.stderr)
        sys.exit(1)
    ifname = sys.argv[1]
    active_slave = None
    ipr = pyroute2.IPRoute()
    logging.info(f"Starting bond watch on {ifname}")
    while True:
        # poll the bond's netlink attributes once per second
        link = ipr.get_links(ifname=ifname)[0]
        ifinfo = link.get_attr("IFLA_LINKINFO")
        bond = ifinfo.get_attrs("IFLA_INFO_DATA")[0]
        # ifindex of the currently active slave; changes on failover
        new_slave = bond.get_attr("IFLA_BOND_ACTIVE_SLAVE")
        if new_slave != active_slave:
            if active_slave is not None:
                logging.info("Detected failover")
                announce_all()
        active_slave = new_slave
        time.sleep(1)
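
To keep it running, a small systemd unit along these lines should do (the install path, unit name and bond name are just examples; python3 with pyroute2 has to be available on the host):
Code:
[Unit]
Description=Send announce-self to all VMs when the bond fails over
After=network-online.target
Wants=network-online.target

[Service]
# assumes the script above is installed as /usr/local/bin/bond_watch.py
ExecStart=/usr/bin/python3 /usr/local/bin/bond_watch.py bond2
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target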

It is rather inefficient, though, and I am wondering if I can somehow get events from the kernel instead of polling.
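
One idea would be to subscribe to rtnetlink link events via pyroute2 instead of polling. A rough, untested sketch reusing announce_all() from the script above (I am not sure which attributes the kernel includes in its notifications, so it simply re-queries the active slave whenever a link event for the bond arrives):
Python:
import logging
import pyroute2


def get_active_slave(ipr, ifname):
    link = ipr.get_links(ifname=ifname)[0]
    linkinfo = link.get_attr("IFLA_LINKINFO")
    data = linkinfo.get_attr("IFLA_INFO_DATA")
    # ifindex of the currently active slave
    return data.get_attr("IFLA_BOND_ACTIVE_SLAVE")


def watch_bond_events(ifname):
    with pyroute2.IPRoute() as ipr:
        ipr.bind()  # subscribe to the default rtnetlink multicast groups
        active_slave = get_active_slave(ipr, ifname)
        while True:
            for msg in ipr.get():  # blocks until at least one event arrives
                if msg.get("event") != "RTM_NEWLINK":
                    continue
                if msg.get_attr("IFLA_IFNAME") != ifname:
                    continue
                new_slave = get_active_slave(ipr, ifname)
                if new_slave != active_slave and active_slave is not None:
                    logging.info("Detected failover")
                    announce_all()  # from the script above
                active_slave = new_slave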
 
