Hello
I am running into a weird issue where, on a VLAN, a VM cannot find its configured gateway (also a VM) using neighbor discovery.
After some debugging I found that, on the cluster node where the gateway runs, the neighbor solicitation packets from the client were visible on the bridge (vmbr0) but not on the node-side tap interface and not inside the gateway itself. Other traffic (such as broadcast packets and unicast link-local) got through without a problem.
The key turns out to be solicited-node multicast addressing.
The client was correctly sending its ICMPv6 solicitation packets to `ff02::1:ff00:1` (the gateway's IP address ends in all zeroes and a one, so this is the right group); however, the gateway's tap interface on the bridge was *not* in the corresponding multicast group.
I did the following on the cluster node:
Code:
bridge mdb add dev vmbr0 port tap105i1 grp ff02::1:ff00:1 permanent
and traffic started flowing as intended. This command manually adds the VM's tap interface on the bridge to the given multicast group.
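For anyone wanting to double-check which group a given address maps to, here is a minimal sketch of the solicited-node derivation: the fixed prefix `ff02::1:ff00:0/104` combined with the low 24 bits of the unicast address. The function name is my own, and `2001:db8::1` is a documentation address standing in for the real gateway address, which is not shown here.

```python
import ipaddress

def solicited_node_multicast(addr: str) -> str:
    """Return the solicited-node multicast address for a unicast IPv6 address."""
    unicast = int(ipaddress.IPv6Address(addr))
    low24 = unicast & 0xFFFFFF  # keep only the low-order 24 bits
    prefix = int(ipaddress.IPv6Address("ff02::1:ff00:0"))  # ff02::1:ff00:0/104
    return str(ipaddress.IPv6Address(prefix | low24))

# Placeholder address ending in all zeroes and a one, like the gateway's
print(solicited_node_multicast("2001:db8::1"))
```

Any address whose last 24 bits are `00:0001` lands in the same group, so other hosts on the VLAN with an address ending in `::1` would share it.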
Of course, this is not a good permanent solution, and I have been unable to find out why this wasn't working correctly in the first place.
Stopping and starting the machine, migrating it to another node, etc. does not solve the issue.
I think this might be a bug somewhere in the stack, but I'm not sure where. Help in narrowing this down would be greatly appreciated.
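One thing that may be worth recording when narrowing this down is the bridge's multicast snooping state, since as far as I understand the mdb only affects forwarding when snooping is enabled. Below is a rough sketch that reads the two relevant flags from sysfs. It assumes the standard Linux bridge attributes `multicast_snooping` and `multicast_querier` under `/sys/class/net/<bridge>/bridge/`; the helper name and the `sysfs_root` parameter are made up for illustration.

```python
from pathlib import Path

def bridge_multicast_settings(bridge: str, sysfs_root: str = "/sys/class/net") -> dict:
    """Read the multicast snooping and querier flags of a Linux bridge from sysfs."""
    base = Path(sysfs_root) / bridge / "bridge"
    settings = {}
    for name in ("multicast_snooping", "multicast_querier"):
        path = base / name
        # None means the attribute is missing (not a bridge, or not Linux)
        settings[name] = int(path.read_text().strip()) if path.exists() else None
    return settings

if __name__ == "__main__":
    print(bridge_multicast_settings("vmbr0"))
```

Comparing these values, together with the output of `bridge mdb show dev vmbr0`, between a working and a non-working node might help show where the group membership gets lost.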
Cheers,
Marlies