IGMP over switch uplink issue

Ingo S

Renowned Member
Oct 16, 2016
Hello forum

Today I spent some hours setting up a new cluster. Its structure will be two nodes on one Cisco Nexus 3064-X, with an uplink to another Cisco Nexus 3064-X where two additional nodes are connected.

I have three separate LANs: one for user access (VLAN trunk), one for Ceph (not used yet) and one for corosync (VLAN 116).

When I set up the first two nodes, both connected to the first switch, I checked LAN connectivity on all three networks, which worked fine (ping in both directions). Then I tested whether multicast (IGMP) works with omping, and everything was fine with nearly 0 packet loss. I created the cluster with bindnet0_addr and ring0_addr corresponding to the interfaces on the nodes. Everything worked as expected.

When I wanted to add the third node on the second switch, I pinged it (fine), tested IGMP (seemed fine) and added the node to the cluster with the correct ring0_addr (fine).

After some minutes, corosync no longer sees the third node. The two nodes on the first switch still see each other, but the third node appears offline.

On the Cisco switches I found that when the third node joined the cluster, the second switch became a member of the same multicast group that the nodes on switch 1 chose for corosync. After some minutes, the second switch loses its membership:

Code:
Switch 1:
n3064-x1# sh ip igmp snooping groups vlan 116
Type: S - Static, D - Dynamic, R - Router port, F - Fabricpath core port

Vlan  Group Address      Ver  Type  Port list
116   239.192.88.241     v3   D     Eth1/9 Eth1/17

Switch 2:
n3064-x2# sh ip igmp snooping groups vlan 116
Type: S - Static, D - Dynamic, R - Router port, F - Fabricpath core port

Vlan  Group Address      Ver  Type  Port list
116   */*                -    R     Eth1/49
116   239.192.88.241     v3   D     Eth1/9 Eth1/19

Switch 2 some minutes later:
n3064-x2# sh ip igmp snooping groups vlan 116
Type: S - Static, D - Dynamic, R - Router port, F - Fabricpath core port

Vlan  Group Address      Ver  Type  Port list
116   */*                -    R     Eth1/49
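To catch the moment the group disappears, two snapshots of this output can be compared programmatically. This is only a minimal sketch in Python: `parse_groups` and `lost_groups` are hypothetical helper names, and the parser assumes the column layout shown in the outputs above, with ports separated by whitespace.

```python
import re

# Matches data rows such as "116   239.192.88.241     v3   D     Eth1/9 Eth1/17"
# (assumed layout: vlan, group, version, type flag, port list).
ROW = re.compile(r"^(\d+)\s+(\S+)\s+(\S+)\s+([SDRF])\s+(.*)$")


def parse_groups(output):
    """Return {(vlan, group): set_of_ports} from the snooping groups text."""
    table = {}
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if match:  # header, legend and prompt lines do not match and are skipped
            vlan, group, _version, _type, ports = match.groups()
            table[(int(vlan), group)] = set(ports.split())
    return table


def lost_groups(before, after):
    """Groups present in the first snapshot but missing from the second."""
    return sorted(set(parse_groups(before)) - set(parse_groups(after)))
```

Fed with the two Switch 2 outputs above, `lost_groups` would report the corosync group 239.192.88.241 in VLAN 116 as missing, while the `*/*` router-port entry is present in both.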

Switch 1 is configured to be the IGMP querier:
Code:
n3064-x1# sh ip igmp snooping vlan 116
IGMP Snooping information for vlan 116
  IGMP snooping enabled
  Optimised Multicast Flood (OMF) disabled
  IGMP querier present, address: 192.168.16.1, version: 3
  Switch-querier enabled, address 192.168.16.1, currently running
  IGMPv3 Explicit tracking enabled
  IGMPv2 Fast leave disabled
  IGMPv1/v2 Report suppression enabled
  IGMPv3 Report suppression disabled
  Link Local Groups suppression enabled
  Router port detection using PIM Hellos, IGMP Queries
  Number of router-ports: 0
  Number of groups: 1
  VLAN vPC function disabled
  Active ports:
    Eth1/9      Eth1/17 Eth1/49

Can someone give me a hint on where to look to track this down?

I suppose Switch 2 does not tell Switch 1 that it has a member of the multicast group 239.192.88.241, which Switch 1 would need in order to keep forwarding its multicast traffic to Switch 2. But why?
How can I check whether IGMP queries reach Switch 2?
Why would IGMP queries not reach Switch 2?

I get the same behaviour when I run omping for a longer time, e.g. 10 minutes.

Thank you in advance
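One way to approach the "do queries arrive?" question from the host side is to listen for IGMP messages on a node behind Switch 2. The following is a rough sketch in Python, not a tested diagnostic: `parse_igmp` and `watch` are hypothetical names, the raw socket needs root on Linux, and it only sees IGMP packets the kernel delivers to this host (general queries go to 224.0.0.1, so they should normally be among them).

```python
import socket
import struct

IGMP_TYPES = {
    0x11: "membership query",
    0x12: "v1 membership report",
    0x16: "v2 membership report",
    0x17: "v2 leave group",
    0x22: "v3 membership report",
}


def parse_igmp(ip_packet):
    """Decode the IGMP header from a raw IPv4 packet (IP header included)."""
    ihl = (ip_packet[0] & 0x0F) * 4            # IPv4 header length in bytes
    src = socket.inet_ntoa(ip_packet[12:16])   # sender, e.g. the querier
    igmp = ip_packet[ihl:]
    msg_type, _code, _checksum = struct.unpack("!BBH", igmp[:4])
    info = {"src": src, "type": IGMP_TYPES.get(msg_type, hex(msg_type))}
    if msg_type == 0x11:
        info["group"] = socket.inet_ntoa(igmp[4:8])  # 0.0.0.0 = general query
        info["v3"] = len(igmp) >= 12                 # v1/v2 queries are 8 bytes
    return info


def watch(seconds=300):
    """Print IGMP messages until none arrives for `seconds` (needs root)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_IGMP)
    sock.settimeout(seconds)
    try:
        while True:
            packet, _addr = sock.recvfrom(65535)
            print(parse_igmp(packet))
    except socket.timeout:
        print("no IGMP message within", seconds, "seconds")
    finally:
        sock.close()
```

Calling `watch()` on the third node and seeing no "membership query" from 192.168.16.1 for several minutes would suggest that the queries are not crossing the uplink.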
 
Hello Jarek

Since there will be a lot of other hosts on these switches in the future, and there will be several VLAN trunks to other switches located all around our site, this would mean multicast packets travelling all around our network. This can be a security risk and can also cause unnecessary bandwidth usage on the uplink ports.

Besides that, I would like a solution that gives me some deeper insight into multicast, since I do not seem to understand IGMP as deeply as I might need to.

Greetings
Ingo
 
If you don't know exactly what IGMP snooping does, you don't need it.
 
Jarek, I am pretty sure I know what it does.

IGMP snooping listens for the IGMP membership reports of hosts that are interested in receiving a specific multicast stream.
This way the switch learns which ports need the traffic for specific multicast groups, so it doesn't get spread around the whole VLAN. This is exactly what I want.
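As a toy model of that learning process, here is a sketch in Python. `SnoopingTable` is a made-up class, not how NX-OS implements it, and the timeout uses the IGMPv3 protocol defaults from RFC 3376 (robustness 2, query interval 125 s, query response interval 10 s); whether the Nexus uses exactly these values is an assumption.

```python
# Group membership interval with RFC 3376 defaults:
# robustness * query interval + query response interval.
GROUP_MEMBERSHIP_INTERVAL = 2 * 125 + 10  # 260 seconds


class SnoopingTable:
    """Toy model: remember when each (group, port) last sent a report."""

    def __init__(self, timeout=GROUP_MEMBERSHIP_INTERVAL):
        self.timeout = timeout
        self.last_report = {}  # (group, port) -> time of last report in seconds

    def report(self, group, port, now):
        """A membership report for `group` was seen on `port` at time `now`."""
        self.last_report[(group, port)] = now

    def ports(self, group, now):
        """Ports that still count as members of `group` at time `now`."""
        return sorted(
            port
            for (grp, port), seen in self.last_report.items()
            if grp == group and now - seen <= self.timeout
        )
```

In this model a port drops out of the group once no report has been seen for 260 seconds. Hosts normally refresh their membership in response to queries, so an entry that vanishes "after some minutes" would be consistent with queries or reports not getting through, although the sketch cannot prove that this is what happens here.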

I am aware that the only purpose of VLAN 116 is to carry the corosync multicast traffic of the PVE hosts, so I could indeed just turn snooping off. The multicast traffic would obviously stay within its VLAN. But that would cure a symptom rather than solve the underlying problem.

I like to solve problems cleanly and tidily, so if you cannot, or do not want to help me out, maybe someone else can. ;)

Greetings
Ingo
 
Hi Fabio

As far as I can remember, we did not find the cause of the issue, but I looked for an IGMP querier on every device connected to this network that I could find and turned it off. Then I turned on a single IGMP querier on one of the switches. Since then the problem seems to be gone.
I assume there was more than one querier and they interfered with each other.
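For reference, IGMPv2 and IGMPv3 define an election for this case: when several queriers share a network, the one with the lowest IP address is supposed to stay active and the others go quiet. A minimal sketch of that rule in Python follows; `elect_querier` is a hypothetical name, and any candidate address other than 192.168.16.1 is made up for illustration. It illustrates the rule only and is not a diagnosis of what went wrong here.

```python
import ipaddress


def elect_querier(candidates):
    """Return the candidate with the numerically lowest IPv4 address."""
    return str(min(ipaddress.IPv4Address(addr) for addr in candidates))
```

The comparison is numeric rather than textual, so `elect_querier(["192.168.16.10", "192.168.16.9"])` picks 192.168.16.9.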

IGMP is still a bit of a mystery to me.
 
