Hello forum
Today I spent a few hours setting up a new cluster. The layout: two nodes on one Cisco Nexus 3064-X, with an uplink to a second Cisco Nexus 3064-X where two additional nodes are connected.
I have 3 separate LANs: one for user access (VLAN trunk), one for Ceph (not used yet), and one for corosync (VLAN 116).
When I set up the first two nodes, both connected to the first switch, I checked connectivity on all three networks, which worked fine (ping in both directions). Then I tested whether IGMP works with omping, and everything was fine with nearly 0% packet loss. I created the cluster with bindnet0_address and ring0_addr corresponding to the interfaces on the nodes. Everything worked.
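For context, the totem section that ends up in corosync.conf looks roughly like this (cluster name and the 192.168.16.x network below are placeholders for my corosync VLAN, not my exact values):

```
totem {
  cluster_name: mycluster
  config_version: 2
  interface {
    bindnetaddr: 192.168.16.0   # corosync network, VLAN 116
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  transport: udp                # default: multicast
  version: 2
}
```

With transport udp, corosync relies on multicast working end to end, which is why the IGMP snooping behaviour below matters.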
When I added the third node on the second switch, I pinged -> fine, tested IGMP -> seemed fine, added the node to the cluster with the correct ring0_addr -> fine!
After a few minutes, the third node is no longer seen by corosync. The two nodes on the first switch still see each other; the third node appears offline.
On the Cisco side I found that when the third node joined the cluster, the second switch became a member of the same multicast group that the nodes on switch 1 chose for corosync. After a few minutes, the second switch loses its membership:
Code:
Switch 1:
n3064-x1# sh ip igmp snooping groups vlan 116
Type: S - Static, D - Dynamic, R - Router port, F - Fabricpath core port

Vlan  Group Address    Ver  Type  Port list
116   239.192.88.241   v3   D     Eth1/9 Eth1/17

Switch 2:
n3064-x2# sh ip igmp snooping groups vlan 116
Type: S - Static, D - Dynamic, R - Router port, F - Fabricpath core port

Vlan  Group Address    Ver  Type  Port list
116   */*              -    R     Eth1/49
116   239.192.88.241   v3   D     Eth1/9 Eth1/19

Switch 2, some minutes later:
n3064-x2# sh ip igmp snooping groups vlan 116
Type: S - Static, D - Dynamic, R - Router port, F - Fabricpath core port

Vlan  Group Address    Ver  Type  Port list
116   */*              -    R     Eth1/49
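One thing I plan to check (commands from memory, so the exact syntax may differ between NX-OS releases) is whether Switch 2 actually sees the querier at all, and which port it treats as the mrouter port:

```
n3064-x2# show ip igmp snooping vlan 116      ! should report "IGMP querier present, address: 192.168.16.1"
n3064-x2# show ip igmp snooping querier vlan 116
n3064-x2# show ip igmp snooping mrouter vlan 116   ! the uplink Eth1/49 should be listed here
```

If Switch 2 does not see the querier, its snooping entries will age out after the group membership timeout, which would match the "gone after some minutes" behaviour.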
Switch 1 is configured as the IGMP querier:
Code:
n3064-x1# sh ip igmp snooping vlan 116
IGMP Snooping information for vlan 116
IGMP snooping enabled
Optimised Multicast Flood (OMF) disabled
IGMP querier present, address: 192.168.16.1, version: 3
Switch-querier enabled, address 192.168.16.1, currently running
IGMPv3 Explicit tracking enabled
IGMPv2 Fast leave disabled
IGMPv1/v2 Report suppression enabled
IGMPv3 Report suppression disabled
Link Local Groups suppression enabled
Router port detection using PIM Hellos, IGMP Queries
Number of router-ports: 0
Number of groups: 1
VLAN vPC function disabled
Active ports:
Eth1/9 Eth1/17 Eth1/49
Can someone give me a hint on where to look to track this down?
I suppose Switch 2 does not tell Switch 1 that it has a member of the multicast group 239.192.88.241, so Switch 1 stops forwarding that group's multicast traffic to Switch 2. But why?
How can I check whether IGMP queries reach Switch 2?
Why would IGMP queries not reach Switch 2?
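The only way I could think of to verify this from the node side (assuming tcpdump is installed; the interface name below is a placeholder for the VLAN 116 interface):

```shell
# On the third node, behind Switch 2; replace eth2 with the corosync interface.
# General queries from the querier (192.168.16.1) should appear roughly every
# query interval, and the node should answer with IGMPv3 membership reports
# for 239.192.88.241. If no queries arrive, they are lost before Switch 2's
# access port.
tcpdump -npi eth2 igmp
```

On the Nexus itself, something like "ethanalyzer local interface inband display-filter igmp" should show queries and reports hitting the switch CPU, if I remember the syntax correctly.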
I get the same behaviour when I run omping for a longer time, e.g. 10 minutes.
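For the longer omping runs I use something like this (node names are placeholders; -m pins the test to the group corosync actually chose instead of omping's default group):

```shell
# Run in parallel on all three nodes; -c 600 -i 1 gives a ~10 minute run.
# Watch whether multicast loss starts only for the node behind Switch 2,
# and whether it starts a few minutes in (i.e. at snooping-timeout age).
omping -c 600 -i 1 -m 239.192.88.241 node1 node2 node3
```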
Thank you in advance