Enabling Multicast querier on bridges is unstable

lifeboy

Renowned Member
I have a VLAN provided by Juniper switches hosted at Hetzner (ZA). They have IGMP snooping enabled, so I need to have a multicast querier running. However, I run into a problem when installing Proxmox 5.2. (Note: this may be similar under previous versions, but I have my own servers and switches in the other clusters, so IGMP snooping doesn't apply there.)

When I check if multicast is available with "corosync-cmapctl -g totem.interface.0.mcastaddr", I get an error, since corosync is not configured yet.
Code:
root@yster5:~# corosync-cmapctl -g totem.interface.0.mcastaddr
Failed to initialize the cmap API. Error CS_ERR_LIBRARY

Also
Code:
root@yster5:~# omping -c 600 -i 1 -q yster4 yster5
yster4 : waiting for response msg
yster4 : joined (S,G) = (*, 232.43.211.234), pinging
yster4 : waiting for response msg
yster4 : server told us to stop

yster4 :   unicast, xmt/rcv/%loss = 31/31/0%, min/avg/max/std-dev = 0.086/0.169/0.218/0.034
yster4 : multicast, xmt/rcv/%loss = 31/0/100%, min/avg/max/std-dev = 0.000/0.000/0.000/0.000

So multicast packets are not being passed by the multicast querier.

However, as soon as I configure corosync on the first node with "pvecm create <clustername>", I can run omping and corosync-cmapctl successfully.
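For clarity, that sequence is roughly the following (a sketch; the cluster name is a placeholder):

Code:
root@yster4:~# pvecm create testcluster
root@yster4:~# corosync-cmapctl -g totem.interface.0.mcastaddr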

The problem is that I can't add the 2nd or 3rd node to the initial node without first somehow configuring corosync. If I add the node anyway with "pvecm add yster4" from yster5 (with yster4 being the first node), it can't reach quorum, because somehow yster4 loses its ability to receive the multicast packets. Leaving yster5 running, if I reboot yster4, then as soon as yster4 is back, yster5 thinks it has achieved quorum, but it actually hasn't. Now multicast packets are not received by yster4, and omping shows all packets are lost.

From yster4:
Code:
root@yster4:~# omping -c 600 -i 1 -q yster4 yster5
yster5 : waiting for response msg
yster5 : waiting for response msg
yster5 : waiting for response msg
yster5 : joined (S,G) = (*, 232.43.211.234), pinging
^C
yster5 :   unicast, xmt/rcv/%loss = 311/311/0%, min/avg/max/std-dev = 0.075/0.167/0.235/0.039
yster5 : multicast, xmt/rcv/%loss = 311/0/100%, min/avg/max/std-dev = 0.000/0.000/0.000/0.000

and yster5:
Code:
root@yster5:~# omping -c 600 -i 1 -q yster4 yster5
yster4 : waiting for response msg
yster4 : joined (S,G) = (*, 232.43.211.234), pinging
yster4 : waiting for response msg
yster4 : server told us to stop

yster4 :   unicast, xmt/rcv/%loss = 312/312/0%, min/avg/max/std-dev = 0.072/0.167/0.218/0.034
yster4 : multicast, xmt/rcv/%loss = 312/0/100%, min/avg/max/std-dev = 0.000/0.000/0.000/0.000

Is there a way to get past this and "activate" (for lack of a better term) multicast querier properly?

For now I have changed corosync.conf to instruct totem to use udpu as the transport.
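The relevant part of the totem section now looks something like this (a sketch; cluster name and bind address are placeholders for my actual values):

Code:
totem {
  cluster_name: mycluster
  config_version: 2
  version: 2
  transport: udpu
  interface {
    ringnumber: 0
    bindnetaddr: 10.0.0.0
  }
}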

My network config shows multicast querier as active.
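I enable the querier at ifup time from /etc/network/interfaces. A sketch of the relevant stanza, assuming vmbr0 bridges eth0 and with placeholder addresses:

Code:
auto vmbr0
iface vmbr0 inet static
        address  x.x.x.x
        netmask  255.255.255.0
        gateway  x.x.x.x
        bridge_ports eth0
        bridge_stp off
        bridge_fd 0
        # enable the bridge's IGMP querier when the interface comes up
        post-up echo 1 > /sys/class/net/vmbr0/bridge/multicast_querier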

HOWEVER, if I disable snooping, then things change.

Code:
root@yster5:~# cat /sys/devices/virtual/net/vmbr0/bridge/multicast_querier
1
root@yster4:~# cat /sys/class/net/vmbr0/bridge/multicast_snooping
0
root@yster4:~# omping -c 100 -i 1 -q yster4 yster5
yster5 : waiting for response msg
yster5 : waiting for response msg
yster5 : joined (S,G) = (*, 232.43.211.234), pinging
yster5 : given amount of query messages was sent

yster5 :   unicast, xmt/rcv/%loss = 100/100/0%, min/avg/max/std-dev = 0.069/0.171/0.217/0.036
yster5 : multicast, xmt/rcv/%loss = 100/100/0%, min/avg/max/std-dev = 0.077/0.191/0.266/0.044

Code:
root@yster5:~# cat /sys/class/net/vmbr0/bridge/multicast_snooping
0
root@yster5:~# omping -c 100 -i 1 -q yster4 yster5
yster4 : waiting for response msg
yster4 : joined (S,G) = (*, 232.43.211.234), pinging
yster4 : given amount of query messages was sent

yster4 :   unicast, xmt/rcv/%loss = 100/100/0%, min/avg/max/std-dev = 0.087/0.170/0.212/0.032
yster4 : multicast, xmt/rcv/%loss = 100/99/1% (seq>=2 0%), min/avg/max/std-dev = 0.092/0.192/0.261/0.040
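To make the snooping change survive a reboot, the same post-up mechanism should work (a sketch, assuming vmbr0 is the cluster bridge):

Code:
# added to the vmbr0 stanza in /etc/network/interfaces
post-up echo 0 > /sys/class/net/vmbr0/bridge/multicast_snooping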

According to this thread, shouldn't this problem have been fixed? https://unix.stackexchange.com/ques...cast-snooping-and-why-does-it-break-upnp-dlna
 
Hi,

I really don't recommend enabling the querier on the Proxmox bridge, because when you reboot a node, an election is held between all the other nodes where the querier is enabled.

Instead, enable the querier and snooping on your physical switches.

Personally, I don't set up the Proxmox IP on vmbr, but on a VLAN-tagged eth.x interface. (That way the Proxmox multicast traffic is not flooded to the vmbr ports.) See the sketch below.
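Something along these lines in /etc/network/interfaces (a sketch only; interface name, VLAN ID, and addresses are placeholders):

Code:
# host/corosync IP on a tagged VLAN interface instead of the bridge
auto eth0.100
iface eth0.100 inet static
        address  10.0.0.4
        netmask  255.255.255.0

# the guest bridge carries no host IP
auto vmbr0
iface vmbr0 inet manual
        bridge_ports eth0
        bridge_stp off
        bridge_fd 0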
 
I have a choice at this stage: either I run the querier on the bridge or the NIC, or I go unicast for now and sort it out with our provider.

Here's what's happening now:

On yster4

Code:
root@yster4:~# omping -c 10000 -i 0.001 -F -q yster4 yster5 yster6
yster5 : waiting for response msg
yster6 : waiting for response msg
yster5 : joined (S,G) = (*, 232.43.211.234), pinging
yster6 : joined (S,G) = (*, 232.43.211.234), pinging
yster5 : waiting for response msg
yster5 : server told us to stop
yster6 : given amount of query messages was sent

yster5 :   unicast, xmt/rcv/%loss = 9690/9690/0%, min/avg/max/std-dev = 0.058/0.116/0.222/0.028
yster5 : multicast, xmt/rcv/%loss = 9690/9506/1% (seq>=185 0%), min/avg/max/std-dev = 0.060/0.126/0.237/0.031
yster6 :   unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.054/0.112/0.229/0.027
yster6 : multicast, xmt/rcv/%loss = 10000/9813/1% (seq>=188 0%), min/avg/max/std-dev = 0.058/0.123/0.235/0.030

On yster5

Code:
root@yster5:~# omping -c 10000 -i 0.001 -F -q yster4 yster5 yster6
yster4 : waiting for response msg
yster6 : waiting for response msg
yster6 : joined (S,G) = (*, 232.43.211.234), pinging
yster4 : waiting for response msg
yster4 : joined (S,G) = (*, 232.43.211.234), pinging
yster6 : given amount of query messages was sent
yster4 : given amount of query messages was sent

yster4 :   unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.058/0.117/0.296/0.028
yster4 : multicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.060/0.128/0.299/0.031
yster6 :   unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.052/0.114/0.293/0.029
yster6 : multicast, xmt/rcv/%loss = 10000/9942/0% (seq>=59 0%), min/avg/max/std-dev = 0.054/0.126/0.296/0.033

on yster6

Code:
root@yster6:~# omping -c 10000 -i 0.001 -F -q yster4 yster5 yster6
yster4 : waiting for response msg
yster5 : waiting for response msg
yster4 : waiting for response msg
yster5 : waiting for response msg
yster5 : joined (S,G) = (*, 232.43.211.234), pinging
yster4 : waiting for response msg
yster4 : joined (S,G) = (*, 232.43.211.234), pinging
yster5 : given amount of query messages was sent
yster4 : waiting for response msg
yster4 : server told us to stop

yster4 :   unicast, xmt/rcv/%loss = 9417/9417/0%, min/avg/max/std-dev = 0.058/0.113/0.228/0.029
yster4 : multicast, xmt/rcv/%loss = 9417/9417/0%, min/avg/max/std-dev = 0.060/0.124/0.247/0.031
yster5 :   unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.055/0.115/0.444/0.027
yster5 : multicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.057/0.125/0.449/0.030

Multicast is working, yet corosync doesn't succeed in making the connection.
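The next step will be to look at the corosync state and logs on each node (standard commands; nothing here is specific to my setup):

Code:
root@yster4:~# pvecm status
root@yster4:~# systemctl status corosync
root@yster4:~# journalctl -u corosync -b --no-pager | tail -n 50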
 
