Corosync 3.x: Multicast (for now) obsolete, use of Unicast (or knet) is recommended

harvie

Well-Known Member
Apr 5, 2017
I've been discussing this with corosync developers and they've told me this:
https://github.com/corosync/corosync/issues/465

TLDR:
  • Multicast was only recommended for corosync 1.x, because unicast had not been tested yet
  • For corosync 2.x, they recommend using unicast (Proxmox currently uses corosync 2.4.4-dirty)
  • There is a new corosync 3.x, which obsoletes udp/udpu and instead uses the new "kronosnet" ("knet") unicast protocol, which is even faster and more reliable, has lower latency and can use up to 8 redundant network links (rather than the 2 redundant rings of rrp used in corosync 2.x)
  • Also, it's recommended to maintain a full list of nodes in corosync.conf; with corosync 3.x it is even mandatory... (not sure if Proxmox does this by default right now... maybe it does)
So it seems to me that the "strong" recommendation for multicast mentioned in the Proxmox wiki is obsolete information that has its roots back in the corosync 1.x days... So if you ever have trouble with multicast, don't worry and go unicast. According to the corosync folks it's perfectly OK even performance-wise.

Also, I am looking forward to the corosync 3.x implementation in PVE.
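To make the "full list of nodes" point concrete, here is a minimal Python sketch that renders a corosync.conf with an explicit nodelist. This is only an illustration, not how PVE generates its config: the function name `render_corosync_conf`, the cluster name, node names and addresses are all made up, and the link limits (2 rings for 2.x, 8 links for knet) are simply the numbers quoted in this thread. A real two-ring setup on corosync 2.x needs additional totem settings (such as rrp_mode) that are not shown here.

```python
def render_corosync_conf(cluster_name, nodes, transport="udpu"):
    """Render a minimal corosync.conf with an explicit nodelist.

    nodes: list of (name, nodeid, [link addresses]) tuples (hypothetical data).
    transport: "udpu" (unicast, corosync 2.x) or "knet" (corosync 3.x).
    """
    # Link limits as quoted in this thread: 2 rrp rings on 2.x, 8 links on knet.
    max_links = 8 if transport == "knet" else 2
    lines = [
        "totem {",
        "  version: 2",
        f"  cluster_name: {cluster_name}",
        f"  transport: {transport}",
        "}",
        "",
        "nodelist {",
    ]
    for name, nodeid, addrs in nodes:
        if not 1 <= len(addrs) <= max_links:
            raise ValueError(f"{name}: {transport} allows 1..{max_links} links")
        lines.append("  node {")
        lines.append(f"    name: {name}")
        lines.append(f"    nodeid: {nodeid}")
        # One ringX_addr entry per link of this node.
        for i, addr in enumerate(addrs):
            lines.append(f"    ring{i}_addr: {addr}")
        lines.append("  }")
    lines += ["}", "", "quorum {", "  provider: corosync_votequorum", "}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example addresses only; replace with the real cluster network.
    demo_nodes = [
        ("pve1", 1, ["10.10.10.1"]),
        ("pve2", 2, ["10.10.10.2"]),
        ("pve3", 3, ["10.10.10.3"]),
    ]
    print(render_corosync_conf("demo", demo_nodes, transport="udpu"))
```

Every node appears explicitly in the nodelist, which is what unicast needs in order to know whom to send to.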
 
Multicast in 2.X is not obsoleted, from our side nor from upstream.

There is a new corosync 3.x, which obsoletes udp/udpu and instead uses the new "kronosnet" ("knet") unicast protocol, which is even faster and more reliable, has lower latency and can use up to 8 redundant network links (rather than the 2 redundant rings of rrp used in corosync 2.x)

In the next major PVE release (6.x) we'll very likely use corosync 3.0 and thus switch over to knet - but our internal testing has shown that it's not all new, shiny and good; it has some issues too. In some areas it works better than multicast, in others it does not. We're currently in contact with upstream to get this into better shape. In some edge cases we ran into behavior far from ideal with knet, e.g., cold boots of big clusters, which we do not experience with udp multicast (as long as the switch is not misconfigured and has no multicast bugs).

Also, it's recommended to maintain a full list of nodes in corosync.conf; with corosync 3.x it is even mandatory... (not sure if Proxmox does this by default right now... maybe it does)

It does, and has for a long time (IIRC since PVE 4.0 for sure; 3.4 just had another config format (XML), but it listed the nodes there too).

So it seems to me that the "strong" recommendation for multicast mentioned in the Proxmox wiki is obsolete information that has its roots back in the corosync 1.x days... So if you ever have trouble with multicast, don't worry and go unicast. According to the corosync folks it's perfectly OK even performance-wise.

IMO this needs to be looked at this way: multicast is in theory far more ideal, but it's niche, and thus quite a few switches and network setups make it problematic - that is the issue with corosync multicast. As long as the network itself has good multicast support (which sadly isn't always easy to determine), it also works well. And this was already told to you by Jan Friesse (one of the great maintainers behind corosync):
... recommending udpu, because of huge amount of support requests mostly related to bad/incorrectly configured switches

I.e., a configuration issue and sometimes a switch HW/FW issue, not a multicast issue. In fact, there are even plans to bring knet to multicast too in the future.
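Why multicast is "in theory far more ideal" can be sketched with a bit of arithmetic. The Python snippet below is a deliberately simplified model, not a measurement: it only counts how many datagrams the sending node has to put on the wire for one cluster-wide message, assuming that multicast ("udp") needs a single datagram that the switch replicates, while unicast ("udpu") needs one copy per peer. It ignores token passing, retransmits and everything else corosync does; the function name `datagrams_per_message` is made up for this example.

```python
def datagrams_per_message(node_count, transport):
    """Datagrams the sender emits for one cluster-wide message (toy model)."""
    if node_count < 2:
        raise ValueError("a cluster needs at least 2 nodes")
    if transport == "udp":
        # Multicast: one datagram, replicated by the network.
        return 1
    if transport == "udpu":
        # Unicast: one copy for every other node.
        return node_count - 1
    raise ValueError(f"unknown transport: {transport}")


if __name__ == "__main__":
    for n in (3, 7, 25):
        multicast = datagrams_per_message(n, "udp")
        unicast = datagrams_per_message(n, "udpu")
        print(f"{n} nodes: multicast={multicast}, unicast={unicast}")
```

In this model the sender's cost with unicast grows linearly with cluster size, which is negligible for a handful of nodes and starts to matter for larger clusters.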

I'm changing your thread title a bit for this reason, as IMO it does seem a bit click-baity; I hope you don't take this badly.

Also, I am looking forward to the corosync 3.x implementation in PVE.

We're on it, but it's not as straightforward as we would like and initially thought.
 
I understand the point, but to me it does seem there's absolutely nothing wrong with using unicast. At least for clusters with fewer than 1000 nodes, and maybe even then the performance gain might not be that huge... It might be more user-friendly if Proxmox just stopped pushing users to a multicast setup, which is not always easy for newbies...

I know multicast is like the holy grail of ideal Proxmox unicorn cluster communication :) But from my experience unicast is just more reliable if you don't have the time, will and skill to debug your network... Unicast is just an easy way to get a cluster of reasonable size running. On the other hand, with multicast it took me about 20 minutes after installation to unrecoverably split the cluster in two (well, I had been blindly following a guide and lowered expected votes and didn't raise them back up, because at the time I didn't know what I was doing...). So I think it would make sense to make unicast the default setup.

Maybe there could even be a checkbox for advanced users to easily switch between unicast and multicast... Or even autodetection with automatic fallback to unicast, but that's probably over-engineering :-D
 
At least for clusters with fewer than 1000 nodes, and maybe even then the performance gain might not be that huge...

Fewer than 25, you mean (and that on fast HW (2 x 10 Gbps) reserved solely for corosync traffic)? Looking at their papers, knet was mostly tested with 4 nodes, so the 64k limit has to be taken as rather theoretical.
https://github.com/kronosnet/kronosnet/issues/218

But yeah, smaller clusters may run just fine with unicast. I can look into reformulating the docs a bit to make them a bit less "opinionated" against multicast, e.g., noting that clusters with 5-7 nodes can be run with common network hardware, at least if corosync gets its own network or a guaranteed latency < 2-3 ms.
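The latency guideline can be turned into a quick sanity check. The following is only a rough Python sketch: it assumes you have already collected round-trip times in milliseconds between the cluster nodes (for example from ping output), and it uses the lower bound of the "2-3 ms" figure as the default limit. The function name `latency_within_limit` and the sample values are made up.

```python
def latency_within_limit(rtt_ms_samples, limit_ms=2.0):
    """Return True if the worst observed round-trip time stays below the limit."""
    if not rtt_ms_samples:
        raise ValueError("need at least one RTT sample")
    # The worst case matters here, not the average: one slow link can
    # already delay cluster communication.
    return max(rtt_ms_samples) < limit_ms


if __name__ == "__main__":
    # Hypothetical RTT samples in milliseconds.
    samples = [0.21, 0.35, 0.19, 1.40]
    print("latency OK for corosync:", latency_within_limit(samples))
```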

well, I had been blindly following a guide and lowered expected votes and didn't raise them back up, because at the time I didn't know what I was doing...).

You can do that right from the start, even through our WebUI (at least since PVE 5.1), which does all the setup for you and also runs a few basic checks, catching some easy-to-make mistakes (catching all of them would be better, but that is not easy to check for automatically).
Also, which guide did you follow? https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_rrp_on_existing_clusters does not require lowering expected votes... Also, if there are more votes than expected, corosync automatically increases the expected votes, so this should be a non-issue to begin with.
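For readers wondering why a lowered expected-votes value is risky, here is a small Python sketch of the majority arithmetic. It is a simplified model, assuming one vote per node and the plain majority rule (more than half of the expected votes); it includes the automatic increase described above, but ignores special votequorum options. The function names are made up for this example.

```python
def quorum_threshold(expected_votes):
    """Votes needed for a majority of the expected votes."""
    return expected_votes // 2 + 1


def is_quorate(votes_present, expected_votes):
    """Toy model of the quorum decision for one partition of the cluster."""
    # If more votes are present than expected, expected votes are raised.
    effective_expected = max(expected_votes, votes_present)
    return votes_present >= quorum_threshold(effective_expected)


if __name__ == "__main__":
    # A 4-node cluster split into two halves of 2 nodes each.
    print("half with expected=4:", is_quorate(2, 4))
    # Someone lowers expected votes to 2 on an isolated half.
    print("half with expected=2:", is_quorate(2, 2))
```

In this model an isolated half of a 4-node cluster is not quorate with the correct expected votes, but becomes quorate once the value is lowered to 2. If that happens on both halves, both consider themselves the valid cluster, which is how a lowered value can turn into a split.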

I know multicast is like the holy grail of ideal Proxmox unicorn cluster communication :) But from my experience unicast is just more reliable if you don't have the time

IMO, 20 minutes of reading docs and maybe trying things out in a test system first - the literal very first time one is doing something with clustering (a very complex topic) and/or PVE - is not very much to ask.
It can be a bit much if you just want to tinker and run, but not if you want to start running a production system.

Maybe there could even be a checkbox for advanced users to easily switch between unicast and multicast...

Switching is not easy to do automatically (in a highly error-free way) without knowing the network topology (which we cannot easily find out). We try to avoid such complicated things, as they are an easy way to break things.
We'd rather go the documentation way for these things, so that each admin can adapt it to their own setup, test it out (e.g., in a virtual, nested cluster) and then roll it out.
 
Multicast in 2.X is not obsoleted, from our side nor from upstream.

Is this statement still valid in 2024?

But yeah, smaller clusters may run just well with unicast, I can look into re-formulating the docs a bit to make them a bit less "opinionated" against multicast

It is kind of sad to see the categorical statement on top of:
https://pve.proxmox.com/wiki/Multicast_notes

In the next major PVE release (6.x) we'll very likely use corosync 3.0 and thus switch over to knet - but our internal testing has shown that it's not all new, shiny and good; it has some issues too. In some areas it works better than multicast, in others it does not. We're currently in contact with upstream to get this into better shape. In some edge cases we ran into behavior far from ideal with knet, e.g., cold boots of big clusters, which we do not experience with udp multicast (as long as the switch is not misconfigured and has no multicast bugs).

I tried to look for some of this in the archives, but to no avail. Did I miss something (major), or does it still hold true that, e.g., a cold boot of a large cluster is better off with multicast, everything else being the same?

Is there any plan to support out of band state transfers in PVE?
 
