Should redundant corosync links be on separate VLANs?

nCursed

Feb 10, 2025
After reading through PVE CM Redundancy, I am left wondering whether there is any reason not to use the same VLAN for the two physically separate corosync links. The example shows what are presumably separate subnets, but I see no explicit statement on whether such logical separation is critical for a well-functioning redundant corosync cluster. The docs only say that the links should be physically separate.
 
As long as the VLAN only carries corosync traffic, and at least one interface is dedicated to corosync (as link0), you should be good.

I have set up clusters with the dedicated interface as link0 (tagged to the corosync VLAN) and the fallback (link1) on the bonded trunk interface that carries VM traffic. You don't want any latency or jitter on the corosync link, hence the need for a dedicated physical interface.
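
For anyone who wants to see what such a layout looks like, here is a rough sketch of the relevant corosync.conf parts. The node names, addresses and priority values are made up for illustration and are not taken from a real cluster. As far as I know, in passive link mode the link with the higher knet_link_priority value is preferred, so check the PVE and corosync docs before copying this.

```
nodelist {
  node {
    name: pve1            # hypothetical node name
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.10.10.1   # link0: dedicated corosync NIC (assumed subnet)
    ring1_addr: 10.20.20.1   # link1: fallback on the bonded trunk (assumed subnet)
  }
  node {
    name: pve2            # hypothetical node name
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.10.10.2
    ring1_addr: 10.20.20.2
  }
}

totem {
  link_mode: passive
  interface {
    linknumber: 0
    knet_link_priority: 20   # assumed value: prefer the dedicated link
  }
  interface {
    linknumber: 1
    knet_link_priority: 10   # assumed value: fallback only
  }
}
```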
 
Thank you for replying!

Not sure if I understand your answer correctly, or perhaps I wasn't clear in my question.

I understand that corosync should be separate from all other traffic. What I am wondering is whether it is important that the two corosync links are also separated logically (e.g. corosync-vlan1 and corosync-vlan2), or whether connecting the two physically separate links to the exact same VLAN is permitted or even good practice.

I couldn't find an answer to this in the docs. They only stress the importance of the two links being physically separate, and the example shows what are presumably two different logical networks.

Thanks again
 
Sorry for necroposting but I want to add some considerations for anyone stumbling onto this thread in the future:

Physical redundancy/separation (separate NICs, cables, switches etc.) is probably the most important part of a redundant cluster setup.

However, logical separation (mainly VLANs at layer 2 and subnets at layer 3, as was mentioned here) is likely also a good idea. Apart from improving organisation and potentially making administration and troubleshooting easier, separate VLANs/subnets for each cluster link can mitigate the risk of some L2/L3 failure scenarios, for example loops or other switch-level issues that would otherwise affect the entire logical network and thus the whole cluster. This is not a comment on how likely such scenarios are, just a thought to consider.

Bottom line: if you want proper redundancy, make sure it is implemented at all levels, not just at the physical level. Since every PVE deployment is different, you have to weigh the pros and cons of each additional layer of redundancy against each other. Omitting a layer always comes with risk, and how big that risk is has to be assessed individually, ideally before such a decision is made.
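
If you want to sanity-check an existing plan for layer 3 separation, something like the following Python sketch could help. It is only an illustration: the node names and addresses are hypothetical, and it only compares subnets, so it says nothing about VLAN assignment or physical cabling.

```python
import ipaddress

# Hypothetical per-node link addresses in CIDR notation (not from a real cluster).
nodes = {
    "pve1": {"link0": "10.10.10.1/24", "link1": "10.20.20.1/24"},
    "pve2": {"link0": "10.10.10.2/24", "link1": "10.20.20.2/24"},
    "pve3": {"link0": "10.10.10.3/24", "link1": "10.10.10.13/24"},
}


def links_share_subnet(links):
    """Return True if any two links of one node sit in overlapping subnets."""
    nets = [ipaddress.ip_interface(addr).network for addr in links.values()]
    return any(a.overlaps(b) for i, a in enumerate(nets) for b in nets[i + 1:])


def nodes_without_l3_separation(nodes):
    """Return the sorted names of nodes whose links are not in distinct subnets."""
    return sorted(name for name, links in nodes.items() if links_share_subnet(links))


print(nodes_without_l3_separation(nodes))  # prints ['pve3']
```

Here pve3 is flagged because both of its links sit in 10.10.10.0/24, which is the "same logical network" case discussed in this thread.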

And I agree, this topic (and cluster redundancy as a whole) could be expanded upon a bit more in the documentation, especially given how important redundant cluster networks are for proper deployments.