Which network should be used for Corosync?

Dec 28, 2019
Hi,

Currently, our Proxmox Corosync runs on a public IP. Is this best practice? A few articles mention a 'RING' when discussing Corosync.


# corosync-cfgtool -s
Printing link status.
Local node ID 5
LINK ID 0
addr = 117.xxx.x.x
status:
nodeid 1: link enabled:1 link connected:1
nodeid 2: link enabled:1 link connected:1
nodeid 3: link enabled:1 link connected:1
nodeid 4: link enabled:1 link connected:1
nodeid 5: link enabled:1 link connected:1
nodeid 6: link enabled:1 link connected:1

Below is the output shown in an online article (which does print a RING):


[root@pcmk-1 ~]# corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
id = 192.168.122.101
status = ring 0 active with no faults


Does our configuration seem risky? Why does the command run on our Proxmox installation not show a RING ID? And what does the term RING actually mean here?

Thanks in advance
 
Best practice is to have Corosync on a dedicated physical network that is just used for Corosync.
A 1GBit network is more than fast enough. Corosync doesn't need a lot of bandwidth but it really needs low latency.

If you have Corosync running on a network with other traffic, especially anything storage-related like NFS, Ceph, Backup, iSCSI,... you can easily run into the situation that the other traffic is congesting the network. This, in turn, increases the latency for the corosync packets and in a worst-case scenario the cluster will "fall apart" until the corosync services on each node can reach the others in a timely manner again.
Should you have any HA guests active on your nodes, those nodes will fence themselves after 2 minutes of not being part of the quorate cluster. If the whole cluster fell apart, this means that every node with HA guests on it will fence itself.

Additional rings, or links (it's the same in the context of corosync) increase the redundancy. Corosync will switch to another link if the main one cannot be used anymore. Corosync 3 (used since PVE 6.x) supports up to 8 links.
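
For illustration, redundant links can be set up when creating the cluster, or per node in /etc/pve/corosync.conf. A minimal sketch, assuming two separate networks with made-up example addresses 10.10.10.1 and 10.20.20.1:

# at cluster creation time
pvecm create CLUSTERNAME --link0 10.10.10.1 --link1 10.20.20.1

# or per node in the nodelist section of /etc/pve/corosync.conf
node {
  name: node1
  nodeid: 1
  quorum_votes: 1
  ring0_addr: 10.10.10.1
  ring1_addr: 10.20.20.1
}

The keys are still called ring0_addr/ring1_addr for backward compatibility, even though corosync 3 speaks of links.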
 
Hi,

all corosync traffic is encrypted with an authkey known only to the cluster members (it is exchanged on join), so from a security standpoint it doesn't really matter where it runs. However, public networks can be DDoSed, so from a reliability and availability standpoint it can be better to run it on a private network/LAN.

"Ring" or nowadays often also called "link" are a way of corosync to use more than one network for communicating, this allows to fallback to another if one network fails.

In general, the most important thing is that the network corosync runs on isn't used by I/O traffic, as this can disrupt corosync easily. While corosync doesn't use much bandwidth, it really is sensitive to latency (spikes).

See also: https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_cluster_network (and the rest of that chapter)
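
For illustration: the shared key lives at /etc/corosync/authkey on each node (readable by root only), so you can check it is in place with:

ls -l /etc/corosync/authkey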
 
Dear Mr. Lamprecht,

I found this thread while researching the following problem/question:

I would like to build a Proxmox cluster based on dedicated root servers from Hetzner. The nodes can be connected via Hetzner's vSwitch technology (fully virtual layer-2 connections). Since vSwitch traffic is not encrypted by default, I would like to know whether ALL cluster traffic is already secured by Proxmox's own encryption.
As you mentioned above, corosync traffic is encrypted by default. So is it "safe" to run the Proxmox cluster network without any additional encryption (over an unsecured network)?
Unfortunately, I can find no further information on this subject (cluster network encryption).

Best regards,
mscd
 
all corosync traffic is encrypted with an authkey known only to the cluster members (it is exchanged on join), so from a security standpoint it doesn't really matter where it runs.
Hey Thomas,

Similar question to MSCD:

We're looking at the future viability of using dark fibre to connect an 8-node cluster across two cities (~378 km). We have calculated that the latency should be low enough, and we will have authority over the fibre link, with the exception that some points will be switched and relayed by the ISP. We want to know if there is written documentation about the corosync traffic being encrypted; nothing in the documentation specifies that it is encrypted or how it is encrypted: https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_cluster_network

Thanks, let me know if I missed something, if not could someone on the team please update the documentation?

Tmanok
 
You should be careful if you want to stretch one cluster between two cities. If you have a fibre cut between the two cities, you'll lose quorum in one of them.
You need a quorum node somewhere in a third city, with a different network link to each city (see the numbers below).


Not sure the latency will be OK at 380 km (I think it should be around 5-6 ms?). Seems a bit high.


In my opinion, you should build two different clusters, one in each city.
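
To put numbers on the split-brain risk (standard quorum rule, one vote per node):

quorum = floor(votes / 2) + 1

8 nodes split 4/4 by a fibre cut: quorum is 5, each side only has 4 votes, so both sides go inquorate and HA nodes fence themselves.
With a 9th vote in a third city, the side that can still reach the tiebreaker holds 5 of 9 votes and stays quorate.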
 
Not sure the latency will be OK at 380 km (I think it should be around 5-6 ms?). Seems a bit high.
Hi Spirit,

Having a witness node in another DC is a good idea. That being said, the latency on the fibre should be about 3.8 ms (roughly 1 ms per 100 km round trip), given it would be clear of all other traffic. The recommendation in the wiki is around 5 ms and below 10 ms, so if we can make it work I will be happy.
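
For reference, the rough math behind that figure (assuming light travels in fibre at about 200,000 km/s, i.e. ~5 µs per km):

one way: 378 km * 5 µs/km ≈ 1.9 ms
round trip: 2 * 1.9 ms ≈ 3.8 ms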

We are looking at two fibre routes, so that if one is cut, the other runs through a completely different location. The latency may vary a bit, roughly 3 ms to 5 ms depending on the route. We want instant failover, so we are looking hard at the fibre options. The key now is security.

Thanks. Our current plan is four clusters with four nodes each, two clusters at each site, but cross-cluster replication does not exist as a feature yet and we want to meet SLAs.


Tmanok
 
I have a customer running a cluster with 3 nodes in each of 2 different DCs at 1 ms, plus an extra 7th node for quorum at 10 ms. It's working. I'm not sure about bigger clusters.

Note that for storage replication, if you really want instant failover you'll need synchronous replication, so don't expect a lot of IOPS at 4 ms if your application doesn't do much parallelism, e.g. sequential writes such as a database journal (1 ms = 1000 IOPS, 4 ms = 250 IOPS).
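
The rule of thumb behind those numbers: with synchronous replication a single write stream waits one network round trip per write, so

max IOPS per stream ≈ 1000 / RTT in ms
1 ms -> 1000 IOPS, 4 ms -> 250 IOPS, 10 ms -> 100 IOPS

Parallel streams can push the total higher, but anything strictly serialized (like a database journal) is capped by this.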
 
The replication (PVE-based, asynchronous) will take place over VPLS links separate from the leased fibre cluster links, but those are very good estimates for storage IOPS per latency. Unfortunately, our VPLS links are currently 8-13 ms (62.5 IOPS by your estimate). We are planning to reduce core network latency to improve this, but you've made a good point about how cautious we need to be about replication. Thank you for those insights; I honestly overlooked IOPS per ms of ping. With, say, 20 to 40 core VMs replicating, we will want frequent replication; I don't have a good estimate right now for how many IOPS that would take. Note DBs will replicate separately at the application layer. I would be interested to hear what you use for synchronous replication (PVE, Ceph, or SAN replication).


Back to the main point: Thomas, please have someone update those docs, or let me know whether corosync encryption is written down in a technical document.

Thank you,


Tmanok
 
FYI, for even node counts you could add a QDevice, which doesn't participate as a full cluster node, but only as a vote arbiter, and thus doesn't have the same latency requirements.
https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_corosync_external_vote_support
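
A minimal sketch of the setup, assuming an external host at 10.10.10.9 (a made-up example address) that all cluster nodes can reach:

# on the external host
apt install corosync-qnetd

# on all cluster nodes
apt install corosync-qdevice

# then, once, on one cluster node
pvecm qdevice setup 10.10.10.9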

Back to the main point: Thomas, please have someone update those docs, or let me know whether corosync encryption is written down in a technical document.
Well what do you need to know exactly?

Basically we set the corosync.conf secauth option to on, which implies AES 256 encryption:
https://manpages.debian.org/bullseye/corosync/corosync.conf.5.en.html#secauth
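
For reference, the totem section of a PVE-generated /etc/pve/corosync.conf looks roughly like this (a sketch; exact contents vary with version and cluster name):

totem {
  cluster_name: mycluster
  config_version: 2
  secauth: on
  version: 2
}

With secauth on, all corosync traffic is both encrypted and authenticated; the linked man page has the exact cipher/hash details.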
 
