You need two corosync links. For 12 nodes on gigabit I would use dedicated links for both, just in case, even if a dedicated link for Link0 alone would be enough. The most I have in production with gigabit corosync is 8 hosts, with no problems at all.
The number of cluster members is an inexact limit. The actual limit depends on how much data the cluster members have to keep synchronized. If each of your cluster members had 400 VMs with continuous API traffic, your cluster would probably die from tripped timeouts. If you have 5 VMs that mostly sit idle, you'll be just fine.

My corosync network runs on a dedicated 1000M network card. Can it support a PVE cluster with 12 nodes?
A status update on this: For bigger clusters than that, fine-tuning might be necessary. We are currently working on guidance on how to work with bigger clusters. For the time being I would recommend splitting this into smaller clusters.
/etc/pve/corosync.conf [3] and add inside the totem section the option token_coefficient, for example token_coefficient: 125, which will lower the token coefficient to 125 ms. Of course you can also set other values (as noted above, token_coefficient: 325 should be enough for clusters in the range of 30-40 nodes) or go back to the default. Don't forget to increase config_version [3]. You need to restart corosync on each node for the change to fully take effect. EDIT 2026-01-08: If you're using HA, you may want to disarm HA [1] before the config change, and re-enable it afterwards.

Hi, did anybody test a custom (lower) token coefficient and want to share their observations?
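To make the effect of token_coefficient more concrete, here is a minimal Python sketch of how the effective token timeout scales with node count. It follows the formula given in the corosync.conf(5) man page: token + (number_of_nodes - 2) * token_coefficient, applied only when the nodelist has at least 3 nodes, with consensus defaulting to 1.2 * token timeout. The defaults used here (token 3000 ms, token_coefficient 650 ms) are my assumption for corosync 3.1 and later, so check your own version. The function names are made up for illustration. As a sanity check, the log posted in this thread reports "token timed out (3125ms), waiting 3750ms for consensus", which matches 3 nodes with token_coefficient: 125.

```python
# Sketch only: defaults are assumed (corosync 3.1+), names are hypothetical.
ASSUMED_DEFAULT_TOKEN_MS = 3000
ASSUMED_DEFAULT_COEFFICIENT_MS = 650


def effective_token_timeout_ms(
    nodes: int,
    token_ms: int = ASSUMED_DEFAULT_TOKEN_MS,
    token_coefficient_ms: int = ASSUMED_DEFAULT_COEFFICIENT_MS,
) -> int:
    """Return token + (nodes - 2) * token_coefficient for 3 or more nodes."""
    if nodes < 3:
        # The coefficient is only applied with at least 3 nodes in the nodelist.
        return token_ms
    return token_ms + (nodes - 2) * token_coefficient_ms


def consensus_timeout_ms(token_timeout_ms: int) -> int:
    """Default consensus timeout, 1.2 * token timeout (integer arithmetic)."""
    return token_timeout_ms * 6 // 5


if __name__ == "__main__":
    for nodes, coeff in [(3, 125), (12, 650), (40, 650), (40, 325)]:
        token = effective_token_timeout_ms(nodes, token_coefficient_ms=coeff)
        print(nodes, coeff, token, consensus_timeout_ms(token))
```

With these assumed defaults, 12 nodes give a 9500 ms token timeout, and 40 nodes drop from 27700 ms to 15350 ms when the coefficient goes from 650 to 325. For a 3-node cluster the coefficient contributes only a single multiple, so changing it barely moves the timeout.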
May 12 18:21:12 pve02 corosync[3419]: [KNET ] link: host: 3 link: 0 is down
May 12 18:21:15 pve02 corosync[3419]: [KNET ] link: host: 1 link: 0 is down
May 12 18:21:17 pve02 corosync[3419]: [KNET ] link: host: 1 link: 1 is down
May 12 18:21:17 pve02 corosync[3419]: [KNET ] host: host: 3 (passive) best link: 1 (pri: 1)
May 12 18:21:18 pve02 corosync[3419]: [TOTEM ] Token has not been received in 2392 ms
May 12 18:21:20 pve02 corosync[3419]: [TOTEM ] A processor failed, forming new configuration: token timed out (3125ms), waiting 3750ms for consensus.
May 12 18:21:22 pve02 corosync[3419]: [KNET ] link: host: 3 link: 1 is down
May 12 18:21:24 pve02 corosync[3419]: [TOTEM ] Process pause detected for 1744 ms, flushing membership messages.
May 12 18:21:26 pve02 corosync[3419]: [TOTEM ] Token has not been received in 9567 ms
May 12 18:21:35 pve02 corosync[3419]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 18:21:35 pve02 corosync[3419]: [KNET ] host: host: 1 has no active links
May 12 18:21:35 pve02 corosync[3419]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 18:21:35 pve02 corosync[3419]: [KNET ] host: host: 1 has no active links
May 12 18:21:35 pve02 corosync[3419]: [KNET ] host: host: 3 (passive) best link: 1 (pri: 1)
May 12 18:21:35 pve02 corosync[3419]: [KNET ] host: host: 3 has no active links
May 12 18:21:35 pve02 corosync[3419]: [KNET ] host: host: 3 (passive) best link: 1 (pri: 1)
May 12 18:21:35 pve02 corosync[3419]: [KNET ] host: host: 3 has no active links
May 12 18:21:35 pve02 corosync[3419]: [QUORUM] Sync members[1]: 2
May 12 18:21:35 pve02 corosync[3419]: [QUORUM] Sync left[2]: 1 3
May 12 18:21:35 pve02 corosync[3419]: [TOTEM ] A new membership (2.103) was formed. Members left: 1 3
May 12 18:21:35 pve02 corosync[3419]: [TOTEM ] Failed to receive the leave message. failed: 1 3
May 12 18:21:35 pve02 corosync[3419]: [QUORUM] This node is within the non-primary component and will NOT provide any services.
May 12 18:21:35 pve02 corosync[3419]: [QUORUM] Members[1]: 2
May 12 18:21:35 pve02 corosync[3419]: [MAIN ] Completed service synchronization, ready to provide service.
May 12 18:21:35 pve02 corosync[3419]: [KNET ] rx: host: 1 link: 1 is up
May 12 18:21:35 pve02 corosync[3419]: [KNET ] link: Resetting MTU for link 1 because host 1 joined
May 12 18:21:35 pve02 corosync[3419]: [KNET ] host: host: 1 (passive) best link: 1 (pri: 1)
May 12 18:21:35 pve02 corosync[3419]: [KNET ] pmtud: Global data MTU changed to: 1397
May 12 18:21:35 pve02 corosync[3419]: [KNET ] rx: host: 3 link: 0 is up
May 12 18:21:35 pve02 corosync[3419]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
May 12 18:21:35 pve02 corosync[3419]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
May 12 18:21:35 pve02 corosync[3419]: [KNET ] rx: host: 3 link: 1 is up
May 12 18:21:35 pve02 corosync[3419]: [KNET ] link: Resetting MTU for link 1 because host 3 joined
May 12 18:21:35 pve02 corosync[3419]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
May 12 18:21:36 pve02 corosync[3419]: [KNET ] rx: host: 1 link: 0 is up
May 12 18:21:36 pve02 corosync[3419]: [KNET ] link: Resetting MTU for link 0 because host 1 joined
May 12 18:21:36 pve02 corosync[3419]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 18:21:36 pve02 corosync[3419]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
May 12 18:21:36 pve02 corosync[3419]: [QUORUM] Sync members[3]: 1 2 3
May 12 18:21:36 pve02 corosync[3419]: [QUORUM] Sync joined[2]: 1 3
May 12 18:21:36 pve02 corosync[3419]: [TOTEM ] A new membership (1.10b) was formed. Members joined: 1 3
May 12 18:21:36 pve02 corosync[3419]: [QUORUM] This node is within the primary component and will provide service.
May 12 18:21:36 pve02 corosync[3419]: [QUORUM] Members[3]: 1 2 3
May 12 18:21:36 pve02 corosync[3419]: [MAIN ] Completed service synchronization, ready to provide service.
May 12 18:22:06 pve02 corosync[3419]: [KNET ] pmtud: Global data MTU changed to: 1397
May 12 18:22:25 pve02 corosync[3419]: [TOTEM ] Token has not been received in 2354 ms
May 12 18:23:09 pve02 corosync[3419]: [KNET ] link: host: 3 link: 1 is down
May 12 18:23:09 pve02 corosync[3419]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
May 12 18:23:09 pve02 corosync[3419]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
May 12 18:23:10 pve02 corosync[3419]: [KNET ] rx: host: 3 link: 1 is up
May 12 18:23:10 pve02 corosync[3419]: [KNET ] link: Resetting MTU for link 1 because host 3 joined
May 12 18:23:10 pve02 corosync[3419]: [KNET ] pmtud: Global data MTU changed to: 1397
May 12 18:23:10 pve02 corosync[3419]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
May 12 18:23:10 pve02 corosync[3419]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
May 12 18:23:48 pve02 corosync[3419]: [KNET ] link: host: 3 link: 0 is down
May 12 18:23:49 pve02 corosync[3419]: [KNET ] link: host: 3 link: 1 is down
May 12 18:23:50 pve02 corosync[3419]: [KNET ] link: host: 1 link: 0 is down
May 12 18:23:51 pve02 corosync[3419]: [KNET ] link: host: 1 link: 1 is down
May 12 18:23:51 pve02 corosync[3419]: [TOTEM ] Token has not been received in 2443 ms
May 12 18:23:51 pve02 corosync[3419]: [TOTEM ] A processor failed, forming new configuration: token timed out (3125ms), waiting 3750ms for consensus.
May 12 18:23:55 pve02 corosync[3419]: [KNET ] rx: host: 1 link: 1 is up
May 12 18:23:55 pve02 corosync[3419]: [KNET ] link: Resetting MTU for link 1 because host 1 joined
May 12 18:23:55 pve02 corosync[3419]: [QUORUM] Sync members[1]: 2
May 12 18:23:55 pve02 corosync[3419]: [QUORUM] Sync left[2]: 1 3
May 12 18:23:55 pve02 corosync[3419]: [TOTEM ] A new membership (2.10f) was formed. Members left: 1 3
May 12 18:23:55 pve02 corosync[3419]: [TOTEM ] Failed to receive the leave message. failed: 1 3
May 12 18:23:55 pve02 corosync[3419]: [QUORUM] This node is within the non-primary component and will NOT provide any services.
May 12 18:23:55 pve02 corosync[3419]: [QUORUM] Members[1]: 2
May 12 18:23:55 pve02 corosync[3419]: [MAIN ] Completed service synchronization, ready to provide service.
May 12 18:23:55 pve02 corosync[3419]: [KNET ] rx: host: 3 link: 0 is up
May 12 18:23:55 pve02 corosync[3419]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
May 12 18:23:55 pve02 corosync[3419]: [KNET ] rx: host: 1 link: 0 is up
May 12 18:23:55 pve02 corosync[3419]: [KNET ] link: Resetting MTU for link 0 because host 1 joined
May 12 18:23:55 pve02 corosync[3419]: [KNET ] rx: host: 3 link: 1 is up
May 12 18:23:55 pve02 corosync[3419]: [KNET ] link: Resetting MTU for link 1 because host 3 joined
May 12 18:23:55 pve02 corosync[3419]: [QUORUM] Sync members[1]: 2
May 12 18:23:55 pve02 corosync[3419]: [TOTEM ] A new membership (2.113) was formed. Members
May 12 18:23:55 pve02 corosync[3419]: [QUORUM] Members[1]: 2
May 12 18:23:55 pve02 corosync[3419]: [MAIN ] Completed service synchronization, ready to provide service.
May 12 18:23:55 pve02 corosync[3419]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
May 12 18:23:55 pve02 corosync[3419]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
May 12 18:23:55 pve02 corosync[3419]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 18:23:55 pve02 corosync[3419]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 18:23:55 pve02 corosync[3419]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 18:23:55 pve02 corosync[3419]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
May 12 18:23:55 pve02 corosync[3419]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 18:23:55 pve02 corosync[3419]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
May 12 18:23:55 pve02 corosync[3419]: [KNET ] pmtud: Global data MTU changed to: 1397
May 12 18:23:57 pve02 corosync[3419]: [TOTEM ] Token has not been received in 2359 ms
May 12 18:23:58 pve02 corosync[3419]: [QUORUM] Sync members[1]: 2
May 12 18:23:58 pve02 corosync[3419]: [TOTEM ] A new membership (2.11b) was formed. Members
May 12 18:23:58 pve02 corosync[3419]: [QUORUM] Sync members[3]: 1 2 3
May 12 18:23:58 pve02 corosync[3419]: [QUORUM] Sync joined[2]: 1 3
May 12 18:23:58 pve02 corosync[3419]: [TOTEM ] A new membership (1.11f) was formed. Members joined: 1 3
May 12 18:23:58 pve02 corosync[3419]: [QUORUM] This node is within the primary component and will provide service.
May 12 18:23:58 pve02 corosync[3419]: [QUORUM] Members[3]: 1 2 3
May 12 18:23:58 pve02 corosync[3419]: [MAIN ] Completed service synchronization, ready to provide service.
May 12 18:24:45 pve02 corosync[3419]: [KNET ] link: host: 3 link: 1 is down
May 12 18:24:46 pve02 corosync[3419]: [KNET ] link: host: 1 link: 0 is down
May 12 18:24:46 pve02 corosync[3419]: [KNET ] link: host: 1 link: 1 is down
May 12 18:24:47 pve02 corosync[3419]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
May 12 18:24:47 pve02 corosync[3419]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 18:24:47 pve02 corosync[3419]: [KNET ] host: host: 1 has no active links
May 12 18:24:48 pve02 corosync[3419]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 18:24:48 pve02 corosync[3419]: [KNET ] host: host: 1 has no active links
May 12 18:24:48 pve02 corosync[3419]: [KNET ] link: Resetting MTU for link 0 because host 1 joined
May 12 18:24:48 pve02 corosync[3419]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 18:24:48 pve02 corosync[3419]: [KNET ] rx: host: 3 link: 1 is up
May 12 18:24:48 pve02 corosync[3419]: [KNET ] link: Resetting MTU for link 1 because host 3 joined
May 12 18:24:48 pve02 corosync[3419]: [KNET ] rx: host: 1 link: 1 is up
May 12 18:24:48 pve02 corosync[3419]: [KNET ] link: Resetting MTU for link 1 because host 1 joined
May 12 18:24:48 pve02 corosync[3419]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
May 12 18:24:48 pve02 corosync[3419]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
May 12 18:24:48 pve02 corosync[3419]: [TOTEM ] Retransmit List: 141
May 12 18:24:48 pve02 corosync[3419]: [KNET ] pmtud: Global data MTU changed to: 1397
May 12 18:26:40 pve02 corosync[3419]: [TOTEM ] Token has not been received in 2449 ms
May 12 18:26:54 pve02 corosync[3419]: [KNET ] link: host: 3 link: 0 is down
May 12 18:27:00 pve02 corosync[3419]: [TOTEM ] A processor failed, forming new configuration: token timed out (3125ms), waiting 3750ms for consensus.
May 12 18:27:04 pve02 corosync[3419]: [KNET ] link: host: 3 link: 1 is down
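A log like the one above is easier to reason about once the link flaps are counted per host and link. The following is a rough Python sketch, not an official tool: the regular expressions are written against the message formats visible in this excerpt only, and the function name summarize is made up for illustration.

```python
import re
from collections import Counter

# Patterns match the corosync messages as they appear in the log excerpt above.
LINK_DOWN = re.compile(r"\[KNET\s*\] link: host: (\d+) link: (\d+) is down")
LINK_UP = re.compile(r"\[KNET\s*\] rx: host: (\d+) link: (\d+) is up")
TOKEN_LATE = re.compile(r"\[TOTEM\s*\] Token has not been received in (\d+) ms")
MEMBERSHIP = re.compile(r"\[TOTEM\s*\] A new membership \(")


def summarize(lines):
    """Count link down/up events per (host, link), token warnings, memberships."""
    down, up = Counter(), Counter()
    token_delays_ms = []
    memberships = 0
    for line in lines:
        if m := LINK_DOWN.search(line):
            down[(int(m[1]), int(m[2]))] += 1
        elif m := LINK_UP.search(line):
            up[(int(m[1]), int(m[2]))] += 1
        elif m := TOKEN_LATE.search(line):
            token_delays_ms.append(int(m[1]))
        elif MEMBERSHIP.search(line):
            memberships += 1
    return {
        "down": down,
        "up": up,
        "token_delays_ms": token_delays_ms,
        "memberships": memberships,
    }


# Four lines copied from the excerpt above, used as a tiny sample.
SAMPLE = [
    "May 12 18:21:12 pve02 corosync[3419]: [KNET ] link: host: 3 link: 0 is down",
    "May 12 18:21:18 pve02 corosync[3419]: [TOTEM ] Token has not been received in 2392 ms",
    "May 12 18:21:35 pve02 corosync[3419]: [TOTEM ] A new membership (2.103) was formed. Members left: 1 3",
    "May 12 18:21:35 pve02 corosync[3419]: [KNET ] rx: host: 1 link: 1 is up",
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Feeding it the output of journalctl -u corosync (one line per list element) should show whether both links to a host drop together, which would point towards a shared cause rather than a single bad cable or port.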
3 nodes (HPE DL360 Gen10)

Could you give more details about your setup? Network, hardware, ...? For three nodes the coefficient makes almost no difference in practice.