corosync show link flapping (down/up) about every 3-4 minutes, but switch shows no problem

skraw

Well-Known Member
Aug 13, 2019
Hello all,

I have recently been experiencing a problem with corosync reporting link flapping, but the reports appear to be spurious: neither the corresponding switch nor the kernels of the boxes (a 3-node cluster) show any link problem. I use 10G fiber main links and 1G copper backup links; the flapping is reported on the copper links.
Is this some kind of timing problem within corosync?
example log:
Apr 01 17:13:13 pm-248 corosync[2090]: [KNET ] rx: host: 2 link: 1 is up
Apr 01 17:13:13 pm-248 corosync[2090]: [KNET ] link: Resetting MTU for link 1 because host 2 joined
Apr 01 17:13:13 pm-248 corosync[2090]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Apr 01 17:13:13 pm-248 corosync[2090]: [KNET ] pmtud: Global data MTU changed to: 1397
Apr 01 17:15:26 pm-248 corosync[2090]: [KNET ] link: host: 2 link: 1 is down
Apr 01 17:15:26 pm-248 corosync[2090]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Apr 01 17:15:28 pm-248 corosync[2090]: [KNET ] rx: host: 2 link: 1 is up
Apr 01 17:15:28 pm-248 corosync[2090]: [KNET ] link: Resetting MTU for link 1 because host 2 joined
Apr 01 17:15:28 pm-248 corosync[2090]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Apr 01 17:15:28 pm-248 corosync[2090]: [KNET ] pmtud: Global data MTU changed to: 1397
 
Any chance that you have an IP-address conflict? Triple-check this detail...
 
Is the 1 Gbit link dedicated to corosync? Link saturation could cause flapping like this.
No, the fiber link is dedicated; the copper link is also used for other purposes. I checked that in detail and found that the latest kernel networking is not as robust as one might think, after all these years. If there is NFS traffic (heavy NFS traffic, admittedly) on the link, ICMP packets are lost (below 1%). I don't know how corosync checks such a link; my suspicion is that it checks the interface statistics for drops and misses. Does anybody know the facts?
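As background (this is how knet behaves in general, not something specific to this setup): knet judges link health from its own heartbeat pings over each link, not from interface drop counters. A link is pinged periodically and marked down when no pong arrives within a timeout, and both values can be tuned per link in corosync.conf. A sketch with illustrative values, not tuning advice:

```
# corosync.conf fragment (illustrative values, not tuning advice)
totem {
  interface {
    linknumber: 1              # the 1G copper backup link
    knet_ping_interval: 750    # ms between heartbeat pings on this link
    knet_ping_timeout: 1500    # ms without a pong before "link ... is down"
  }
}
```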
 
Basically, if you have network saturation, latency increases and corosync removes the node from the cluster. (Down/up does not mean the physical link is flapping; it simply means the node whose log you are reading did not get a response from the remote node fast enough, so it shows the link as down and then up again.)
As it seems to happen at short intervals, it could be brief saturation spikes. (You should use Grafana or another tool that monitors traffic every second to be sure.)
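As a rough stand-in for a full monitoring stack, per-second throughput can be sampled straight from the kernel's byte counters; a minimal POSIX-shell sketch (the interface name lo is a placeholder for your corosync link):

```shell
# Sketch: one-second RX throughput sample from kernel byte counters.
# The interface name is a placeholder -- substitute your corosync link.

rx_bytes() {
    cat "/sys/class/net/$1/statistics/rx_bytes"
}

# One-second delta of the RX counter, converted to integer Mbit/s.
sample_rate_mbps() {
    iface="$1"
    prev=$(rx_bytes "$iface")
    sleep 1
    cur=$(rx_bytes "$iface")
    echo $(( (cur - prev) * 8 / 1000000 ))
}

sample_rate_mbps lo
```

Run in a loop: even brief readings near 1000 Mbit/s on a 1 Gbit link would explain drops that a per-minute average hides.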
 
That is not the complete truth. Look at this:

--- 192.168.192.250 ping statistics ---
14000 packets transmitted, 13862 received, 0.985714% packet loss, time 14133836ms
rtt min/avg/max/mdev = 0.103/1.065/3.615/1.127 ms

This is from quite a long-running ping during NFS load. If latency were really rising sharply, the max should be a lot higher than 3.6 ms. My feeling is that the kernel instead completely fills the interface queue with one user's data (NFS) and drops other users' packets when there is no room left in the queue. I suspect there would not be much throughput deficit if it filled only half the buffers with one user's data and kept the rest for others, so it would not need to drop a user sending one small packet per second.
And if you really think about the situation: it could well be that you are just transferring some NFS-based drive from one server to another within Proxmox. Would you expect corosync to lose its connection to the other nodes in such a case, only because the ongoing action saturates the network?
Because that would mean you could only safely use Proxmox if your network has more bandwidth than your local disks...
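Incidentally, that loss percentage can be extracted from ping's summary line for scripted checks; a small sketch assuming the standard iputils output format:

```shell
# Sketch: pull the packet-loss percentage out of ping's summary line
# (assumes the standard iputils summary format).

loss_pct() {
    grep -o '[0-9.]*% packet loss' | cut -d'%' -f1
}

echo '14000 packets transmitted, 13862 received, 0.985714% packet loss, time 14133836ms' | loss_pct
# prints 0.985714
```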
 
I wanted to test the problem situation with the BBR congestion-control variant, but I found that the kernel delivered with Proxmox does not ship this congestion protocol. Why is this?
 
Did you try 7.0 kernel?

sudo modprobe tcp_bbr
sysctl net.ipv4.tcp_available_congestion_control

On 7.0 I get:
net.ipv4.tcp_available_congestion_control = reno cubic bbr
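If tcp_bbr loads, it can be made persistent with a sysctl fragment (the file name below is arbitrary); fq is the qdisc usually recommended alongside BBR:

```
# /etc/sysctl.d/90-bbr.conf (sketch; file name is arbitrary)
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr
```

Keep in mind this only governs TCP flows such as NFS; corosync's knet heartbeats are UDP and are unaffected by the TCP congestion-control choice.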
 
This is from quite a long-running ping during NFS load. If latency were really rising sharply, the max should be a lot higher than 3.6 ms.
but you have packet loss, which is even worse. (You can look at the corosync stats too.)
And if you really think about the situation. it could well be that you are just transferring some nfs-based drive from one server to another within proxmox. would you expect corosync to loose connection to the other nodes in such a case, only because the ongoing action saturates the network?
yes, definitely
Because this would mean you can only safely use proxmox if you have a network situation with higher bandwidth than your local disks...
the recommendation is to have dedicated links for corosync
https://pve.proxmox.com/pve-docs/chapter-pvecm.html
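The linked chapter covers redundant links; a minimal sketch of a two-link layout where knet prefers the dedicated fiber (all addresses, IDs and priorities below are illustrative):

```
# corosync.conf fragment (illustrative addresses, IDs and priorities)
totem {
  interface {
    linknumber: 0
    knet_link_priority: 10   # dedicated 10G fiber, preferred in passive mode
  }
  interface {
    linknumber: 1
    knet_link_priority: 1    # shared 1G copper, backup only
  }
}
nodelist {
  node {
    name: pm-248
    nodeid: 1                      # illustrative
    ring0_addr: 10.0.0.248         # fiber network (example address)
    ring1_addr: 192.168.192.248    # copper network (example address)
  }
}
```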





but note that recent corosync versions support DSCP marking:
https://github.com/corosync/corosync/commit/5678836caf7ff21bf0abe81fe61b092f89528665

so it could be possible to do traffic prioritization on your switch if it supports that
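Alternatively, the knet UDP traffic could be DSCP-marked on the hosts themselves so the switch can prioritize it; an nftables sketch, assuming the default knet port 5405 and class CS6:

```
# nftables fragment (sketch: default knet port 5405 and DSCP class CS6 assumed)
table inet qos {
    chain output {
        type filter hook output priority mangle; policy accept;
        udp dport 5405 ip dscp set cs6
    }
}
```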
 
Just to make that clear again: the network link is not saturated. Monitoring shows an average of around 400 Mbit/s on a Gbit interface, which is quite far from a bandwidth problem. So the real question here is: why are packets lost at all? And still: how does corosync really detect the problem? Should I imagine that some of its UDP packets are lost, too?
The corosync docs say latency around 10 ms starts to get problematic, but I am nowhere near that either. I really think this is more of a kernel/config problem. I can see no reason for packet drops here.
The switch is the last thing on the list to blame for this; it is not a matter of priority at all...
 
Monitoring shows an average of around 400 Mbit/s on a Gbit interface.
Do you have granular monitoring (Prometheus, ...) that can check bandwidth every second? Don't trust averages.

Have you also checked your switch port buffer stats?


If it were a kernel bug, you would see it on your 10 Gbit NIC too.
I'm running 100 nodes in production without any problems.

I have already seen exactly this behaviour with different customers, and it was always network spike saturation.