MTU related info spammed in log.

Shadow Sysop

In my syslogs, I'm seeing the following over and over, anywhere from every few minutes to every hour or so.

Apr 22 13:54:55 server9 corosync[3275313]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
Apr 22 13:54:55 server9 corosync[3275313]: [KNET ] host: host: 1 has no active links
Apr 22 13:54:58 server9 corosync[3275313]: [KNET ] rx: host: 1 link: 0 is up
Apr 22 13:54:58 server9 corosync[3275313]: [KNET ] link: Resetting MTU for link 0 because host 1 joined
Apr 22 13:54:58 server9 corosync[3275313]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
Apr 22 13:54:58 server9 corosync[3275313]: [KNET ] pmtud: Global data MTU changed to: 1397

So I'm wondering if there's some kind of network issue going on. This is a 4-node cluster: 3 of the nodes are here at the office and 1 is at a remote location. The issue is with the office location here. Honestly, none of the nodes appear to be suffering from any network issues that I've seen, and definitely no downtime, as we monitor closely for that. We're using the default MTU of 1500 on all our network hardware. Any thoughts are appreciated.
 
The MTU is not the problem here. The MTU is recomputed each time a node leaves or joins the cluster.

If you see a lot of "host X joined" messages, then something is wrong on node X. (The name for nodeid X can be found in /etc/pve/corosync.conf.)
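To see which node is flapping, it can help to count the "host X joined" messages per host ID and look the IDs up in the nodelist. Here is a rough sketch of that in Python. The function names and the node names in SAMPLE_CONF are made up for illustration, and the parser assumes the usual `node { name: ... nodeid: ... }` layout of corosync.conf, so adjust it to your actual file.

```python
import re
from collections import Counter

# Matches the KNET message that is logged when a host rejoins on a link.
JOIN_RE = re.compile(
    r"\[KNET\s*\] link: Resetting MTU for link (\d+) because host (\d+) joined"
)

def count_joins(lines):
    """Return a Counter mapping host ID -> number of 'joined' messages."""
    counts = Counter()
    for line in lines:
        match = JOIN_RE.search(line)
        if match:
            counts[int(match.group(2))] += 1
    return counts

def node_names(conf_text):
    """Map nodeid -> name from corosync.conf text (assumed nodelist layout)."""
    names = {}
    for block in re.findall(r"\bnode\s*\{([^}]*)\}", conf_text):
        name = re.search(r"\bname:\s*(\S+)", block)
        nodeid = re.search(r"\bnodeid:\s*(\d+)", block)
        if name and nodeid:
            names[int(nodeid.group(1))] = name.group(1)
    return names

# Log lines taken from this thread.
SAMPLE_LOG = [
    "Apr 22 13:54:58 server9 corosync[3275313]: [KNET ] rx: host: 1 link: 0 is up",
    "Apr 22 13:54:58 server9 corosync[3275313]: [KNET ] link: Resetting MTU for link 0 because host 1 joined",
    "Apr 22 21:51:55 server3 corosync[1546302]: [KNET ] link: Resetting MTU for link 0 because host 4 joined",
]

# Hypothetical nodelist; the node names here are placeholders.
SAMPLE_CONF = """
nodelist {
  node {
    name: node-a
    nodeid: 1
  }
  node {
    name: node-d
    nodeid: 4
  }
}
"""

if __name__ == "__main__":
    names = node_names(SAMPLE_CONF)
    for host_id, joins in count_joins(SAMPLE_LOG).most_common():
        print(host_id, names.get(host_id, "unknown"), joins)
```

On a real node you would feed it the corosync lines from your syslog or journal instead of SAMPLE_LOG, and the contents of /etc/pve/corosync.conf instead of SAMPLE_CONF.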
 
Odd, because on host 4 it says hosts 1 and 2 joined, but on nodes 1 and 2 it says host 4 has joined. Hosts 1 and 2 are here and host 4 is in a data center. So presumably there is an issue here. I notice host 1 is spamming retransmits. For host 1, I tried changing NIC ports, cables, switches and the firewall, but it still seems to be retransmitting a lot.

Apr 22 21:51:52 server3 pveproxy[1642256]: Clearing outdated entries from certificate cache
Apr 22 21:51:53 server3 corosync[1546302]: [KNET ] link: host: 4 link: 0 is down
Apr 22 21:51:53 server3 corosync[1546302]: [KNET ] host: host: 4 (passive) best link: 0 (pri: 1)
Apr 22 21:51:53 server3 corosync[1546302]: [KNET ] host: host: 4 has no active links
Apr 22 21:51:55 server3 corosync[1546302]: [KNET ] rx: host: 4 link: 0 is up
Apr 22 21:51:55 server3 corosync[1546302]: [KNET ] link: Resetting MTU for link 0 because host 4 joined
Apr 22 21:51:55 server3 corosync[1546302]: [KNET ] host: host: 4 (passive) best link: 0 (pri: 1)
Apr 22 21:51:55 server3 corosync[1546302]: [KNET ] pmtud: Global data MTU changed to: 1397
Apr 22 21:51:56 server3 corosync[1546302]: [TOTEM ] Token has not been received in 2737 ms
Apr 22 21:52:53 server3 corosync[1546302]: [TOTEM ] Retransmit List: 48a
Apr 22 21:52:54 server3 corosync[1546302]: [TOTEM ] Retransmit List: 48b
Apr 22 21:53:03 server3 corosync[1546302]: [TOTEM ] Retransmit List: 4ad
Apr 22 21:53:13 server3 corosync[1546302]: [TOTEM ] Token has not been received in 2737 ms
Apr 22 21:53:33 server3 corosync[1546302]: [TOTEM ] Retransmit List: 4f7
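For comparing nodes, a small script can tally the TOTEM retransmit and token-timeout messages per logging host. This is only a sketch: `summarize_totem` and the two category labels are names chosen here, and it assumes the syslog line format shown in the excerpt above.

```python
import re
from collections import Counter

# Syslog prefix, then the logging host, then the corosync TOTEM message.
TOTEM_RE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+(\S+)\s+corosync\[\d+\]:\s+\[TOTEM\s*\]\s+(.*)$"
)

def summarize_totem(lines):
    """Count retransmit and token-timeout messages per logging host."""
    counts = Counter()
    for line in lines:
        match = TOTEM_RE.match(line)
        if not match:
            continue
        host, message = match.groups()
        if message.startswith("Retransmit List:"):
            counts[(host, "retransmit")] += 1
        elif message.startswith("Token has not been received"):
            counts[(host, "token_timeout")] += 1
    return counts

# Lines copied from the log excerpt above.
SAMPLE_LOG = [
    "Apr 22 21:51:55 server3 corosync[1546302]: [KNET ] pmtud: Global data MTU changed to: 1397",
    "Apr 22 21:51:56 server3 corosync[1546302]: [TOTEM ] Token has not been received in 2737 ms",
    "Apr 22 21:52:53 server3 corosync[1546302]: [TOTEM ] Retransmit List: 48a",
    "Apr 22 21:52:54 server3 corosync[1546302]: [TOTEM ] Retransmit List: 48b",
    "Apr 22 21:53:03 server3 corosync[1546302]: [TOTEM ] Retransmit List: 4ad",
    "Apr 22 21:53:13 server3 corosync[1546302]: [TOTEM ] Token has not been received in 2737 ms",
    "Apr 22 21:53:33 server3 corosync[1546302]: [TOTEM ] Retransmit List: 4f7",
]

if __name__ == "__main__":
    for (host, kind), n in sorted(summarize_totem(SAMPLE_LOG).items()):
        print(host, kind, n)
```

Running the same tally over the logs from each node makes it easier to see whether one machine retransmits far more often than the others.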
 
Latency to the remote node averages roughly 20ms, give or take, which is reasonable as it is in Virginia and our office is in New York. I've run extended omping tests from all nodes and don't see evidence of any significant packet loss. The reason I've begun looking into this is the weird downtimes we've experienced at the office (every few days, just a minute or 2). We noticed that this particular server and another machine (which has since been taken offline) were throwing these constant retransmit errors. Furthermore, out of all the machines here, only the decommissioned one and this active server throw these errors consistently; the other devices here seldom retransmit.

Since taking the other machine off the network, the downtimes have occurred much less frequently, down to almost once every couple of weeks. I'm already leaning towards decommissioning this server as well, as it is older hardware and I have no desire to replace the NIC.
 
