Upgrading 2 nodes to corosync3

Kevin Myers · Jul 29, 2019

Hi,

I'm having a peculiar issue.

My nodes use InfiniBand in connected mode for both the cluster network and storage (on different interfaces).

The cluster network runs on the 10.1.100.x range, and the IPoIB device looks like this:
ib1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 65520
inet 10.1.100.2 netmask 255.255.255.0 broadcast 10.1.100.255

The other node is 10.1.100.1 with the same MTU.

In the corosync config I have

nodelist {
  node {
    name: kvm1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.1.100.1
  }
  node {
    name: kvm2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.1.100.2
  }
}

and

interface {
  bindnetaddr: 10.1.100.1
  ringnumber: 0
}
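As a quick sanity check on a config like the one above, the following Python sketch pulls each node's ring0_addr out of a nodelist excerpt and reports any node whose address falls outside the cluster network. The function names and the EXCERPT constant are hypothetical, and the regex handles only flat node blocks like the ones shown here; it is not a full corosync.conf parser.

```python
import ipaddress
import re

# Excerpt mirroring the nodelist quoted in this post.
EXCERPT = """
nodelist {
  node {
    name: kvm1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.1.100.1
  }
  node {
    name: kvm2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.1.100.2
  }
}
"""


def ring0_addresses(conf_text):
    """Map node name -> ring0 address for every flat `node { ... }` block."""
    nodes = {}
    # \bnode\s*\{ does not match "nodelist {" because "list" follows "node".
    for block in re.findall(r"\bnode\s*\{(.*?)\}", conf_text, flags=re.S):
        fields = dict(re.findall(r"(\w+)\s*:\s*(\S+)", block))
        if "name" in fields and "ring0_addr" in fields:
            nodes[fields["name"]] = ipaddress.ip_address(fields["ring0_addr"])
    return nodes


def outside_network(nodes, network):
    """Return the sorted names of nodes whose ring0 address is not in `network`."""
    net = ipaddress.ip_network(network)
    return sorted(name for name, addr in nodes.items() if addr not in net)


if __name__ == "__main__":
    nodes = ring0_addresses(EXCERPT)
    # 10.1.100.0/24 matches the netmask 255.255.255.0 shown for ib1.
    print(nodes, outside_network(nodes, "10.1.100.0/24"))
```

An empty list from outside_network means every ring0 address sits in the expected subnet, which rules out a simple addressing mistake before looking at MTU.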

Now the problem is that pvecm status shows only the local node on each of the nodes, and on both nodes I see this spamming the syslog:

Jul 29 11:16:10 kvm2 corosync[4025739]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 11:16:10 kvm2 corosync[4025739]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 11:16:10 kvm2 corosync[4025739]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 11:16:10 kvm2 corosync[4025739]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 11:16:10 kvm2 corosync[4025739]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 11:16:11 kvm2 corosync[4025739]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 11:16:11 kvm2 corosync[4025739]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 11:16:11 kvm2 corosync[4025739]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 11:16:11 kvm2 corosync[4025739]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 11:16:11 kvm2 corosync[4025739]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 11:16:11 kvm2 corosync[4025739]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 11:16:11 kvm2 corosync[4025739]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 11:16:12 kvm2 corosync[4025739]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 11:16:12 kvm2 corosync[4025739]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 11:16:12 kvm2 corosync[4025739]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 11:16:12 kvm2 corosync[4025739]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 11:16:12 kvm2 corosync[4025739]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 11:16:12 kvm2 corosync[4025739]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
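To put a number on the log spam, a small Python sketch like the one below can count the PMTUD abort messages per second. It assumes traditional syslog lines whose first 15 characters are the timestamp, as in the excerpt above; the function name is hypothetical.

```python
from collections import Counter

# Substring taken from the log lines quoted above.
PMTUD_ABORT = "pmtud: Aborting PMTUD process"


def pmtud_aborts_per_second(lines):
    """Count PMTUD abort messages, keyed by the syslog timestamp (1 s resolution)."""
    counts = Counter()
    for line in lines:
        if PMTUD_ABORT in line:
            # "Jul 29 11:16:10" is exactly 15 characters long.
            counts[line[:15]] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (assumed invocation): python3 count_pmtud.py < /var/log/syslog
    for stamp, n in sorted(pmtud_aborts_per_second(sys.stdin).items()):
        print(stamp, n)
```

Comparing these counts before and after a change (for example a different MTU) gives a quick way to tell whether the change had any effect.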

This setup worked perfectly on corosync2.

Do you think the MTU of 65520 is in some way incompatible with corosync3? Or are there any other gems anyone could suggest?

All the best
Kevin M

mir · Jul 29, 2019

Try changing netmtu to see whether this fixes your problem. Default for netmtu is 1500.

Kevin Myers · Jul 29, 2019

I had tried fiddling with netmtu earlier, but decided to try some different values (1500, 20000, 65520), none of which makes any difference. We end up with

Jul 29 14:40:14 kvm1 corosync[4039025]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 14:40:14 kvm1 corosync[4039025]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 14:40:14 kvm1 corosync[4039025]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 14:40:14 kvm1 corosync[4039025]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 14:40:15 kvm1 corosync[4039025]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 14:40:15 kvm1 corosync[4039025]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 14:40:15 kvm1 corosync[4039025]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 14:40:15 kvm1 corosync[4039025]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 14:40:15 kvm1 corosync[4039025]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 14:40:15 kvm1 corosync[4039025]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.
Jul 29 14:40:16 kvm1 corosync[4039025]: [KNET ] pmtud: Aborting PMTUD process: Too many attempts. MTU might have changed during discovery.

spamming the logs

I've created an issue at https://github.com/kronosnet/kronosnet/issues/241, as this does seem to be a bug in my opinion when looking at the code: https://github.com/kronosnet/kronosnet/blob/master/libknet/threads_pmtud.c

best regards
Kevin M

fabian · Jul 30, 2019

the kronosnet devs already found two bugs (one in knet, one in the kernel) with your specific setup. as a quick workaround, lowering the MTU on the actual interface to something lower than 65484 should allow pmtud to work correctly. once there is a fix on the code side, we'll integrate it into our kronosnet packages and you should be able to bump the MTU again.
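A minimal Python sketch for checking whether an interface is still above the workaround bound mentioned above. The 65484 figure comes from this post; the function names are hypothetical, and the sysfs path /sys/class/net/<iface>/mtu is the standard Linux location for an interface's MTU.

```python
from pathlib import Path

# Bound quoted in the post above: the interface MTU should be lower than this.
KNET_WORKAROUND_LIMIT = 65484


def read_mtu(iface, sysfs_root="/sys/class/net"):
    """Read the current MTU of `iface` from sysfs (Linux only)."""
    return int(Path(sysfs_root, iface, "mtu").read_text().strip())


def needs_lowering(mtu, limit=KNET_WORKAROUND_LIMIT):
    """True if the MTU is not yet below the workaround limit."""
    return mtu >= limit


if __name__ == "__main__":
    # "ib1" is the cluster interface named earlier in this thread.
    mtu = read_mtu("ib1")
    print(mtu, "needs lowering" if needs_lowering(mtu) else "ok for the workaround")
```

With the connected-mode MTU of 65520 from the first post, needs_lowering returns True, which is consistent with the PMTUD failures reported there.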

Kevin Myers · Aug 1, 2019

Hi Fabian,

It seems a patch has been applied to fix this. Could it be integrated?

Best regards
Kev

fabian · Aug 2, 2019

we are in close contact with upstream, once the fix is finalized both we and upstream will release fixed packages ASAP

fabian · Aug 2, 2019

you are probably following the upstream issue anyway, but packages with the proposed fix from upstream included are available on pvetest:

Code:

http://download.proxmox.com/debian/pve/dists/buster/pvetest/binary-amd64/libknet-dev_1.10-pve2_amd64.deb
http://download.proxmox.com/debian/pve/dists/buster/pvetest/binary-amd64/libknet-doc_1.10-pve2_all.deb
http://download.proxmox.com/debian/pve/dists/buster/pvetest/binary-amd64/libknet1-dbgsym_1.10-pve2_amd64.deb
http://download.proxmox.com/debian/pve/dists/buster/pvetest/binary-amd64/libknet1_1.10-pve2_amd64.deb
http://download.proxmox.com/debian/pve/dists/buster/pvetest/binary-amd64/libnozzle-dev_1.10-pve2_amd64.deb
http://download.proxmox.com/debian/pve/dists/buster/pvetest/binary-amd64/libnozzle1-dbgsym_1.10-pve2_amd64.deb
http://download.proxmox.com/debian/pve/dists/buster/pvetest/binary-amd64/libnozzle1_1.10-pve2_amd64.deb

(corosync needs to be restarted for the fixed packages to be used)

Kevin Myers · Aug 2, 2019

Hi Fabian,

The patches did fix the MTU issue, however another ugly issue has reared its head.

I have a cluster of around 30 nodes (I know all the warnings about corosync2 and 16 nodes, however this worked awesomely on corosync2).

I started to move nodes from CS2 to CS3 in preparation for the 5-to-6 move, and all seemed good to start with. After node 5, convergence started slowing down. When I got to node 17, the node didn't join ring 0 and created a new ring 20 (problem 1). At this point CS3 on the previous 16 nodes was consuming 200-400% CPU on the hardware nodes (problem 2).

I'm not sure of the best path to report these issues. You do seem to have 'contacts', so any advice is appreciated.

best regards
Kev

fabian · Aug 5, 2019

could you provide the corosync.conf and network setup, and describe how you did the upgrade?

what do you mean with "didn't join ring0 and created a new ring 20" ?

as always, logs would also be great, at least of pve-cluster and corosync services..

edit: and was this upgrade done with the patched libknet, or the regular one from the corosync-3 repo?
