[SOLVED] PVE 5.4-11 + Corosync 3.x: major issues

Except I said "that's not enough for cluster ops and any sort of back end traffic"

Which is more than cluster ops alone. *eyeroll*

If the backend network link gets saturated, BAD THINGS HAPPEN TO THE CLUSTER.
And a 100Mbps OVH VRack isn't quite the same as a set of servers on the same 100Mbps switch.

If you try migrating VMs from one server to another, or use the VRack for backups, you will run into problems beyond just the slow transfer speed with a 100Mbps VRack.

Meanwhile, why is it a bad idea to test that you're getting the proper performance on the VRack network?
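For what it's worth, a basic check of throughput and latency across the VRack between two nodes could look something like this (iperf3 and the hostname node2 are just placeholders for the example, not anything from this thread):

Code:
# on the receiving node
iperf3 --server
# on the sending node: 20 second throughput test towards the receiver's VRack address
iperf3 --client node2 --time 20
# round-trip latency matters far more to corosync than raw bandwidth
ping -c 100 node2 | tail -n 2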


Sorry, but this is not correct. It is more than enough for corosync alone, just make sure you don't run your migrations over it.
Corosync itself uses almost no bandwidth, it just needs very low latency.

On a 4-node cluster we see a stable 150 kbit/s down and up on that line.
So no, you do not need more than those 100 Mbit. I don't, since I have corosync separated and run the rest of the PVE traffic over the other network card.
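As a rough sketch of that separation, the corosync ring addresses simply live on the dedicated link in /etc/pve/corosync.conf while migrations, storage and backups use the other card; the node names and the 10.10.10.0/24 subnet below are made up for the example:

Code:
# /etc/pve/corosync.conf (nodelist excerpt) - ring0_addr on the dedicated corosync network
nodelist {
  node {
    name: node1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.10.10.1
  }
  node {
    name: node2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.10.10.2
  }
}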
 
Do you have anything in /var/log/kern.log or #dmesg related to the NIC?
NIC model / driver?
Do you use OVS or a Linux bridge?
Do you host locally or at a public provider (OVH, Hetzner, ...)?


(Maybe a public Google Sheet to centralize all the different user configs could help to compare setups?)
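If it helps, those details can usually be collected with something like the following (eno1 is only a placeholder for the NIC in question):

Code:
dmesg -T | grep -i -e eno1 -e link | tail -n 20     # recent kernel messages about the NIC / link state
ethtool -i eno1                                     # driver, version and firmware of the NIC
lspci -nnk | grep -i -A3 ethernet                   # NIC model and the kernel driver in use
grep -i -e ovs -e bridge /etc/network/interfaces    # OVS vs. Linux bridge setup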

In dmesg I have this:
[2457074.690214] traps: corosync[20684] trap divide error ip:7f210adb8f56 sp:7f20ff71da50 error:0 in libknet.so.1.3.0[7f210adad000+13000]

On the corosync network, two of the hosts have Chelsio N320E NICs and the other one a BCM5716.
It's fully local, no VRack or anything like that.
 
Hi,
I'm able to reproduce the link down and some other bad behaviour when the rx (ingress) traffic is saturated.
(3-node cluster: launch an iperf from node2 to node3, node3 is reported as link down from node1, and I once had a corosync segfault.)

This is with a gigabit link, when I'm reaching around 900-950 Mbit/s.

There is no impact with saturated outbound traffic.

I have also tested with the CPU saturated: no impact.


For inbound traffic, I can mitigate it with traffic shaping:

Code:
modprobe ifb numifbs=1
ip link set dev ifb0 up
tc qdisc add dev eno1 handle ffff: ingress
tc filter add dev eno1 parent ffff: protocol ip u32 match u32 0 0 action mirred egress redirect dev ifb0
tc qdisc add dev ifb0 root handle 1: htb default 10
tc class add dev ifb0 parent 1: classid 1:10 htb rate 50mbit
tc filter add dev ifb0 parent 1: protocol ip prio 1 u32 match ip protocol 1 0xff flowid 1:10  #match icmp
tc filter add dev ifb0 parent 1: protocol ip prio 1 u32 match ip protocol 1 0xff flowid 1:10  #match udp
tc class add dev ifb0 parent 1: classid 1:20 htb rate 850mbit
tc filter add dev ifb0 parent 1: protocol ip prio 9 u32 match u8 0 0 flowid 1:20 #other traffic

replace "eno1" by your nic, or if you use vlan "eno1.XXX"

I had tried with priority queues, but it dont seem to work . But I'm not an expert with tc.
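To verify that the redirect and the classes actually see traffic, the counters can be checked with something like this (same eno1/ifb0 names as above):

Code:
tc -s qdisc show dev eno1    # ingress qdisc statistics on the physical NIC
tc -s class show dev ifb0    # per-class byte/packet counters on the ifb device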
 
Except I said "that's not enough for cluster ops and any sort of back end traffic"

Which is more than cluster ops alone. *eyeroll*

Please, first of all, don't poison this topic, it's too important.

Second, I know what I'm doing. I'm fine. Since you can't read or comprehend what I already wrote, please don't try to give me advice copy-pasted from the basic howtos.

Since you are having a hard time comprehending, one last time: no traffic other than corosync runs there, so please stop it and keep this thread on topic.

Thanks.
 
To get more info, could you enable debug logging in corosync?

corosync.conf:
Code:
logging {
  debug: on
  to_syslog: yes
}

And install systemd-coredump (#apt install systemd-coredump) so that segfaults leave a coredump (/var/lib/systemd/coredump/*.lz4).
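Once systemd-coredump is installed, a crash can then be inspected with something like this (gdb needs to be installed for the last command):

Code:
coredumpctl list corosync    # list recorded corosync crashes
coredumpctl info corosync    # signal and metadata of the most recent one
coredumpctl gdb corosync     # open the most recent core in gdb for a full backtrace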

Thanks !
 
Just a heads up that the second filter also matches protocol 1 (icmp) only...

Code:
tc filter add dev ifb0 parent 1: protocol ip prio 1 u32 match ip protocol 1 0xff flowid 1:10  #match icmp
tc filter add dev ifb0 parent 1: protocol ip prio 1 u32 match ip protocol 1 0xff flowid 1:10  #match udp
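Presumably that second rule was meant to match UDP (IP protocol 17), which is what corosync/knet actually uses, so a corrected line would look something like:

Code:
tc filter add dev ifb0 parent 1: protocol ip prio 1 u32 match ip protocol 17 0xff flowid 1:10  #match udp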
 
Confirmed. Even a small bottleneck in the network destroys the corosync connection. I used a separate core switch (the same model as before), physically isolated from the rest of the network, and corosync is stable again, like with the 2.x version.
I think a VLAN on the same switch might help too.
I'm using 10Gb UBNT switches in the backbone.
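As a sketch of that VLAN idea, a dedicated corosync VLAN on the existing uplink could be added in /etc/network/interfaces roughly like this (VLAN ID 50 and the 10.10.10.0/24 subnet are invented for the example; the node's corosync ring address would then point at it):

Code:
auto eno1.50
iface eno1.50 inet static
    address 10.10.10.1
    netmask 255.255.255.0
    # carries only corosync; keep migrations/backups on other interfaces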
 
Attempting to upgrade Proxmox from version 5.4-13 to 6.x, only 1 node.
I got to the pvecm status step and get this error:
Code:
Corosync config '/etc/pve/corosync.conf' does not exist - is this node part of a cluster?
Cannot initialize CMAP service

apt dist-upgrade does nothing!
What now?
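For a standalone node that pvecm message is expected, since there simply is no cluster config; dist-upgrade doing nothing usually just means the APT sources still point at stretch. The documented 5.x to 6.x path is roughly the following (assuming the repositories live in the standard files and the no-subscription repository is used):

Code:
pve5to6                                                            # run the built-in 5-to-6 upgrade checklist first
sed -i 's/stretch/buster/g' /etc/apt/sources.list /etc/apt/sources.list.d/*.list
apt update && apt dist-upgrade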
 
Thanks a lot! I've reverted the kernel/driver flags and installed the new patched version and after initial tests it looks very promising. There are still retransmits while the LACP bonds are renegotiating if I play with the bonding settings and such but that is to be expected.

I'm experimenting with lacp with balance-tcp (yes, docs say active-backup only) over four 1G links per host connected to a single Cisco switch with port-channel load-balance src-dst-port and I'm able to fully saturate all links between two hosts with iperf3 --client hostname --time 20 --cport 12345 --parallel 8. (the --cport is to get sequential src tcp ports so the hash algorithm spreads traffic more or less evenly.)

Still no crash or funky behavior and the cluster as a whole seems unaffected. I'll continue the testing and report back.
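For reference, an OVS bond of that kind is typically declared along these lines in /etc/network/interfaces (interface and bridge names here are placeholders, not the exact config from this post):

Code:
auto bond0
iface bond0 inet manual
    ovs_type OVSBond
    ovs_bridge vmbr0
    ovs_bonds eno1 eno2 eno3 eno4
    ovs_options bond_mode=balance-tcp lacp=active

auto vmbr0
iface vmbr0 inet manual
    ovs_type OVSBridge
    ovs_ports bond0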
 
The crazy part is I activated debug almost 2 days ago; I had a few rx transmission errors but no crashes and the cluster is stable.
So either it's a weird coincidence or debug alone changed something, lol.
 
Looks like the corosync crash should be fixed by this PR: https://github.com/kronosnet/kronosnet/pull/257

It is now part of pvetest ... anyone experiencing the corosync crash, please try it! Direct link here if not on pvetest:

http://download.proxmox.com/debian/dists/buster/pvetest/binary-amd64/libknet1_1.11-pve2_amd64.deb
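For anyone not on pvetest, a manual install of just that package would look something like this (corosync has to be restarted so it picks up the new libknet; doing it one node at a time keeps quorum):

Code:
wget http://download.proxmox.com/debian/dists/buster/pvetest/binary-amd64/libknet1_1.11-pve2_amd64.deb
dpkg -i libknet1_1.11-pve2_amd64.deb
systemctl restart corosync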
Great, first morning without a broken cluster.

Code:
Votequorum information
----------------------
Expected votes:   7
Highest expected: 7
Total votes:      7
Quorum:           4
Flags:            Quorate
 
