Would you mind giving some short feedback on what went wrong on your end? I have a weird situation where I usually get a UPID, but SOMETIMES I get an empty string even though the task was still successfully queued by Proxmox. The problem is that in this case I never "know" when the task is finished...
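For what it's worth, when a UPID does come back, the task can be tracked by polling its status endpoint; the empty-string case is exactly what makes that impossible. Below is a minimal polling sketch, assuming the standard /nodes/{node}/tasks/{upid}/status API path and a hypothetical host, node name, and API token:

```python
import time
import requests

# Hypothetical values -- replace with your own host, node name, and API token.
HOST = "https://pve.example.com:8006"
NODE = "node1"
HEADERS = {"Authorization": "PVEAPIToken=root@pam!monitor=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"}


def wait_for_task(upid: str, timeout: float = 300.0) -> str:
    """Poll the task status endpoint until the task stops, then return its exit status."""
    if not upid:
        # The problematic case described above: no UPID means there is nothing to poll.
        raise ValueError("empty UPID returned, cannot track the task")
    url = f"{HOST}/api2/json/nodes/{NODE}/tasks/{upid}/status"
    deadline = time.time() + timeout
    while time.time() < deadline:
        # verify=False only because many PVE hosts run with self-signed certificates.
        data = requests.get(url, headers=HEADERS, verify=False).json()["data"]
        if data["status"] == "stopped":
            return data.get("exitstatus", "unknown")
        time.sleep(2)
    raise TimeoutError(f"task {upid} still running after {timeout}s")
```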
Corosync 3 Coredump from today:
http://www.blackmesa.at/resources/core.corosync.0.c04d10f2e780497981683c0105059fad.37980.1567776827000000.lz4
hope that helps :)
How did I NOT think of this ... *sigh* ... thanks!
I'm going to do that after taking libknet1-1.11-pve1 for a spin.
That being said, I installed libknet1-1.11-pve1 from the no-subscription repo around 12 hours ago and the
cluster has not fallen apart yet, which is the longest corosync 3 has survived so far...
Same here. Disabled IPv6 on the interfaces - still the same random crashes. Today I upgraded to libknet 1.10-pve2 from the enterprise repo; let's see if that changes anything.
For us (running 6.0-5 with corosync 3.0.2-pve2) this was already the (default?) setting,
and we have the stability issues nevertheless.
But I'll try and disable IPv6 on the interface - let's see if that helps.
I'll have to move the cluster interface to a dedicated NIC then, but that's probably
a good...
Hard to say. We're on PVE 6.0-5 with the same issue. We now introduced a really weird workaround where we use cron to (a rough sketch of the cycle follows below):
1) start corosync every 5 minutes
2) restart pve-cluster - then let everything settle and let Proxmox do its "job"
3) stop corosync again
4) start over 4 minutes later.
It's a...
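To make the ordering of that cycle concrete, here is a rough sketch of what such a cron-driven script could look like; it assumes the standard systemd units corosync and pve-cluster and a hypothetical settle time, and is only an illustration of the steps above, not the poster's actual script:

```python
import subprocess
import time

SETTLE_SECONDS = 60  # hypothetical: time to let pve-cluster/pmxcfs settle before stopping corosync


def run(*cmd: str) -> None:
    """Run a command and raise if it fails."""
    subprocess.run(cmd, check=True)


def corosync_cycle() -> None:
    # 1) start corosync (cron invokes this script every 5 minutes)
    run("systemctl", "start", "corosync")
    # 2) restart pve-cluster, then let everything settle and let Proxmox do its "job"
    run("systemctl", "restart", "pve-cluster")
    time.sleep(SETTLE_SECONDS)
    # 3) stop corosync again; 4) cron starts the next cycle a few minutes later
    run("systemctl", "stop", "corosync")


if __name__ == "__main__":
    corosync_cycle()
```

A crontab entry along the lines of */5 * * * * root /usr/local/bin/corosync_cycle.py (path hypothetical) would then provide the "start over every few minutes" part.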
@RokaKen: besides the SEGV that happened yesterday, the whole issue seems to be the same, yes.
Thanks for pointing that out - I stumbled upon the issue when I started to hunt down the corosync SEGV, but when the problems got worse last night I did not remember the thread.
I'll keep this open for...
@dietmar: Thanks for reaching out. I followed your advice and opened a case on https://bugzilla.proxmox.com/show_bug.cgi?id=2326 a few minutes ago.
For the sake of visibility and documentation:
The problem seems to involve KNET as well; at least tonight the cluster disintegrated again (without...
Hey guys,
we updated from PVE5 to PVE6 recently and noticed that nodes of our 4-node cluster leave the cluster randomly. Running pvecm status reports that CMAP cannot be initialized, so I had a look at corosync on the failed node, only to learn that it had obviously segfaulted.
This happened on 3 of 4 cluster...
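In case it helps anyone comparing symptoms: a quick way to check a suspect node is to look at quorum state, the corosync unit, and its journal for the current boot. A small sketch, assuming local root access and the standard unit name:

```python
import subprocess

# Diagnostic commands only; the output is printed for manual inspection.
CHECKS = [
    ["pvecm", "status"],                                   # quorum / membership view
    ["systemctl", "--no-pager", "status", "corosync"],     # did the unit die (e.g. with SIGSEGV)?
    ["journalctl", "-u", "corosync", "-b", "--no-pager"],  # corosync log for the current boot
]

for cmd in CHECKS:
    print("#", " ".join(cmd))
    subprocess.run(cmd)
```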
Sorry for digging up this old thread, but we are experiencing the exact same situation.
The cluster consists of 4 nodes.
Node 1 only shows cluster members 1, 2 and 3
All the other nodes show all cluster members (1, 2, 3 and 4).
As you might have guessed, node 4 was added last.
The problem is...