Failing to add a node to a cluster (2 nodes total)

artkill

New Member
Jul 14, 2014
Guys,

I've been fighting with this for the last 2 days without any success. The situation up to now is as follows:

1. I had 2 nodes joined in a cluster and needed to remove one of them and join another machine in its place because of a hardware upgrade. The 2 nodes use an OpenVPN tunnel between them: 10.8.0.2 (active) and 10.8.0.1 (removed).
2. I removed the node with pvecm delnode xxx and after that, on the remaining active node, I issued
pvecm expected 1
3. After that I installed a brand new Proxmox on Debian Wheezy and tried to join the cluster from the other side:
pvecm add 10.8.0.2
using this guide:
https://pve.proxmox.com/wiki/Proxmox_VE_2.0_Cluster

The result is as follows:

root@de-ber ~ # pvecm add 10.8.0.2
copy corosync auth key
stopping pve-cluster service
Stopping pve cluster filesystem: pve-cluster.
backup old database
Starting pve cluster filesystem : pve-cluster.
Starting cluster:
Checking if cluster has been disabled at boot... [ OK ]
Checking Network Manager... [ OK ]
Global setup... [ OK ]
Loading kernel modules... [ OK ]
Mounting configfs... [ OK ]
Starting cman... [ OK ]
Waiting for quorum... [ OK ]
Starting fenced... [ OK ]
Starting dlm_controld... [ OK ]
Tuning DLM kernel config... [ OK ]
Unfencing self... [ OK ]
waiting for quorum...


OK, apparently I need some help with this. Cheers!
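
While it just sits at "waiting for quorum...", this is what I've been looking at on the joining node (standard PVE 3.x / cman commands and log locations, as far as I know):
Code:
cman_tool status    # membership and quorum state as cman sees it
cman_tool nodes     # which nodes cman currently knows about
tail -f /var/log/cluster/corosync.log    # the totem / retransmit messages should show up here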

PS: I've attached 3 files with additional details to the thread! A multicast ping is included as well.
 

Attachments

  • fail to join node to 2 node cluster.txt (10.7 KB)
  • fail to join node to 2 node cluster.zip (2.7 KB)
  • multicast.txt (2.1 KB)
Hi, does reliable multicast work over the network for a longer period of time? Please check https://pve.proxmox.com/wiki/Troubl...luster_issues#Diagnosis_from_first_principles

Interesting would be the output from:
Code:
omping -c 10000 -i 0.001 -F -q <list of all nodes IPs>

# the next one takes ~10 min but is important:
omping -c 600 -i 1 -q <list of all nodes IPs>

Edit: Oh and you have to start the above commands on all nodes.

Further, post the output of
Code:
pveversion -v
from both nodes :)
 
Thanks for your response!
I had attached the info as attachments to my post, but anyway, initially the 2 nodes were:
root@bg-sof:/etc/pve# pveversion
pve-manager/3.1-21/93bf03d4 (running kernel: 2.6.32-26-pve)

root@de-ber ~/.ssh # pveversion
pve-manager/3.4-13/4c745357 (running kernel: 2.6.32-46-pve)

Now, after I bought a subscription key and upgraded the situation is as follows:
root@bg-sof:/var/log/cluster# pveversion
pve-manager/3.4-13/4c745357 (running kernel: 2.6.32-46-pve)

root@de-ber ~ # pveversion
pve-manager/3.4-13/4c745357 (running kernel: 2.6.32-46-pve)



root@bg-sof:/var/log/cluster# omping -c 10000 -i 0.001 -F -q 10.8.0.2 10.8.0.1
10.8.0.1 : joined (S,G) = (*, 232.43.211.234), pinging
10.8.0.1 : waiting for response msg
10.8.0.1 : server told us to stop
10.8.0.1 : unicast, xmt/rcv/%loss = 9108/9013/1%, min/avg/max/std-dev = 42.563/43.157/84.611/2.151
10.8.0.1 : multicast, xmt/rcv/%loss = 9108/9013/1%, min/avg/max/std-dev = 42.619/43.211/84.641/2.152

root@de-ber ~ # omping -c 10000 -i 0.001 -F -q 10.8.0.1 10.8.0.2
10.8.0.2 : waiting for response msg
10.8.0.2 : joined (S,G) = (*, 232.43.211.234), pinging
10.8.0.2 : given amount of query messages was sent
10.8.0.2 : unicast, xmt/rcv/%loss = 10000/9893/1%, min/avg/max/std-dev = 42.566/43.131/86.040/1.709
10.8.0.2 : multicast, xmt/rcv/%loss = 10000/9893/1%, min/avg/max/std-dev = 42.632/43.168/86.130/1.708

Before the node hardware upgrade, the old cluster had been running for about 2 years quite stably without any issues, so multicast shouldn't be the problem.

root@bg-sof:/var/log/cluster# omping -c 600 -i 1 -q 10.8.0.2 10.8.0.1
10.8.0.1 : waiting for response msg
10.8.0.1 : joined (S,G) = (*, 232.43.211.234), pinging
10.8.0.1 : given amount of query messages was sent
10.8.0.1 : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 42.867/43.500/51.583/0.781
10.8.0.1 : multicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 42.886/43.541/51.609/0.780


root@de-ber ~ # omping -c 600 -i 1 -q 10.8.0.1 10.8.0.2
10.8.0.2 : waiting for response msg
10.8.0.2 : joined (S,G) = (*, 232.43.211.234), pinging
10.8.0.2 : given amount of query messages was sent
10.8.0.2 : unicast, xmt/rcv/%loss = 600/599/0%, min/avg/max/std-dev = 42.849/43.475/60.508/0.942
10.8.0.2 : multicast, xmt/rcv/%loss = 600/599/0%, min/avg/max/std-dev = 42.874/43.517/60.548/0.942
 
I think the latency is way too big, and so do the corosync devs... (the retransmit list probably points to that as well).
But that aside, do you have the IP of the new node in /etc/hosts, and did you update it if it changed at all?
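
Something like this is what I mean -- only a sketch, with the hostnames and tunnel IPs guessed from your posts (I'm assuming de-ber took over 10.8.0.1), so adjust to your actual setup:
Code:
# /etc/hosts on both nodes - the node names must resolve to the tunnel IPs used for the cluster
10.8.0.2   bg-sof
10.8.0.1   de-ber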

Are you sure the pvecm delnode worked, i.e. did you have quorum at the time you issued it?
The authkeys should be the same, yes.
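
To check whether the delnode really went through, on the remaining node something like:
Code:
pvecm status   # quorum information and expected votes
pvecm nodes    # the removed node should no longer show up here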

I would propose you dismantle the cluster and recreate it, as it seems the deletion of the old node did not quite work and the new node may be seen as the old one. Make a copy of /etc/pve just to be sure and then do something like:

Code:
rm /etc/pve/cluster.conf
service cman stop
service pve-cluster stop
rm /var/lib/pve-cluster/corosync.authkey
rm /var/lib/corosync/ringid_*
service pve-cluster start
# then on one node
pvecm create clustername
# on the other, if you have VMs there use --force
pvecm add IP
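
As for the copy of /etc/pve mentioned above, a minimal sketch (take it before stopping the services; the target paths are just examples):
Code:
# while everything is still running, grab the mounted configuration
cp -a /etc/pve /root/pve-backup
# optionally, after "service pve-cluster stop", also keep the underlying database
cp /var/lib/pve-cluster/config.db /root/config.db.bak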

If you are setting this stuff up anyway, you may think about upgrading to 4.2: newer stack, fixed bugs, and it takes me less time to remember the commands :D
 
Hello again,

Could you give me more details on how to upgrade to 4.2? I'm running Debian 7 Wheezy on both servers. Thanks.
 
Also, could you give me all the steps for deleting the cluster configuration, as the above steps don't seem to work? Thanks!
 
One more detail: I just found that the old node had csync2 installed, but it's missing on the new upgraded server:

root@bg-sof:/etc/init.d# dpkg -l |grep csync2
rc csync2 1.34-2.2+b1 amd64 cluster synchronization tool


Could that be the cause here, what do you think?

PS: Anyway, I've installed it and the same story continues...
 
Also, could you give me all the steps for deleting the cluster configuration, as the above steps don't seem to work? Thanks!

I updated my steps above and tested them; I had the order of the service commands wrong, please test :) And report any errors encountered, so I can help if it fails again :)

One more detail: I just found that the old node had csync2 installed, but it's missing on the new upgraded server:

csync2 is not a part of the Proxmox stack; we use just corosync and pve-cluster for clustering, so no, this cannot be the fault.

For the upgrade questions see:
https://pve.proxmox.com/wiki/Upgrade_from_3.x_to_4.0
 
Hi again,

No problem about the "service" commands, that was obvious :) However, the problem with the upgrade from 3.x to 4.0 is that, as I see it, it needs Debian Jessie, and I cannot upgrade since I'm using OpenMediaVault and it's totally incompatible. Anyway, for me it's obvious that multicast is running fine (I'm using PIMD to achieve this); moreover, the ping between the nodes is only 48ms and stable, and it was running like this on a similar server, maybe with 7-8ms less latency, for quite a long time. The only difference is that at the time I created the cluster it was not the 2.6.32-46-pve kernel but 2.6.32-26-pve.

So, in fact, it would be great if there were a way to somehow troubleshoot what the corosync problem is.
 
Hi again,

No problem about the "service" commands, that was obvious :) However, the problem with the upgrade from 3.x to 4.0 is that, as I see it, it needs Debian Jessie, and I cannot upgrade since I'm using OpenMediaVault and it's totally incompatible.

OK, that's a deal breaker; then we should focus on getting your old setup to work.

Did you try my updated commands? The service change was not the only one; there was also an additional deletion of a corosync key (not strictly necessary, but "just to be sure"), and s/corosync/cman/ changed.

Anyway, for me it's obvious that multicast is running fine (I'm using PIMD to achieve this); moreover, the ping between the nodes is only 48ms and stable, and it was running like this on a similar server, maybe with 7-8ms less latency, for quite a long time. The only difference is that at the time I created the cluster it was not the 2.6.32-46-pve kernel but 2.6.32-26-pve.

So, in fact, it would be great if there were a way to somehow troubleshoot what the corosync problem is.

That's quite surprising to me; to be honest, I find it cool if it really worked for you, but also a little strange. To quote a corosync dev about using it across different subnets [1]:

However, really, don't do this. Corosync, just like all other cluster
communications layers, is very sensitive to network latencies, and
unless you work for someone like the CIA or CERN or NASA, or a large
backbone provider, you're not going to have access to multiple physical
locations connected with near-LAN latency.

I don't know PIMD apart from recognizing its name. You could also try to use unicast (see man corosync.conf), as with a two-node setup this would not have the disadvantages it otherwise has.
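
If you want to give that a try on the 3.x stack: there it is configured in /etc/pve/cluster.conf rather than in corosync.conf directly, roughly like this (from memory, so please double check against the wiki, and remember to bump config_version when editing):
Code:
<!-- switch corosync to UDP unicast instead of multicast -->
<cman transport="udpu" keyfile="/var/lib/pve-cluster/corosync.authkey"/>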

To narrow down the problem:

* Your old node is now away and offline (or wiped and online, but not online in the same state as a PVE node!)
* You used delnode to delete the old node from the remaining cluster, then added the new one.
* It almost instantly failed with a short retransmit list problem and then a failure of the cluster configuration.

So can you please retry my updated steps from above and ensure that the old, now gone node is not online, meaning it is not still in the cluster or something similar. Do those steps on both nodes, i.e. disassemble the whole cluster and recreate it from scratch. I can do that on my testbed without affecting uptime, but I also have a somewhat more ideal setup (LAN connection between the nodes).

[1] http://lists.corosync.org/pipermail/discuss/2011-October/000099.html
 
Well, I'm sorry, looks like I was not clear enough :) I use OpenVPN between the nodes and route the multicast traffic over the tunnel by using PIMD. The old node is turned off, as is the VPN link to it. I tried your suggestion several times, tried to join the first server to the second and the other way around -- the same problem with retransmissions. I removed all the packages and made a clean install on both (e.g. removed all configs like /etc/pve, /var/lib/pve-cluster and /etc/cluster as well). Apparently, the reason for the issue lies beyond any logical explanation I can find here. I even tried to make the /etc/pve filesystem writable with pmxcfs -l and added tweaks like:

<totem window_size="170"/>
<totem token="54000"/>
<multicast addr="224.0.2.1"/>

to the cluster config, unfortunately without any success. Regarding the unicast way (udpu) -- I've read somewhere it's quite inefficient, isn't it?
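
For completeness, this is roughly how I placed those tweaks in /etc/pve/cluster.conf (reconstructed from memory, so the exact placement may be off; I bumped config_version after each edit):
Code:
<totem window_size="170" token="54000"/>
<cman keyfile="/var/lib/pve-cluster/corosync.authkey">
  <multicast addr="224.0.2.1"/>
</cman>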
 
Well, I'm sorry, looks like I was not clear enough :) I use OpenVPN between the nodes and route the multicast traffic over the tunnel by using PIMD. The old node is turned off, as is the VPN link to it. I tried your suggestion several times, tried to join the first server to the second and the other way around -- the same problem with retransmissions. I removed all the packages and made a clean install on both (e.g. removed all configs like /etc/pve, /var/lib/pve-cluster and /etc/cluster as well).

Don't you have some small test system you could use to try the same with two completely freshly set up PVE hosts? This is a little desperate, but I could imagine, given the out-of-sync state shown in your log/status messages, that something is not fully cleaned up. Hard to tell.

Apparently, the reason for the issue lies beyond any logical explanation I can find here.

Seems a little so; if it worked before with this setup I see no reason why it shouldn't anymore.
I could imagine that it does not work in general with such a setup, but as you said it worked, that does not apply here. :)
Without access to the cluster (which I cannot get without at least a basic subscription, work policy) I cannot say/help that much here, I'm afraid.

Regarding the unicast way (udpu) -- I've read somewhere it's quite inefficient, isn't it?

Yes, for a node count > 2: with multicast a message gets transmitted to all nodes without being sent more than once, while with unicast you have to send it to each node directly, meaning more transmissions and network load if you have more than 2 nodes. If you have only two nodes those problems do not affect you, as each node needs to send to only one other member, so the load is the same whether you use multicast or unicast (but only in the two-node case).
 
