Can't re-add node of a failed server

starnetwork

Renowned Member
Dec 8, 2009
Hi,
I have a cluster with a few nodes.
One of the servers in the cluster crashed, and I removed it manually from the cluster with
pvecm delnode server202

After that, I got a new hard drive, reinstalled this node, and tried to run
root@server204:~# pvecm add 192.168.10.31
authentication key already exists
I added -f and then got
root@server204:~# pvecm add 192.168.10.31 -f
root@192.168.10.31's password:
unable to copy ssh ID

Any advice?

Regards,
 
Hello,
to clarify: 192.168.10.31 is a node that is already in the cluster, and "server204" is the reinstalled node you want to add?

Did you restore any backed-up data on the reinstalled node, or is it a fresh installation?
 
1. No, server204 is already in the cluster and 192.168.10.31 is the reinstalled new node.

2. I didn't restore any data on 192.168.10.31; it's a fresh Proxmox installation.

Regards,
 
Then you have to do it the other way around.
Log in to the reinstalled server (192.168.10.31), bring it up to date (so it has the same package versions as the cluster nodes), and execute:
Code:
pvecm add <ip_address_to_a_cluster_node>

E.g., if a node 192.168.15.30 is already in the cluster, run:
Code:
pvecm add 192.168.15.30
 
Thanks again!
You're right, but how can I fix it?
It looks like this server is connected on the internal network:
root@server202:~# traceroute 192.168.10.101
traceroute to 192.168.10.101 (192.168.10.101), 30 hops max, 60 byte packets
1 192.168.10.101 (192.168.10.101) 0.181 ms 0.178 ms 0.175 ms
root@server202:~# ping 192.168.10.101
PING 192.168.10.101 (192.168.10.101) 56(84) bytes of data.
64 bytes from 192.168.10.101: icmp_req=1 ttl=64 time=0.219 ms
64 bytes from 192.168.10.101: icmp_req=2 ttl=64 time=0.167 ms
64 bytes from 192.168.10.101: icmp_req=3 ttl=64 time=0.155 ms
^C
--- 192.168.10.101 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.155/0.180/0.219/0.029 ms
root@server202:~# omping 192.168.10.101
omping: Can't find local address in arguments
root@server202:~#
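
A note on the error above: omping expects the addresses of all hosts taking part in the test, including the local one, which is why a call with only the remote address fails with "Can't find local address in arguments". Below is a minimal sketch of building the same address list for every node. The helper name omping_cmd is made up for illustration, it only prints the command instead of running it, and the addresses are the two used in this thread.

```sh
#!/bin/sh
# Hypothetical helper: print the omping command to run on one node.
# $1 is the node's own address; the remaining arguments are the full
# list of addresses taking part in the test.
omping_cmd() {
    local_ip=$1
    shift
    for ip in "$@"; do
        if [ "$ip" = "$local_ip" ]; then
            # The local address is in the list, so omping can find it.
            echo "omping $*"
            return 0
        fi
    done
    echo "local address $local_ip is missing from the list" >&2
    return 1
}

# Addresses taken from this thread; adjust to your own nodes.
omping_cmd 192.168.10.31 192.168.10.31 192.168.10.101
```

The printed command is then started on every participating node at roughly the same time, so that each node can see the others' multicast traffic.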
 
Hi Udo,
1. None of my servers is in 192.168.15.x, only in 192.168.10.x. Why should I run it for 192.168.15.x?
2. I should run the same command on both servers, no?
# on server202
omping 192.168.10.31 192.168.10.101

# on the other server
omping 192.168.10.101 192.168.10.31
3. Can I run it only on the new (rebuilt) server and the existing server I connect to, or should I run it on all the cluster nodes?
4. Can you please explain more about omping? Why do we need it?
 
Hi,
Sorry, I looked in the other post for the second IP and found 192.168.15.x in Thomas's post. I didn't realize that it was only an example.

Use your correct IPs.

Udo
 
I did.
I opened 3 SSH windows:
1. On server204, running
omping 192.168.10.101 192.168.10.31
2. On server202, first running
omping 192.168.10.31 192.168.10.101
Both synced. Then, as the second command on server202, I ran:
pvecm add 192.168.10.101 -f
and got

Starting cman... [ OK ]
Waiting for quorum... Timed-out waiting for cluster
[FAILED]
waiting for quorum...

Any advice?
 
Hi,
what does your cluster config look like?

On a running cluster member:
Code:
cat /etc/pve/cluster.conf
pvecm stat
and on your server202:
Code:
cat /etc/pve/cluster.conf
pvecm stat
Udo
 
For server202:
root@server202:~# pvecm stat
Version: 6.2.0
Config Version: 24
Cluster Name: cloud1
Cluster Id: 6501
Cluster Member: Yes
Cluster Generation: 12
Membership state: Cluster-Member
Nodes: 1
Expected votes: 4
Total votes: 1
Node votes: 1
Quorum: 3 Activity blocked
Active subsystems: 1
Flags:
Ports Bound: 0
Node name: server202
Node ID: 1
Multicast addresses: InternetIP
Node addresses: InternetIP

for server204:
root@server204:~# pvecm stat
Version: 6.2.0
Config Version: 26
Cluster Name: cloud1
Cluster Id: 6501
Cluster Member: Yes
Cluster Generation: 113200
Membership state: Cluster-Member
Nodes: 4
Expected votes: 4
Total votes: 4
Node votes: 1
Quorum: 3
Active subsystems: 5
Flags:
Ports Bound: 0
Node name: server204
Node ID: 4
Multicast addresses: InternetIP
Node addresses: InternalIP


*** On the second node, the node address is InternalIP and not InternetIP ***
The cluster file exists and is the same on both.
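
The "Activity blocked" state on server202 follows from the vote counts in the output above: with 4 expected votes the cluster needs 3 for quorum, and server202 only sees its own vote. Here is a small sketch of that arithmetic, assuming the usual majority rule (expected votes divided by two, rounded down, plus one). The function names are illustrative only.

```sh
#!/bin/sh
# Votes needed for quorum, assuming a simple majority of the expected votes.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Succeeds when the votes seen ($1) reach quorum for the expected votes ($2).
has_quorum() {
    [ "$1" -ge "$(quorum "$2")" ]
}

echo "quorum for 4 expected votes: $(quorum 4)"

# server202 reports "Total votes: 1", server204 reports "Total votes: 4".
if has_quorum 1 4; then echo "server202: quorate"; else echo "server202: blocked"; fi
if has_quorum 4 4; then echo "server204: quorate"; else echo "server204: blocked"; fi
```

This matches the "Quorum: 3" line that both nodes report.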
 
Hi,
it's hard to help you if you don't provide all the info.

In my last post I asked for the contents of /etc/pve/cluster.conf.
I'm sure you know your setup; I don't. Without the info I can't help you, but perhaps others in this forum can...

What is strange is
Code:
Nodes: 4
Expected votes: 4
Total votes: 4
Node votes: 1
Quorum: 3
Active subsystems: 5
Active subsystems should be the same as Nodes (but not more).

Take a look at your hosts files and at /etc/pve/priv/authorized_keys (check that the keys match).


Bye

Udo
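
One way to check the hosts files is to see which address each node name maps to on every node. This is a minimal sketch, assuming the standard hosts format (address first, then names). hosts_lookup is a hypothetical helper, and the domain in the sample entries is a placeholder, not taken from a real node.

```sh
#!/bin/sh
# Hypothetical helper: print the address(es) a name maps to in a
# hosts-format file. $1 = path to the file, $2 = host name to look for.
hosts_lookup() {
    awk -v name="$2" '
        /^[[:space:]]*#/ { next }            # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break         # ignore trailing comments
                if ($i == name) { print $1; break }
            }
        }
    ' "$1"
}

# Demo on a throwaway file; on a real node use: hosts_lookup /etc/hosts server204
sample=$(mktemp)
printf '%s\n' \
    '127.0.0.1 localhost' \
    '192.168.10.101 server204.example.local server204' > "$sample"
hosts_lookup "$sample" server204
rm -f "$sample"
```

On a running system, `getent hosts server204` shows what the name actually resolves to, which may also involve sources other than the hosts file.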
 
Hi Udo,
sorry, here are the full details:
server202, the new cluster member:
cluster.conf file:
<?xml version="1.0"?>
<cluster name="cloud1" config_version="26">

<cman keyfile="/var/lib/pve-cluster/corosync.authkey">
</cman>

<clusternodes>
<clusternode name="server204" votes="1" nodeid="4"/>
<clusternode name="server205" votes="1" nodeid="5"/>
<clusternode name="server206" votes="1" nodeid="6"/>
<clusternode name="server202" votes="1" nodeid="1"/>
</clusternodes>

</cluster>

pvecm stat:
Version: 6.2.0
Config Version: 24
Cluster Name: cloud1
Cluster Id: 6501
Cluster Member: Yes
Cluster Generation: 12
Membership state: Cluster-Member
Nodes: 1
Expected votes: 4
Total votes: 1
Node votes: 1
Quorum: 3 Activity blocked
Active subsystems: 1
Flags:
Ports Bound: 0
Node name: server202
Node ID: 1
Multicast addresses: 88.75.25.126
Node addresses: 5.9.211.228

server204, existing cluster member:
cluster.conf file:
<?xml version="1.0"?>
<cluster name="cloud1" config_version="26">

<cman keyfile="/var/lib/pve-cluster/corosync.authkey">
</cman>

<clusternodes>
<clusternode name="server204" votes="1" nodeid="4"/>
<clusternode name="server205" votes="1" nodeid="5"/>
<clusternode name="server206" votes="1" nodeid="6"/>
<clusternode name="server202" votes="1" nodeid="1"/>
</clusternodes>

</cluster>

pvecm stat:
Version: 6.2.0
Config Version: 26
Cluster Name: cloud1
Cluster Id: 6501
Cluster Member: Yes
Cluster Generation: 113200
Membership state: Cluster-Member
Nodes: 4
Expected votes: 4
Total votes: 4
Node votes: 1
Quorum: 3
Active subsystems: 5
Flags:
Ports Bound: 0
Node name: server204
Node ID: 4
Multicast addresses: 88.75.25.126
Node addresses: 192.168.10.101

Regards,
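
One detail that can be read off the output above: on server202 the cluster.conf file says config_version="26" while pvecm stat reports "Config Version: 24", whereas on server204 both say 26. Below is a small sketch for pulling the two numbers out so they can be compared. The helper names are made up, and the sed patterns assume exactly the layouts shown in this thread.

```sh
#!/bin/sh
# Hypothetical helpers; both read text on stdin.

# Extract config_version from the <cluster ...> element of cluster.conf.
conf_version() {
    sed -n 's/.*<cluster [^>]*config_version="\([0-9][0-9]*\)".*/\1/p'
}

# Extract the "Config Version" value from pvecm stat output.
running_version() {
    sed -n 's/^Config Version: *\([0-9][0-9]*\).*/\1/p'
}

# On a node this would be:
#   conf_version < /etc/pve/cluster.conf
#   pvecm stat | running_version
# Here the values posted for server202 are used as sample input.
file_v=$(printf '%s\n' '<cluster name="cloud1" config_version="26">' | conf_version)
run_v=$(printf '%s\n' 'Config Version: 24' | running_version)

if [ "$file_v" = "$run_v" ]; then
    echo "versions match ($file_v)"
else
    echo "mismatch: file has $file_v, running config has $run_v"
fi
```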