Proxmox 3.1 - Can't add node to existing cluster

starnetwork

Hi,
I just tried to add a new node to an existing cluster (both 3.1, up to date) and got the following error message:
root@server202:~# pvecm add 192.168.10.121 -force
I/O warning : failed to load external entity "/etc/pve/cluster.conf"
ccs_tool: Error: unable to parse requested configuration file

command 'ccs_tool lsnode -c /etc/pve/cluster.conf' failed: exit code 1
unable to add node: command failed (ssh 192.168.10.121 -o BatchMode=yes pvecm addnode server01 --force 1)

Any idea?

Best Regards,
Star Network.
 
What is the output of:

# ccs_tool lsnode -c /etc/pve/cluster.conf

Maybe you can post the contents of /etc/pve/cluster.conf?
 
Hi,
# ccs_tool lsnode -c /etc/pve/cluster.conf:
Cluster name: cloud1, config_version: 17

Nodename Votes Nodeid Fencetype
server04 1 4
server01 1 1
server03 1 3
server02 1 2
server05 1 5

contents of /etc/pve/cluster.conf:
<?xml version="1.0"?>
<cluster name="cloud1" config_version="17">

<cman keyfile="/var/lib/pve-cluster/corosync.authkey">
</cman>

<clusternodes>

<clusternode name="server04" votes="1" nodeid="4"/><clusternode name="server01" votes="1" nodeid="1"/><clusternode name="server03" votes="1" nodeid="3"/><clusternode name="server02" votes="1" nodeid="2"/><clusternode name="server05" votes="1" nodeid="5"/></clusternodes>

</cluster>

Best Regards,
Star Network.
 
What is the content of /etc/hostname on the node you want to add? And why do you use the --force flag?
 
Content of /etc/hostname on the new node:
server06

Why I am using the --force flag:
without it, I got this message: authentication key already exists

Best Regards,
Star Network.
 
Is the pve-cluster service running?

# service pve-cluster restart

And are all files accessible at /etc/pve/ on that new node?
 
On the new node:
1. service pve-cluster restart works; output:
root@server06:~# service pve-cluster restart
Restarting pve cluster filesystem: pve-cluster.

2. Yes, /etc/pve and the files inside are accessible.

Best Regards,
Star Network.
 
BTW,
1. I re-installed the new node 3 times, with the same result every time.
2. When I run the pvecm add command from one of the existing servers, I get:
unable to copy ssh ID
3. When I try to run the command from the new node rather than from a server already in the cluster, I get this error:
Checking if cluster has been disabled at boot... [ OK ]
Checking Network Manager... [ OK ]
Global setup... [ OK ]
Loading kernel modules... [ OK ]
Mounting configfs... [ OK ]
Starting cman... [ OK ]
Waiting for quorum... Timed-out waiting for cluster
[FAILED]
waiting for quorum...

Does this help to analyze the problem?

Best Regards,
Star Network.
 
2. When I run the pvecm add command from one of the existing servers, I get:
unable to copy ssh ID

Oh, that is the wrong way! Never do that.
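
For reference, the supported direction is the one from the first post: run pvecm add on the node that should join, giving it the address of a node that is already part of the cluster, e.g. on the new node:

# pvecm add 192.168.10.121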

3. When I try to run the command from the new node rather than from a server already in the cluster, I get this error:
Checking if cluster has been disabled at boot... [ OK ]
Checking Network Manager... [ OK ]
Global setup... [ OK ]
Loading kernel modules... [ OK ]
Mounting configfs... [ OK ]
Starting cman... [ OK ]
Waiting for quorum... Timed-out waiting for cluster
[FAILED]
waiting for quorum...

This is likely a problem with multicast on the network/switch.

see http://pve.proxmox.com/wiki/Multicast_notes
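
A common way to verify multicast between the nodes is a multicast ping test with omping. A minimal sketch, assuming the omping package is available on Proxmox 3.x (Debian Wheezy) and using this thread's node names as placeholders; start the same command on all nodes at roughly the same time:

# apt-get install omping
# omping server01 server02 server03 server04 server05 server06

Each node reports unicast and multicast loss separately; multicast loss stuck at 100% on a single node usually points at IGMP snooping on that switch port.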
 
Hi Dietmar,
thanks for the answer. Just a question out of curiosity:
all servers are connected to the same switch (my own switch), a Juniper EX4200.
Is it possible that the first 5 servers work with multicast and only the last one doesn't?

Best Regards,
Star Network.
 
all servers are connected to the same switch (my own switch), a Juniper EX4200.
Is it possible that the first 5 servers work with multicast and only the last one doesn't?

yes, such things are possible (the difficult thing with multicast is determining which members belong to a group).
 
We get this issue on two different clusters. They used to work well, but after an upgrade during the 3.0 phase they stopped working.

We checked using asmping and definitely have working multicast. The nodes are all in the same subnet and the switch has IGMP proxy enabled.

For now we have switched to unicast, but it would be good if Proxmox could fix this issue.
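
For reference, the unicast switch on the 3.x cman stack is a one-attribute change in /etc/pve/cluster.conf: add transport="udpu" to the cman tag and increment config_version so the new file can be activated. A sketch modeled on the cluster.conf posted earlier in this thread (not this poster's actual config):

<cluster name="cloud1" config_version="18">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu">
  </cman>
  ...
</cluster>

UDPU sends a separate packet to every node, so it scales worse than multicast; it is a workaround rather than a fix.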
 
Sir,
I have set up a 2-node cluster with Proxmox 3.1
and also checked multicast through omping.
Responses come from both nodes simultaneously.
I have added the nodes (only two nodes) to the cluster using pvecm add <hostname>.
It shows the following messages:

Starting cluster:
Checking if cluster has been disabled at boot... [ OK ]
Checking Network Manager... [ OK ]
Global setup... [ OK ]
Loading kernel modules... [ OK ]
Mounting configfs... [ OK ]
Starting cman... [ OK ]
Waiting for quorum... Timed-out waiting for cluster
[FAILED]

Then I tried to restart cman on every node,
but the same message appears on every node, including the master.
The nodes are showing on the master, so apparently they were added.
But the node color is red; only the master shows green.

Please kindly help me to solve this.
Regards
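
One way to narrow a quorum timeout like this down is to compare what each node thinks the cluster looks like; a sketch using standard Proxmox 3.x commands, run on every node:

# pvecm status
# pvecm nodes

pvecm status prints the quorum state and expected/total votes; pvecm nodes lists the membership as cman sees it. If the two nodes never see each other's votes even though omping works, checking that each node resolves the other's hostname to the cluster network address in /etc/hosts is a reasonable next step.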
 
