new server to existing cluster

naja7host

I have successfully set up a two-server cluster in unicast mode.

Today I tried to add a third server to the cluster, but without success.

I have added all the nodes to /etc/hosts (on all three servers), yet the cluster shows only the first two servers as online.

Code:
root@ns1:~# clustat
Cluster Status for clustervps @ Wed Mar 20 05:26:06 2013
Member Status: Quorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 ns1                                                            1 Online, Local
 ks2                                                            2 Online
 ks3                                                            3 Offline

root@ns1:~# pvecm status
Version: 6.2.0
Config Version: 9
Cluster Name: clustervps
Cluster Id: 42187
Cluster Member: Yes
Cluster Generation: 10712
Membership state: Cluster-Member
Nodes: 3
Expected votes: 1
Total votes: 2
Node votes: 1
Quorum: 2
Active subsystems: 5
Flags:
Ports Bound: 0
Node name: ns206240
Node ID: 1
Multicast addresses: 255.255.255.255
Node addresses: xx.23.x.1xx

On the third node, quorum is never reached:

Code:
root@ks3:~# pvecm status
Version: 6.2.0
Config Version: 9
Cluster Name: clustervps
Cluster Id: 42187
Cluster Member: Yes
Cluster Generation: 10712
Membership state: Cluster-Member
Nodes: 1
Expected votes: 1
Total votes: 1
Node votes: 1
Quorum: 2 Activity blocked
Active subsystems: 1
Flags:
Ports Bound: 0
Node name: ks312315
Node ID: 3
Multicast addresses: 255.255.255.255
Node addresses: xx.165.xx.xx


root@ks3:~# pvecm expected 1


Code:
root@ks3:~# service cman start
Starting cluster:
   Checking if cluster has been disabled at boot... [  OK  ]
   Checking Network Manager... [  OK  ]
   Global setup... [  OK  ]
   Loading kernel modules... [  OK  ]
   Mounting configfs... [  OK  ]
   Starting cman... [  OK  ]
   Waiting for quorum... Timed-out waiting for cluster
[FAILED]

root@ks3:~# service cman stop
Stopping cluster:
   Stopping dlm_controld... [  OK  ]
   Stopping fenced... [  OK  ]
   Stopping cman...
Timed-out waiting for cluster
[FAILED]

root@ks3:~# service apache2 restart
Syntax error on line 13 of /etc/apache2/sites-enabled/pve-redirect.conf:
SSLCertificateFile: file '/etc/pve/local/pve-ssl.pem' does not exist or is empty
Action 'configtest' failed.
The Apache error log may have more information.
 failed!

root@ks3:~# pvecm updatecerts
can't create shared ssh key database '/etc/pve/priv/authorized_keys'
no quorum - unable to update files

root@ks3:~# pvecm expected 1

root@ks3:~# /etc/init.d/pve-cluster start
Starting pve cluster filesystem : pve-clustercan't create shared ssh key database '/etc/pve/priv/authorized_keys'
.

root@ks3:~# clustat
Cluster Status for clustervps @ Wed Mar 20 05:31:30 2013
Member Status: Inquorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 ns1                                                            1 Offline
 ks2                                                            2 Offline
 ks3                                                            3 Online, Local

root@ks3:~# pvecm add xx.23.xx.1xx -force
can't create shared ssh key database '/etc/pve/priv/authorized_keys'
unable to copy ssh ID

On the master node I want to delete node 3, but I get this error:
Code:
root@ns1:~# pvecm delnode ks3
cluster not ready - no quorum?

I have rebooted all nodes.

Any ideas how to fix this?
 
Hello,

I came up against this issue once. I don't remember exactly how I solved it, but I can give you some hints.

First of all, make sure you have the line <cman keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu" expected_votes="1"> in /etc/pve/cluster.conf, so you don't have to manually enter pvecm expected 1.
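For reference, a minimal sketch of how that line could sit in /etc/pve/cluster.conf, using the node names from this thread (remember to bump config_version after every edit; whether expected_votes="1" is a good idea is debated further down in this thread):

Code:
<?xml version="1.0"?>
<cluster name="clustervps" config_version="10">
  <!-- unicast transport plus the shared authkey, as suggested above -->
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu" expected_votes="1"/>
  <clusternodes>
    <clusternode name="ns1" votes="1" nodeid="1"/>
    <clusternode name="ks2" votes="1" nodeid="2"/>
    <clusternode name="ks3" votes="1" nodeid="3"/>
  </clusternodes>
</cluster>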

Secondly, do not include the name of the node itself in /etc/hosts, i.e., ns1 should not appear in ns1:/etc/hosts, ns2 should not appear in ns2:/etc/hosts, and so on. The FQDN of each node should be defined in its local /etc/hostname.
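For example, on ks3 that layout would look something like this (the IPs and the domain are placeholders):

Code:
# ks3:/etc/hosts -- only the OTHER cluster nodes are listed
127.0.0.1   localhost
10.0.0.1    ns1.example.com  ns1
10.0.0.2    ks2.example.com  ks2

# ks3:/etc/hostname -- the node's own FQDN
ks3.example.com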

If you are not able to add a node to the cluster, you can try pvecm add IP_OF_ANY_OTHER_NODE_ALREADY_IN_CLUSTER -force (caution: this might delete all VMs on the node, so it's better to shut down and back up any VMs beforehand), followed by pvecm updatecerts and /etc/init.d/apache2 start.
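Put together, the sequence would look roughly like this, run on the node that fails to join (the IP, the VM ID, and the storage name are placeholders; back up first, since -force may wipe the node's VMs):

Code:
root@ks3:~# vzdump 101 --storage backup   # back up each VM/CT on this node first
root@ks3:~# pvecm add 10.0.0.1 -force     # IP of any node already in the cluster
root@ks3:~# pvecm updatecerts             # regenerate and distribute the SSL certs
root@ks3:~# /etc/init.d/apache2 start     # the web GUI needs those certs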

Let me know if this has worked for you.

Regards!
 
Thanks for your reply and assistance.

I got it working, but I don't know exactly how!

I deleted the node from the cluster and added it again. I tried many different commands, and I also rebooted all the nodes in the cluster.
I also manually set up SSH key authentication from the new node to the other nodes in the cluster.
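In case it helps someone later, that manual SSH key setup can be done roughly like this with the stock OpenSSH tooling (node names are the ones from this thread):

Code:
root@ks3:~# ssh-keygen -t rsa        # only if /root/.ssh/id_rsa does not exist yet
root@ks3:~# ssh-copy-id root@ns1     # repeat for every other node in the cluster
root@ks3:~# ssh-copy-id root@ks2
root@ks3:~# ssh root@ns1 hostname    # verify that passwordless login works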

Next time I will record all the commands/tips.
 
If you set

> expected_votes="1"

in a three-node cluster, all of your nodes always have quorum on their own. This is VERY dangerous and really a bad idea. Re-think this and do not recommend it to anyone.
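A rough sketch of the arithmetic behind this warning (standard cman majority voting, one vote per node):

Code:
# default in a 3-node cluster: expected_votes = 3
#   quorum = floor(expected_votes / 2) + 1 = floor(3/2) + 1 = 2
#   -> an isolated node (1 vote) blocks activity instead of acting alone
#
# with expected_votes="1" forced in cluster.conf:
#   quorum = floor(1/2) + 1 = 1
#   -> every node is quorate by itself; if the network splits, all three
#      sides keep accepting writes independently (split brain)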
 
Yes, Tom is right! If you intend to make use of HA (High Availability), never set expected_votes="1"! I assumed you don't intend to, but I should have asked first.

If you don't want HA, I think you can safely go with expected_votes="1". As far as I know, quorum is nothing but a hassle in that case. (Please someone correct me if I'm wrong.)

Cheers.
 
No, it's not safe; never do this in a cluster.

See http://pve.proxmox.com/wiki/Proxmox_Cluster_file_system_(pmxcfs) - pmxcfs also needs quorum.

That is also right. I won't recommend it anymore, at least not without explaining its pros and cons.

I keep it as a personal preference for clusters without HA (no fencing!). With the default quorum setting, everything stops when cluster communication gets messed up for some reason (power outage, communication problem, simultaneous node reboots, etc.) This means complete interruption of service until manual intervention occurs. With expected_votes="1", on the other hand, there never is an interruption, since every node starts to serve as soon as it's powered on. You may get pmxcfs inconsistencies and lose sync, but that's something you can deal with calmly any time later. Provided that you are always careful enough, I think it's better to avoid service interruption as it's a huge problem.

Regards.
 
There is no service interruption if you lose quorum; VMs or containers will not be stopped.

It's not that hard to set up reliable and redundant cluster communication (use a bond and two switches).
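For instance, an active-backup bond across two NICs, each cabled to a different switch, might look like this in /etc/network/interfaces on a PVE node of that era (interface names and addresses are placeholders):

Code:
# /etc/network/interfaces (excerpt) -- active-backup bond over two switches
auto bond0
iface bond0 inet manual
    slaves eth0 eth1           # one NIC per switch
    bond_mode active-backup    # failover bonding, no switch support required
    bond_miimon 100            # link check interval in milliseconds

auto vmbr0
iface vmbr0 inet static
    address 10.0.0.3
    netmask 255.255.255.0
    gateway 10.0.0.254
    bridge_ports bond0
    bridge_stp off
    bridge_fd 0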
 
There is no service interruption if you lose quorum; VMs or containers will not be stopped.

It's not that hard to set up reliable and redundant cluster communication (use a bond and two switches).

There can be service interruption in some situations. Although VMs already running on a node won't stop, any other node that doesn't find enough quorum when booting up won't be able to start any VM.

Don't get me wrong, I definitely agree that the best practice is to avoid single points of failure and use the default quorum setting. It's just that in some cases it might not be the best choice. For example, when no HA is needed and nodes are distributed across the Internet (which means no 100% reliable communication), when power outages are frequent (power failure can never be completely avoided without spending tons of money ;)), or when you prefer to keep it simple and do things by hand.

Thanks for the clarification and sorry for the off-topic.

Cheers!
 
In the case of a cluster without HA, you always need to take manual action after a node failure.

So you can set expected votes to 1 as a one-off if you need this, once you have analysed what happened and what you are doing (see the sketch below).

But never set this in cluster.conf in the case of a three-node cluster. That is always the wrong way and will cause major issues and data loss in some situations.
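A short sketch of that distinction, using the same command that appears earlier in this thread:

Code:
# acceptable: a one-off manual action after you have analysed the failure
root@ns1:~# pvecm expected 1     # runtime-only; gone after the next cman restart

# NOT acceptable in a 3-node cluster: making it permanent, e.g.
#   <cman ... expected_votes="1">   in /etc/pve/cluster.conf
# that leaves every node permanently quorate on its own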
 
You may get pmxcfs inconsistencies and lose sync, but that's something you can deal with calmly any time later.

Locking is another thing that will not work. So if you use any kind of shared storage, you are playing a dangerous game.
 
Hi, I am having some problems with loss of quorum.
I have 3 nodes in a cluster. I have problems with the power supply: after a blackout, when the power comes back up, some of the nodes do not start correctly. In the cluster log I get "no quorum" and the VMs do not start, and even when the other nodes come up, none of the VMs start. Please, I need help to solve this issue.
Thanks
Orlando
 
There is no service interruption if you lose quorum; VMs or containers will not be stopped.

It's not that hard to set up reliable and redundant cluster communication (use a bond and two switches).
Hi Tom, sorry to hijack this years later - do you have a simple walkthrough for this? I am new to Proxmox and had similar issues with a node after updating to ifupdown2 and changing the node IP, I believe: a broken connection to the cluster, and now I am stuck with 6 NICs not talking correctly. They show up as "no carrier", though I know the switch and cables are good.

Playing with pvecm said no quorum, as expected, so I set expected to 1 and then things started working again. I will correct that after we get the node back into the cluster. I am not sure what corrupted the node and broke its connectivity, but once it is back in I will migrate all the VMs on the local HDD to Ceph and just do a fresh Proxmox install on this server. As I have 30 servers with 6 NICs each, I would like to set up HA and a bond for inter-server communication (data back and forth between VMs). A good walkthrough would be helpful for me; I have not really seen much intro-level material to read up on and explore.
 
