Adding to a 2-server cluster: Waiting for quorum... Timed-out waiting for cluster

gutter007

New Member
May 1, 2012
Issues adding two servers to an existing cluster

We have two servers already in a cluster, which was originally configured as a two-node cluster. It is in production, with 10 VMs running on each node:
aurora
Code:
root@aurora:~# pveversion -v
proxmox-ve-2.6.32: 3.1-109 (running kernel: 2.6.32-23-pve)
pve-manager: 3.1-3 (running version: 3.1-3/dc0e9b0e)
pve-kernel-2.6.32-20-pve: 2.6.32-100
pve-kernel-2.6.32-22-pve: 2.6.32-107
pve-kernel-2.6.32-23-pve: 2.6.32-109
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-1
pve-cluster: 3.0-7
qemu-server: 3.1-1
pve-firmware: 1.0-23
libpve-common-perl: 3.0-6
libpve-access-control: 3.0-6
libpve-storage-perl: 3.0-10
pve-libspice-server1: 0.12.4-1
vncterm: 1.1-4
vzctl: 4.0-1pve3
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-17
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.0-2
Code:
root@aurora:~# pvecm status
Version: 6.2.0
Config Version: 6
Cluster Name: dii-san-cluster
Cluster Id: 24244
Cluster Member: Yes
Cluster Generation: 408
Membership state: Cluster-Member
Nodes: 2
Expected votes: 2
Total votes: 2
Node votes: 1
Quorum: 2
Active subsystems: 5
Flags:
Ports Bound: 0
Node name: aurora
Node ID: 1
Multicast addresses: 239.192.94.19
Node addresses: 10.10.40.10
blooming
Code:
root@blooming:~# pveversion -v
proxmox-ve-2.6.32: 3.1-109 (running kernel: 2.6.32-23-pve)
pve-manager: 3.1-3 (running version: 3.1-3/dc0e9b0e)
pve-kernel-2.6.32-20-pve: 2.6.32-100
pve-kernel-2.6.32-22-pve: 2.6.32-107
pve-kernel-2.6.32-23-pve: 2.6.32-109
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-1
pve-cluster: 3.0-7
qemu-server: 3.1-1
pve-firmware: 1.0-23
libpve-common-perl: 3.0-6
libpve-access-control: 3.0-6
libpve-storage-perl: 3.0-10
pve-libspice-server1: 0.12.4-1
vncterm: 1.1-4
vzctl: 4.0-1pve3
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-17
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.0-2
Code:
root@blooming:~# pvecm status
Version: 6.2.0
Config Version: 6
Cluster Name: dii-san-cluster
Cluster Id: 24244
Cluster Member: Yes
Cluster Generation: 408
Membership state: Cluster-Member
Nodes: 2
Expected votes: 2
Total votes: 2
Node votes: 1
Quorum: 2
Active subsystems: 5
Flags:
Ports Bound: 0
Node name: blooming
Node ID: 2
Multicast addresses: 239.192.94.19
Node addresses: 10.10.40.11
We have two new servers ready to add to the cluster; they run a newer version and do not have any VMs on them yet. The first is strongest:
Code:
root@strongest:~# pveversion -v
proxmox-ve-2.6.32: 3.1-114 (running kernel: 2.6.32-26-pve)
pve-manager: 3.1-24 (running version: 3.1-24/060bd5a6)
pve-kernel-2.6.32-26-pve: 2.6.32-114
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-2
pve-cluster: 3.0-8
qemu-server: 3.1-8
pve-firmware: 1.0-23
libpve-common-perl: 3.0-9
libpve-access-control: 3.0-8
libpve-storage-perl: 3.0-18
pve-libspice-server1: 0.12.4-2
vncterm: 1.1-6
vzctl: 4.0-1pve4
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-17
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.1-1
Code:
pvecm status
Version: 6.2.0
Config Version: 6
Cluster Name: dii-san-cluster
Cluster Id: 24244
Cluster Member: Yes
Cluster Generation: 48
Membership state: Cluster-Member
Nodes: 2
Expected votes: 4
Total votes: 2
Node votes: 1
Quorum: 3 Activity blocked
Active subsystems: 1
Flags:
Ports Bound: 0
Node name: strongest
Node ID: 3
Multicast addresses: 239.192.94.19
Node addresses: 10.10.10.78

bolivar
Code:
root@bolivar:~# pveversion -v
proxmox-ve-2.6.32: 3.1-114 (running kernel: 2.6.32-26-pve)
pve-manager: 3.1-24 (running version: 3.1-24/060bd5a6)
pve-kernel-2.6.32-26-pve: 2.6.32-114
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-2
pve-cluster: 3.0-8
qemu-server: 3.1-8
pve-firmware: 1.0-23
libpve-common-perl: 3.0-9
libpve-access-control: 3.0-8
libpve-storage-perl: 3.0-18
pve-libspice-server1: 0.12.4-2
vncterm: 1.1-6
vzctl: 4.0-1pve4
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-17
ksm-control-daemon: 1.1-1
Code:
root@bolivar:~# pvecm status
Version: 6.2.0
Config Version: 6
Cluster Name: dii-san-cluster
Cluster Id: 24244
Cluster Member: Yes
Cluster Generation: 48
Membership state: Cluster-Member
Nodes: 2
Expected votes: 4
Total votes: 2
Node votes: 1
Quorum: 3 Activity blocked
Active subsystems: 5
Flags:
Ports Bound: 0
Node name: bolivar
Node ID: 4
Multicast addresses: 239.192.94.19
Node addresses: 10.10.10.79
At some point along the line I changed the expected votes to 1 in order to add the new servers (roughly what I ran is shown after the output below). However, after the change I see the following when I run the pvecm nodes command:
aurora/blooming
Code:

Node  Sts   Inc   Joined               Name
   1   M    408   2014-02-11 14:57:00  aurora
   2   M    396   2014-02-11 14:54:27  blooming
   3   X      0                        strongest
   4   X      0                        bolivar
strongest/bolivar
Code:
root@strongest:~# pvecm nodes
Node  Sts   Inc   Joined               Name
   1   X      0                        aurora
   2   X      0                        blooming
   3   M     44   2014-02-11 15:15:40  strongest
   4   M     48   2014-02-11 15:16:04  bolivar
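For reference, this is roughly what I ran to lower the expected votes and join the new nodes (a sketch from memory; the exact node and IP used with pvecm add may have been different):
Code:
# on one of the existing nodes (aurora or blooming), so the 2-node cluster stays quorate
pvecm expected 1

# on each new node (strongest, then bolivar), join via an existing cluster node's IP
pvecm add 10.10.40.10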
Also, I'm no longer able to access strongest/bolivar directly via the web interface. I can access aurora/blooming via the web GUI, but when I click on the strongest/bolivar nodes I get a broken pipe (596) error.

I also rebooted strongest, and got the following in the boot log:
Code:
Tue Feb 11 15:15:39 2014: Starting cluster:
Tue Feb 11 15:15:39 2014:    Checking if cluster has been disabled at boot... [  OK  ]
Tue Feb 11 15:15:39 2014:    Checking Network Manager... [  OK  ]
Tue Feb 11 15:15:39 2014:    Global setup... [  OK  ]
Tue Feb 11 15:15:39 2014:    Loading kernel modules... [  OK  ]
Tue Feb 11 15:15:39 2014:    Mounting configfs... [  OK  ]
Tue Feb 11 15:15:39 2014:    Starting cman... [  OK  ]
Tue Feb 11 15:15:44 2014:    Waiting for quorum... Timed-out waiting for cluster
Tue Feb 11 15:16:28 2014: [FAILED]
I have tested multicasting, and it is enabled on our router.
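For what it's worth, a quick way to double-check multicast between all four nodes is something like omping, assuming the package is available on every node (the hostnames below need to resolve; otherwise use IPs):
Code:
apt-get install omping
# run on all four nodes at roughly the same time
omping -c 600 -i 1 -q aurora blooming strongest bolivar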

Does anyone have a way to get all 4 servers into the same cluster, hopefully without rebooting aurora/blooming?


thanks.
myles.
 
Hi myles,
you didn't write anything about your netmask. With a /24 netmask the nodes are not in the same subnet (10.10.40.0 and 10.10.10.0), which is necessary.
But perhaps you have a /16 netmask?
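A quick check on each node could be something like this (interface name may differ on your machines):
Code:
ip addr show vmbr0 | grep inet
ip route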

Udo
 
I think that's the problem.

All of the servers are connected to 4 different vlans.

I want cluster communication to be on the management LAN, 10.10.40.*, like aurora and blooming.

However, the new servers took their 10.10.10.* IPs by default. Is there a way to define the IP you want a server to use when adding it to a cluster? I don't see an option for that in pvecm.

thanks.
myles.
 
Hi,
normally the IP of vmbr0 is used for cluster communication (the IP that was set during installation and is also written in /etc/hosts).
AFAIK, if you don't have a vmbr0, the IP of eth0 is used.

BTW, why do you have IPs in all of the subnets? Normally it's enough to have a bridge for the additional subnets without an IP on it.
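If you want the new nodes to use the 10.10.40.x management LAN for the cluster, one way (just a sketch; the address and domain below are only placeholders) is to make sure the node's hostname resolves to that IP before it joins, e.g. in /etc/hosts on strongest:
Code:
# /etc/hosts on strongest -- example entry, adjust address and domain to your setup
10.10.40.12   strongest.example.local strongest

# check which IP the node resolves for its own hostname
hostname --ip-address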

Udo
 
Waiting for quorum... Timed-out waiting for cluster
Tue Feb 11 15:16:28 2014: [FAILED]

Modify the /etc/init.d/cman init script:

vim /etc/init.d/cman

and change

CMAN_QUORUM_TIMEOUT=45

to

CMAN_QUORUM_TIMEOUT=0

Then restart cman.
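Roughly, something like this (a sketch; back up the init script first):
Code:
cp /etc/init.d/cman /etc/init.d/cman.bak
sed -i 's/CMAN_QUORUM_TIMEOUT=45/CMAN_QUORUM_TIMEOUT=0/' /etc/init.d/cman
/etc/init.d/cman restart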
 
 
