Proxmox doesn't work after adding node to a cluster

dgallig

Hi all
I added a node to a cluster, and since then I have this error:
/etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/HTTPServer.pm line 1639.

As I read in the forum, I tried:
#pvecm expected 1
Cannot initialize CMAP service

and
#pvecm updatecerts
no quorum - unable to update files


(syslog tail)
Mar 31 22:20:34 kvm02 pmxcfs[17309]: [quorum] crit: quorum_initialize failed: 2
Mar 31 22:20:34 kvm02 pmxcfs[17309]: [confdb] crit: cmap_initialize failed: 2
Mar 31 22:20:34 kvm02 pmxcfs[17309]: [dcdb] crit: cpg_initialize failed: 2
Mar 31 22:20:34 kvm02 pmxcfs[17309]: [status] crit: cpg_initialize failed: 2

pveversion -v
proxmox-ve: 4.1-41 (running kernel: 4.2.8-1-pve)
pve-manager: 4.1-22 (running version: 4.1-22/aca130cf)
pve-kernel-4.2.8-1-pve: 4.2.8-41
pve-kernel-2.6.32-39-pve: 2.6.32-157
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 1.0-1
pve-cluster: 4.0-33
qemu-server: 4.0-64
pve-firmware: 1.1-7
libpve-common-perl: 4.0-54
libpve-access-control: 4.0-13
libpve-storage-perl: 4.0-42
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.5-9
pve-container: 1.0-52
pve-firewall: 2.0-22
pve-ha-manager: 1.0-25
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.5-7
lxcfs: 2.0.0-pve2
cgmanager: 0.39-pve1
criu: 1.6.0-1


Any ideas?

Many thanks in advance.
 
Hello,

I have the same problem!

proxmox-ve: 4.1-48 (running kernel: 4.4.6-1-pve)
pve-manager: 4.1-33 (running version: 4.1-33/de386c1a)
pve-kernel-4.4.6-1-pve: 4.4.6-48
pve-kernel-4.2.6-1-pve: 4.2.6-36
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 1.0-1
pve-cluster: 4.0-39
qemu-server: 4.0-71
pve-firmware: 1.1-8
libpve-common-perl: 4.0-59
libpve-access-control: 4.0-16
libpve-storage-perl: 4.0-50
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.5-13
pve-container: 1.0-61
pve-firewall: 2.0-25
pve-ha-manager: 1.0-28
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.5-7
lxcfs: 2.0.0-pve2
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5-pve9~jessie
 
What is going on here?!

Apr 22 19:49:02 ns3014725 corosync[8812]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Apr 22 19:49:02 ns3014725 corosync[8812]: [QB ] server name: cpg
Apr 22 19:49:02 ns3014725 corosync[8812]: [SERV ] Service engine loaded: corosync profile loading service [4]
Apr 22 19:49:02 ns3014725 corosync[8812]: [QUORUM] Using quorum provider corosync_votequorum
Apr 22 19:49:02 ns3014725 corosync[8812]: [QUORUM] Quorum provider: corosync_votequorum failed to initialize.
Apr 22 19:49:02 ns3014725 corosync[8812]: [SERV ] Service engine 'corosync_quorum' failed to load for reason 'configuration error: nodelist or quorum.expected_votes must be configured!'
Apr 22 19:50:03 ns3014725 corosync[8806]: Starting Corosync Cluster Engine (corosync): [FAILED]
Apr 22 19:50:03 ns3014725 systemd[1]: corosync.service: control process exited, code=exited status=1
Apr 22 19:50:03 ns3014725 systemd[1]: Failed to start Corosync Cluster Engine.
Apr 22 19:50:03 ns3014725 systemd[1]: Unit corosync.service entered failed state.

Apr 22 20:04:57 ns3014725 pveproxy[11000]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/HTTPServer.pm line 1639.
Apr 22 20:04:59 ns3014725 pmxcfs[7982]: [quorum] crit: quorum_initialize failed: 2
Apr 22 20:04:59 ns3014725 pmxcfs[7982]: [confdb] crit: cmap_initialize failed: 2
Apr 22 20:04:59 ns3014725 pmxcfs[7982]: [dcdb] crit: cpg_initialize failed: 2
Apr 22 20:04:59 ns3014725 pmxcfs[7982]: [status] crit: cpg_initialize failed: 2

root@ns3014725:~# pveversion -v
proxmox-ve: 4.1-39 (running kernel: 4.2.8-1-pve)
pve-manager: 4.1-22 (running version: 4.1-22/aca130cf)
pve-kernel-4.2.6-1-pve: 4.2.6-36
pve-kernel-4.2.8-1-pve: 4.2.8-39
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 1.0-1
pve-cluster: 4.0-36
qemu-server: 4.0-64
pve-firmware: 1.1-7
libpve-common-perl: 4.0-54
libpve-access-control: 4.0-13
libpve-storage-perl: 4.0-45
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.5-9
pve-container: 1.0-52
pve-firewall: 2.0-22
pve-ha-manager: 1.0-25
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.5-7
lxcfs: 2.0.0-pve2
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5-pve7~jessie
 
Code:
Apr 22 19:49:02 ns3014725 corosync[8812]: [SERV ] Service engine 'corosync_quorum' failed to load for reason 'configuration error: nodelist or quorum.expected_votes must be configured!'

What do your /etc/pve/corosync.conf and /etc/corosync/corosync.conf look like?
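The error message says corosync wants either a nodelist or quorum.expected_votes. As a rough way to check a config file for that, here is a minimal Python sketch. It assumes you have already read the file into a string; the function name is made up for illustration, and the regex check is far simpler than corosync's own parser.

```python
import re


def has_quorum_config(conf_text: str) -> bool:
    """Return True if the text contains a nodelist with at least one node
    block, or an expected_votes setting (hypothetical helper, not part of
    any Proxmox or corosync tool)."""
    # Drop '#' comments so commented-out settings are not counted.
    lines = [line.split("#", 1)[0] for line in conf_text.splitlines()]
    text = "\n".join(lines)

    # A "nodelist {" section that opens at least one "node {" block.
    has_node = re.search(r"\bnodelist\s*\{[^}]*\bnode\s*\{", text) is not None
    # An "expected_votes: N" line anywhere in the file.
    has_expected = (
        re.search(r"^\s*expected_votes\s*:\s*\d+", text, re.MULTILINE) is not None
    )
    return has_node or has_expected
```

If this returns False for the file corosync actually reads on the failing node, that would match the "must be configured" error in the log.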
 
I followed the procedure described here:

On the first server, which has been in production for a while and has never been part of a cluster, I ran this command:
hp1# pvecm create inodbox

The result of "pvecm status":
Quorum information
------------------
Date: Sat Apr 23 08:40:32 2016
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000001
Ring ID: 16
Quorate: No

Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 1
Quorum: 2 Activity blocked
Flags:

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 5.135.x.x (local)
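To make the key numbers in that output easier to compare between nodes, here is a small Python sketch that pulls them out of the text. It assumes the output has the "Key: value" layout shown above; the function names are invented for this example and nothing here calls pvecm itself.

```python
def parse_status(output: str) -> dict:
    """Collect 'Key: value' lines from pvecm status text into a dict."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip headings, separators and the membership table
            fields[key.strip()] = value.strip()
    return fields


def summarize_status(output: str) -> dict:
    """Return the quorum-related values (hypothetical helper)."""
    fields = parse_status(output)
    expected = int(fields["Expected votes"])
    total = int(fields["Total votes"])
    return {
        "quorate": fields.get("Quorate", "").lower() == "yes",
        "expected_votes": expected,
        "total_votes": total,
        "missing_votes": max(expected - total, 0),
    }
```

For the output above this would report one missing vote: the node expects 2 votes but only sees its own.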

On the new second server, I ran this command:
hp2# pvecm add 5.135.x.x

and I got the errors mentioned above.

The content of corosync.conf (identical on both servers):

logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: ns3006172
    nodeid: 1
    quorum_votes: 1
    ring0_addr: ns3006172
  }

  node {
    name: ns3014725
    nodeid: 2
    quorum_votes: 1
    ring0_addr: ns3014725
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: inodbox
  config_version: 5
  ip_version: ipv4
  secauth: on
  version: 2
  interface {
    bindnetaddr: 5.135.137.xxx
    ringnumber: 0
  }
}
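Since ring0_addr uses hostnames here, one thing that may be worth checking is whether each node can resolve them. The Python sketch below is only an illustration: it assumes the config text is already in a string, the function names are hypothetical, and a name that resolves may still point to the wrong interface.

```python
import re
import socket


def ring0_addresses(conf_text: str) -> list:
    """Return every ring0_addr value found in the config text."""
    return re.findall(r"^\s*ring0_addr\s*:\s*(\S+)", conf_text, re.MULTILINE)


def unresolved(addresses, resolve=socket.gethostbyname) -> list:
    """Return the addresses that the resolver cannot look up.

    `resolve` defaults to the system resolver and can be swapped out,
    for example to test without touching DNS or /etc/hosts.
    """
    failed = []
    for address in addresses:
        try:
            resolve(address)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(address)
    return failed
```

Running `unresolved(ring0_addresses(text))` on each node would list the names that node cannot look up; an empty list only means the names resolve, not that they resolve to the cluster network.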
 
And since then, my backups on the first server fail!
100: Apr 23 04:00:02 INFO: Starting Backup of VM 100 (lxc)
100: Apr 23 04:00:02 INFO: status = running
100: Apr 23 04:00:02 ERROR: Backup of VM 100 failed - unable to open file '/etc/pve/nodes/ns3006172/lxc/100.conf.tmp.14226' - Permission denied

I confess that I am a little tired of Proxmox 4. Compared with the previous release, I have had nothing but problems!
I find it extremely unstable!
 
You simply have no quorum, and therefore activity is blocked and you can't write to /etc/pve. It looks like you are using your WAN IP for the Proxmox VE cluster. Are you sure your ISP allows multicast traffic? Did you test with omping?
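To spell out the arithmetic behind "no quorum": votequorum needs a strict majority of the expected votes. A minimal Python sketch of that rule, with function names made up for the example, and ignoring special votequorum options such as two_node:

```python
def quorum_threshold(expected_votes: int) -> int:
    """Votes needed for a strict majority of the expected votes."""
    return expected_votes // 2 + 1


def is_quorate(total_votes: int, expected_votes: int) -> bool:
    """True when the visible votes reach the majority threshold."""
    return total_votes >= quorum_threshold(expected_votes)
```

With "Expected votes: 2" and "Total votes: 1", the threshold is 2, so a node that only sees itself is not quorate. It is also why "pvecm expected 1" is suggested as a workaround: with 1 expected vote, the threshold drops to 1.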

I have experience with both Proxmox VE 3.x and 4.x, and I think PVE 4.x is much more stable. In fact, I think it's rock-solid.
 
OK, thank you @wosp.
I changed the IP and now corosync starts on the second node, but... :-)
I still get this error in syslog:

Apr 23 10:31:55 ns3014725 pveproxy[4622]: worker 3973 finished
Apr 23 10:31:55 ns3014725 pveproxy[4622]: starting 1 worker(s)
Apr 23 10:31:55 ns3014725 pveproxy[4622]: worker 4062 started
Apr 23 10:31:55 ns3014725 pveproxy[4062]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/HTTPServer.pm line 1639.

and the result of "pvecm status" :

server 1:
Quorum information
------------------
Date: Sat Apr 23 10:33:07 2016
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000001
Ring ID: 4
Quorate: No

Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 1
Quorum: 2 Activity blocked
Flags:

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 172.31.255.240 (local)

server 2 (new node):
Quorum information
------------------
Date: Sat Apr 23 10:33:15 2016
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000002
Ring ID: 4
Quorate: No

Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 1
Quorum: 2 Activity blocked
Flags:

Membership information
----------------------
Nodeid Votes Name
0x00000002 1 172.31.255.250 (local)


Thank you for your help.