[SOLVED] problems with corosync and quorum

r4a5a88

Jun 15, 2016
Hi,

I am in the process of upgrading my Proxmox cluster to Jessie
and cannot add a node.
Corosync keeps failing to start, and I keep getting this:

Jun 21 11:32:29 pro-06-dmed pmxcfs[17721]: [status] crit: cpg_initialize failed: 2
Jun 21 11:32:29 pro-06-dmed pmxcfs[17721]: [dcdb] crit: cpg_initialize failed: 2
Jun 21 11:32:29 pro-06-dmed pmxcfs[17721]: [confdb] crit: cmap_initialize failed: 2
Jun 21 11:32:29 pro-06-dmed pmxcfs[17721]: [quorum] crit: quorum_initialize failed: 2
Jun 21 11:32:23 pro-06-dmed pmxcfs[17721]: [status] crit: cpg_initialize failed: 2
Jun 21 11:32:23 pro-06-dmed pmxcfs[17721]: [dcdb] crit: cpg_initialize failed: 2
Jun 21 11:32:23 pro-06-dmed pmxcfs[17721]: [confdb] crit: cmap_initialize failed: 2
Jun 21 11:32:23 pro-06-dmed pmxcfs[17721]: [quorum] crit: quorum_initialize failed: 2
Jun 21 11:32:17 pro-06-dmed pmxcfs[17721]: [status] crit: cpg_initialize failed: 2
Jun 21 11:32:17 pro-06-dmed pmxcfs[17721]: [dcdb] crit: cpg_initialize failed: 2
Jun 21 11:32:17 pro-06-dmed pmxcfs[17721]: [confdb] crit: cmap_initialize failed: 2
Jun 21 11:32:17 pro-06-dmed pmxcfs[17721]: [quorum] crit: quorum_initialize failed: 2
Jun 21 11:32:11 pro-06-dmed pmxcfs[17721]: [status] crit: cpg_initialize failed: 2
Jun 21 11:32:11 pro-06-dmed pmxcfs[17721]: [dcdb] crit: cpg_initialize failed: 2
Jun 21 11:32:11 pro-06-dmed pmxcfs[17721]: [confdb] crit: cmap_initialize failed: 2
Jun 21 11:32:11 pro-06-dmed pmxcfs[17721]: [quorum] crit: quorum_initialize failed: 2
Jun 21 11:32:06 pro-06-dmed systemd[1]: Unit corosync.service entered failed state.
Jun 21 11:32:06 pro-06-dmed systemd[1]: Failed to start Corosync Cluster Engine.
Jun 21 11:32:06 pro-06-dmed systemd[1]: corosync.service: control process exited, code=exited status=1
Jun 21 11:32:06 pro-06-dmed corosync[17730]: Starting Corosync Cluster Engine (corosync): [FAILED]

in my logs.
I have reinstalled this server twice,
the second time with a new IP.
I ran pvecm e 1 a number of times and restarted the cluster.

What should I do?
 
It seems corosync can't find the private key under

/etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/HTTPServer.pm line 1626.

I added the node with pvecm add xxx.xxx.xxx.xxx --force.
It seems that this does not create the nodes directory.
Can you create it all manually?
 
Thanks, yes I did.
I think 4.x clusters don't have a cluster.conf any more.
I couldn't find one on any other node.

The other thing that happened was:
when I installed the node, all the folders were there (in the /etc/pve/local dir),
but some time after I added the node to the cluster, the folders and files were deleted.

-r--r----- 1 root www-data 159 Jan 1 1970 .clusterlog
-r--r----- 1 root www-data 573 Jun 22 08:19 corosync.conf
-rw-r----- 1 root www-data 2 Jan 1 1970 .debug
lr-xr-xr-x 1 root www-data 0 Jan 1 1970 local -> nodes/pro-06-dmed
lr-xr-xr-x 1 root www-data 0 Jan 1 1970 lxc -> nodes/pro-06-dmed/lxc
-r--r----- 1 root www-data 44 Jan 1 1970 .members
lr-xr-xr-x 1 root www-data 0 Jan 1 1970 openvz -> nodes/pro-06-dmed/openvz
lr-xr-xr-x 1 root www-data 0 Jan 1 1970 qemu-server -> nodes/pro-06-dmed/qemu-server
-r--r----- 1 root www-data 216 Jan 1 1970 .rrd
-r--r----- 1 root www-data 383 Jan 1 1970 .version
-r--r----- 1 root www-data 18 Jan 1 1970 .vmlist

This is what the directory looks like.
The nodes folder is missing.
I cannot recreate it with mkdir as root.
 
This is my main cluster node:
pveversion -v
proxmox-ve: 4.2-54 (running kernel: 4.4.6-1-pve)
pve-manager: 4.2-15 (running version: 4.2-15/6669ad2c)
pve-kernel-4.4.6-1-pve: 4.4.6-48
pve-kernel-4.4.10-1-pve: 4.4.10-54
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 1.0-1
pve-cluster: 4.0-42
qemu-server: 4.0-81
pve-firmware: 1.1-8
libpve-common-perl: 4.0-68
libpve-access-control: 4.0-16
libpve-storage-perl: 4.0-55
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.5-19
pve-container: 1.0-68
pve-firewall: 2.0-29
pve-ha-manager: 1.0-32
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u2
lxc-pve: 1.1.5-7
lxcfs: 2.0.0-pve2
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5-pve9~jessie

This is the node I am trying to add:

proxmox-ve: 4.2-54 (running kernel: 4.4.6-1-pve)
pve-manager: 4.2-15 (running version: 4.2-15/6669ad2c)
pve-kernel-4.4.6-1-pve: 4.4.6-48
pve-kernel-4.4.10-1-pve: 4.4.10-54
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 1.0-1
pve-cluster: 4.0-42
qemu-server: 4.0-81
pve-firmware: 1.1-8
libpve-common-perl: 4.0-68
libpve-access-control: 4.0-16
libpve-storage-perl: 4.0-55
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.5-19
pve-container: 1.0-68
pve-firewall: 2.0-29
pve-ha-manager: 1.0-32
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u2
lxc-pve: 1.1.5-7
lxcfs: 2.0.0-pve2
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5-pve9~jessie
 
I solved it.
The servers were not in the same subnet;
the network was split into 3 subnets.
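
For anyone landing here with the same symptoms: a quick way to rule this out before joining a node is to check that all cluster addresses fall into one network. Below is a minimal sketch in Python, not anything Proxmox ships. The function name same_subnet, the node names, the addresses (taken from documentation ranges) and the /24 prefix are all made up for illustration; substitute the real addresses and prefix length of your cluster interfaces.

```python
import ipaddress


def same_subnet(addresses, prefix_len):
    """Return True if every address lies in the same network of the given prefix length."""
    networks = {
        # ip_interface("a.b.c.d/len").network masks the host bits for us
        ipaddress.ip_interface(f"{addr}/{prefix_len}").network
        for addr in addresses
    }
    return len(networks) == 1


# Hypothetical cluster addresses, for illustration only
nodes = {
    "node-a": "192.0.2.10",
    "node-b": "192.0.2.200",
    "node-c": "198.51.100.7",
}

if same_subnet(nodes.values(), 24):
    print("all nodes share one /24")
else:
    print("nodes are spread over several subnets")
```

With the sample values the check reports several subnets, because node-c sits in a different network. It only compares addresses; it does not test whether corosync traffic actually gets through between the nodes.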
 
