Cluster creation fails

Remuz

New Member
Apr 5, 2016
Running the command
Code:
pvecm create MyAwesomeCluster

Gives me the following error:
Code:
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/urandom.
Writing corosync key to /etc/corosync/authkey.
Job for corosync.service failed. See 'systemctl status corosync.service' and 'journalctl -xn' for details.
command 'systemctl restart corosync' failed: exit code 1

Code:
root@1 ~ # systemctl status corosync.service
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/lib/systemd/system/corosync.service; enabled)
   Active: failed (Result: exit-code) since Wed 2016-06-22 20:23:39 CEST; 1min 3s ago
  Process: 1783 ExecStart=/usr/share/corosync/corosync start (code=exited, status=1/FAILURE)

Jun 22 20:22:39 1.de.rem.uz corosync[1790]: [SERV  ] Service engine loaded: corosync configuration map access [0]
Jun 22 20:22:39 1.de.rem.uz corosync[1790]: [QB    ] server name: cmap
Jun 22 20:22:39 1.de.rem.uz corosync[1790]: [SERV  ] Service engine loaded: corosync configuration service [1]
Jun 22 20:22:39 1.de.rem.uz corosync[1790]: [QB    ] server name: cfg
Jun 22 20:22:39 1.de.rem.uz corosync[1790]: [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Jun 22 20:22:39 1.de.rem.uz corosync[1790]: [QB    ] server name: cpg
Jun 22 20:23:39 1.de.rem.uz corosync[1783]: Starting Corosync Cluster Engine (corosync): [FAILED]
Jun 22 20:23:39 1.de.rem.uz systemd[1]: corosync.service: control process exited, code=exited status=1
Jun 22 20:23:39 1.de.rem.uz systemd[1]: Failed to start Corosync Cluster Engine.
Jun 22 20:23:39 1.de.rem.uz systemd[1]: Unit corosync.service entered failed state.
Code:
root@1 ~ # journalctl -xn
-- Logs begin at Wed 2016-06-22 20:19:07 CEST, end at Wed 2016-06-22 20:24:41 CEST. --
Jun 22 20:24:29 1.de.rem.uz pmxcfs[1460]: [dcdb] crit: cpg_initialize failed: 2
Jun 22 20:24:29 1.de.rem.uz pmxcfs[1460]: [status] crit: cpg_initialize failed: 2
Jun 22 20:24:35 1.de.rem.uz pmxcfs[1460]: [quorum] crit: quorum_initialize failed: 2
Jun 22 20:24:35 1.de.rem.uz pmxcfs[1460]: [confdb] crit: cmap_initialize failed: 2
Jun 22 20:24:35 1.de.rem.uz pmxcfs[1460]: [dcdb] crit: cpg_initialize failed: 2
Jun 22 20:24:35 1.de.rem.uz pmxcfs[1460]: [status] crit: cpg_initialize failed: 2
Jun 22 20:24:41 1.de.rem.uz pmxcfs[1460]: [quorum] crit: quorum_initialize failed: 2
Jun 22 20:24:41 1.de.rem.uz pmxcfs[1460]: [confdb] crit: cmap_initialize failed: 2
Jun 22 20:24:41 1.de.rem.uz pmxcfs[1460]: [dcdb] crit: cpg_initialize failed: 2
Jun 22 20:24:41 1.de.rem.uz pmxcfs[1460]: [status] crit: cpg_initialize failed: 2

I've no clue how to proceed; any help would be greatly appreciated!
 
This is probably too late for you, but I had the same problem and solved it, so I wanted to get the fix posted.

Like yours, my "pvecm create someclustername" failed after the first few steps. The only misconfiguration I could find was a wrong IP in /etc/pve/corosync.conf:
Code:
totem {
  version: 2
  secauth: on
  cluster_name: MYCLUSTER
  config_version: 1
  ip_version: ipv4
  interface {
    ringnumber: 0
    bindnetaddr: 192.168.100.2   <<<=== THIS IP WAS WRONG
  }
}
...
What had happened was that I installed Proxmox while the server was disconnected from the network; it couldn't get a DHCP lease, so the installer self-assigned an address. Later, I set vmbr0 in /etc/network/interfaces to use DHCP, and the interface got its proper address, but corosync.conf kept the old, wrong one. Proxmox did the same thing on another server whose IP address I had to change. I wish the Proxmox folks would fix this design.
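A note for anyone landing here: as far as I can tell, pvecm takes the cluster address from whatever the hostname resolves to (usually the entry in /etc/hosts), so a stale entry there ends up as the bindnetaddr in corosync.conf. A quick sanity check before creating the cluster (a rough sketch; verify on your own setup):
Code:
hostname --ip-address             # the address pvecm will pick up
getent hosts "$(hostname)"        # shows the entry it resolved from
grep "$(hostname -s)" /etc/hosts  # fix a stale entry here first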

If you try to edit this file directly, the OS won't let you, because /etc/pve is a FUSE mount backed by an SQLite3 database. So what you do is delete the file like so (from https://forum.proxmox.com/threads/removing-deleting-a-created-cluster.18887/#post-142079):
# stop services
systemctl stop pvestatd.service
systemctl stop pvedaemon.service
systemctl stop pve-cluster.service

# Delete file in sqlite3
sqlite3 /var/lib/pve-cluster/config.db
sqlite> delete from tree where name = 'corosync.conf';
sqlite> .quit

# start services
systemctl start pve-cluster.service
systemctl start pvestatd.service
systemctl start pvedaemon.service

After doing those steps, at least, I was back to a stand-alone server where I could edit hardware configs for VMs and such. Later, when I'm feeling brave, I might try to recreate a proper corosync.conf and attempt clustering again.
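For what it's worth, a few sanity checks after restarting the services (a rough sketch, not an official procedure):
Code:
mount | grep /etc/pve                 # pmxcfs FUSE mount should be back
ls /etc/pve                           # corosync.conf gone, other files intact
systemctl status pve-cluster.service  # should be active (running)
# Leftovers under /etc/corosync (authkey, corosync.conf) can also confuse a
# later "pvecm create"; I'd consider removing those too, at your own risk.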
 
Quoting the post above: "What had happened was that I installed Proxmox without the server being connected to the network, so it didn't get a DHCP address ... I wish the Proxmox folk would fix this design."

Corosync doesn't have to use the Proxmox host IP; it can run on a different network entirely. You probably had stale config files, and by running the pvecm command with all defaults, it picked up the addresses from those old files.
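If the goal is to bind corosync to a specific address instead of whatever the hostname resolves to, the pvecm from the PVE 4.x era of this thread accepted explicit ring options; check "man pvecm" on your version before relying on these exact names:
Code:
# Example addresses only -- bind ring0 to an explicit network and address:
pvecm create MyAwesomeCluster -bindnet0_addr 192.168.100.0 -ring0_addr 192.168.100.2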
 
