Add node to a new Cluster (Fail)

masterdaweb

Well-Known Member
Apr 17, 2017
I'm trying to add a node to a cluster, but it always fails. I'm using two dedicated servers from OVH.

Take a look at the error below:

=====================================================
root@ns516885:~# pvecm add 158.69.240.50
The authenticity of host '158.69.240.50 (158.69.240.50)' can't be established.
ECDSA key fingerprint is eb:36:6f:1b:64:b8:11:df:c9:53:5c:98:a7:65:38:55.
Are you sure you want to continue connecting (yes/no)? yes
copy corosync auth key
stopping pve-cluster service
backup old database
Job for corosync.service failed. See 'systemctl status corosync.service' and 'journalctl -xn' for details.
waiting for quorum...
=======================================================


Then when I use 'systemctl status corosync.service' to check what went wrong, it shows me:

=========================================================
root@ns516885:~# systemctl status corosync.service
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/lib/systemd/system/corosync.service; enabled)
Active: failed (Result: exit-code) since Sun 2017-04-16 22:55:34 EDT; 4min 7s ago
Process: 3420 ExecStart=/usr/share/corosync/corosync start (code=exited, status=1/FAILURE)

Apr 16 22:54:33 ns516885 corosync[3429]: [SERV ] Service engine loaded: corosync configuration map access [0]
Apr 16 22:54:33 ns516885 corosync[3429]: [QB ] server name: cmap
Apr 16 22:54:33 ns516885 corosync[3429]: [SERV ] Service engine loaded: corosync configuration service [1]
Apr 16 22:54:33 ns516885 corosync[3429]: [QB ] server name: cfg
Apr 16 22:54:33 ns516885 corosync[3429]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Apr 16 22:54:33 ns516885 corosync[3429]: [QB ] server name: cpg
Apr 16 22:55:34 ns516885 corosync[3420]: Starting Corosync Cluster Engine (corosync): [FAILED]
Apr 16 22:55:34 ns516885 systemd[1]: corosync.service: control process exited, code=exited status=1
Apr 16 22:55:34 ns516885 systemd[1]: Failed to start Corosync Cluster Engine.
Apr 16 22:55:34 ns516885 systemd[1]: Unit corosync.service entered failed state.

========================================================
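For what it's worth, the unit status only shows the last few log lines; the full output from the failed start can be read with journalctl (standard systemd, nothing Proxmox-specific):

Code:
root@ns516885:~# journalctl -u corosync.service --no-pager | tail -n 50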
 
I did, but it is still not working.
The node appears, but it stays down.

[Screenshot attachment: upload_2017-4-17_22-12-23.png]

Error message below:

=================================
Code:
root@ns516885:~# pvecm add 158.69.240.50 -force
can't create shared ssh key database '/etc/pve/priv/authorized_keys'
copy corosync auth key
stopping pve-cluster service
backup old database
Job for corosync.service failed. See 'systemctl status corosync.service' and 'journalctl -xn' for details.
waiting for quorum...

====================================
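The new line "can't create shared ssh key database '/etc/pve/priv/authorized_keys'" makes me suspect /etc/pve went read-only because the cluster filesystem lost quorum. These are the standard checks I know of for that (just my guess at the cause, not confirmed):

Code:
root@ns516885:~# systemctl status pve-cluster
root@ns516885:~# pvecm status
root@ns516885:~# mount | grep /etc/pve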

Code:
root@ns516885:~# systemctl status corosync.service
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/lib/systemd/system/corosync.service; enabled)
   Active: failed (Result: exit-code) since Mon 2017-04-17 21:05:14 EDT; 1min 25s ago
  Process: 7251 ExecStart=/usr/share/corosync/corosync start (code=exited, status=1/FAILURE)

Apr 17 21:04:13 ns516885 corosync[7261]: [QB    ] server name: cmap
Apr 17 21:04:13 ns516885 corosync[7261]: [SERV  ] Service engine loaded: corosync configuration service [1]
Apr 17 21:04:13 ns516885 corosync[7261]: [QB    ] server name: cfg
Apr 17 21:04:13 ns516885 corosync[7261]: [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Apr 17 21:04:13 ns516885 corosync[7261]: [QB    ] server name: cpg
Apr 17 21:04:13 ns516885 corosync[7261]: [SERV  ] Service engine loaded: corosync profile loading service [4]
Apr 17 21:05:14 ns516885 corosync[7251]: Starting Corosync Cluster Engine (corosync): [FAILED]
Apr 17 21:05:14 ns516885 systemd[1]: corosync.service: control process exited, code=exited status=1
Apr 17 21:05:14 ns516885 systemd[1]: Failed to start Corosync Cluster Engine.
Apr 17 21:05:14 ns516885 systemd[1]: Unit corosync.service entered failed state.

================================
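Since corosync times out on startup both times, I also want to rule out the network. As far as I know, omping can test whether the two nodes can exchange UDP packets the way corosync does (assuming the omping package is installed and the command is started on both servers at roughly the same time; with transport udpu only the unicast results matter):

Code:
root@ns516885:~# apt-get install omping
root@ns516885:~# omping -c 600 -i 1 -q ns524364 ns516885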


corosync.conf

Code:
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: ns524364
    nodeid: 1
    quorum_votes: 1
    ring0_addr: ns524364
  }

  node {
    name: ns516885
    nodeid: 2
    quorum_votes: 1
    ring0_addr: ns516885
  }

}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: mamae
  config_version: 16
  ip_version: ipv4
  secauth: on
  transport: udpu
  version: 2
  interface {
    bindnetaddr: 0.0.0.0
    ringnumber: 0
  }

}
 
The bindnetaddr looks wrong to me. I already tried the cluster IP, but I got the same error listed above.

I then tried 0.0.0.0 because the Unicast tutorial says to use it when the nodes are not on the same subnet.
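If 0.0.0.0 is acceptable there, my next suspicion is name resolution: ring0_addr uses the hostnames, so if ns524364 or ns516885 doesn't resolve to the right public address on both machines, corosync would bind or connect to the wrong IP. This is only a sketch of what I would try based on my reading of the wiki, not a verified fix (the second server's IP is a placeholder, since I only listed 158.69.240.50 above):

Code:
# /etc/hosts on BOTH nodes - map each node name to its public IP
158.69.240.50             ns524364
<public-IP-of-ns516885>   ns516885

# or, equivalently, put the IPs directly into the corosync.conf nodelist
# (remembering to bump config_version in the totem section after editing):
nodelist {
  node {
    name: ns524364
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 158.69.240.50
  }
  node {
    name: ns516885
    nodeid: 2
    quorum_votes: 1
    ring0_addr: <public-IP-of-ns516885>
  }
}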

I really hope I can get this working; I've tried everything. :)
 
