Add node to a new Cluster (Fail)

masterdaweb

Well-Known Member
Apr 17, 2017
I'm trying to add a node to a cluster, but it always fails. I'm using two dedicated servers from OVH.

Take a look at the error below:

=====================================================
root@ns516885:~# pvecm add 158.69.240.50
The authenticity of host '158.69.240.50 (158.69.240.50)' can't be established.
ECDSA key fingerprint is eb:36:6f:1b:64:b8:11:df:c9:53:5c:98:a7:65:38:55.
Are you sure you want to continue connecting (yes/no)? yes
copy corosync auth key
stopping pve-cluster service
backup old database
Job for corosync.service failed. See 'systemctl status corosync.service' and 'journalctl -xn' for details.
waiting for quorum...
=======================================================


Then when I use 'systemctl status corosync.service' to check what went wrong, it shows me:

=========================================================
root@ns516885:~# systemctl status corosync.service
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/lib/systemd/system/corosync.service; enabled)
Active: failed (Result: exit-code) since Sun 2017-04-16 22:55:34 EDT; 4min 7s ago
Process: 3420 ExecStart=/usr/share/corosync/corosync start (code=exited, status=1/FAILURE)

Apr 16 22:54:33 ns516885 corosync[3429]: [SERV ] Service engine loaded: corosync configuration map access [0]
Apr 16 22:54:33 ns516885 corosync[3429]: [QB ] server name: cmap
Apr 16 22:54:33 ns516885 corosync[3429]: [SERV ] Service engine loaded: corosync configuration service [1]
Apr 16 22:54:33 ns516885 corosync[3429]: [QB ] server name: cfg
Apr 16 22:54:33 ns516885 corosync[3429]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Apr 16 22:54:33 ns516885 corosync[3429]: [QB ] server name: cpg
Apr 16 22:55:34 ns516885 corosync[3420]: Starting Corosync Cluster Engine (corosync): [FAILED]
Apr 16 22:55:34 ns516885 systemd[1]: corosync.service: control process exited, code=exited status=1
Apr 16 22:55:34 ns516885 systemd[1]: Failed to start Corosync Cluster Engine.
Apr 16 22:55:34 ns516885 systemd[1]: Unit corosync.service entered failed state.

========================================================
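For what it's worth, the unit status only shows the last few log lines; the full output from the failed start can be read with journalctl (standard systemd, nothing Proxmox-specific):

Code:
root@ns516885:~# journalctl -u corosync.service --no-pager | tail -n 50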
 
I did, but it is still not working.
The node appears, but it stays down.

[Screenshot attachment: upload_2017-4-17_22-12-23.png]

Error message below:

=================================
Code:
root@ns516885:~# pvecm add 158.69.240.50 -force
can't create shared ssh key database '/etc/pve/priv/authorized_keys'
copy corosync auth key
stopping pve-cluster service
backup old database
Job for corosync.service failed. See 'systemctl status corosync.service' and 'journalctl -xn' for details.
waiting for quorum...

====================================
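The new line "can't create shared ssh key database '/etc/pve/priv/authorized_keys'" makes me suspect /etc/pve went read-only because the cluster filesystem lost quorum. These are the standard checks I know of for that (just my guess at the cause, not confirmed):

Code:
root@ns516885:~# systemctl status pve-cluster
root@ns516885:~# pvecm status
root@ns516885:~# mount | grep /etc/pve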

Code:
root@ns516885:~# systemctl status corosync.service
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/lib/systemd/system/corosync.service; enabled)
   Active: failed (Result: exit-code) since Mon 2017-04-17 21:05:14 EDT; 1min 25s ago
  Process: 7251 ExecStart=/usr/share/corosync/corosync start (code=exited, status=1/FAILURE)

Apr 17 21:04:13 ns516885 corosync[7261]: [QB    ] server name: cmap
Apr 17 21:04:13 ns516885 corosync[7261]: [SERV  ] Service engine loaded: corosync configuration service [1]
Apr 17 21:04:13 ns516885 corosync[7261]: [QB    ] server name: cfg
Apr 17 21:04:13 ns516885 corosync[7261]: [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Apr 17 21:04:13 ns516885 corosync[7261]: [QB    ] server name: cpg
Apr 17 21:04:13 ns516885 corosync[7261]: [SERV  ] Service engine loaded: corosync profile loading service [4]
Apr 17 21:05:14 ns516885 corosync[7251]: Starting Corosync Cluster Engine (corosync): [FAILED]
Apr 17 21:05:14 ns516885 systemd[1]: corosync.service: control process exited, code=exited status=1
Apr 17 21:05:14 ns516885 systemd[1]: Failed to start Corosync Cluster Engine.
Apr 17 21:05:14 ns516885 systemd[1]: Unit corosync.service entered failed state.

================================
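Since corosync times out on startup both times, I also want to rule out the network. As far as I know, omping can test whether the two nodes can exchange UDP packets the way corosync does (assuming the omping package is installed and the command is started on both servers at roughly the same time; with transport udpu only the unicast results matter):

Code:
root@ns516885:~# apt-get install omping
root@ns516885:~# omping -c 600 -i 1 -q ns524364 ns516885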


corosync.conf

Code:
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: ns524364
    nodeid: 1
    quorum_votes: 1
    ring0_addr: ns524364
  }

  node {
    name: ns516885
    nodeid: 2
    quorum_votes: 1
    ring0_addr: ns516885
  }

}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: mamae
  config_version: 16
  ip_version: ipv4
  secauth: on
  transport: udpu
  version: 2
  interface {
    bindnetaddr: 0.0.0.0
    ringnumber: 0
  }

}
 
The bindnetaddr looks wrong to me. I already tried the cluster IP, but I got the same error listed above.

I then tried 0.0.0.0 because the Unicast tutorial says to use it when the nodes are not on the same subnet.
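If 0.0.0.0 is acceptable there, my next suspicion is name resolution: ring0_addr uses the hostnames, so if ns524364 or ns516885 doesn't resolve to the right public address on both machines, corosync would bind or connect to the wrong IP. This is only a sketch of what I would try based on my reading of the wiki, not a verified fix (the second server's IP is a placeholder, since I only listed 158.69.240.50 above):

Code:
# /etc/hosts on BOTH nodes - map each node name to its public IP
158.69.240.50             ns524364
<public-IP-of-ns516885>   ns516885

# or, equivalently, put the IPs directly into the corosync.conf nodelist
# (remembering to bump config_version in the totem section after editing):
nodelist {
  node {
    name: ns524364
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 158.69.240.50
  }
  node {
    name: ns516885
    nodeid: 2
    quorum_votes: 1
    ring0_addr: <public-IP-of-ns516885>
  }
}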

I really hope I can get this working; I've tried everything. :)
 
