Add node to a new Cluster (Fail)

masterdaweb · Apr 17, 2017

I'm trying to add a node to a cluster, but it always fails. I'm using 2 dedicated servers from OVH.

Take a look at the error below:

=====================================================
root@ns516885:~# pvecm add 158.69.240.50
The authenticity of host '158.69.240.50 (158.69.240.50)' can't be established.
ECDSA key fingerprint is eb:36:6f:1b:64:b8:11:df:c9:53:5c:98:a7:65:38:55.
Are you sure you want to continue connecting (yes/no)? yes
copy corosync auth key
stopping pve-cluster service
backup old database
Job for corosync.service failed. See 'systemctl status corosync.service' and 'journalctl -xn' for details.
waiting for quorum...
=======================================================

Then when I use 'systemctl status corosync.service' to check what went wrong, it shows me:

=========================================================
root@ns516885:~# systemctl status corosync.service
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/lib/systemd/system/corosync.service; enabled)
   Active: failed (Result: exit-code) since Sun 2017-04-16 22:55:34 EDT; 4min 7s ago
  Process: 3420 ExecStart=/usr/share/corosync/corosync start (code=exited, status=1/FAILURE)

Apr 16 22:54:33 ns516885 corosync[3429]: [SERV ] Service engine loaded: corosync configuration map access [0]
Apr 16 22:54:33 ns516885 corosync[3429]: [QB ] server name: cmap
Apr 16 22:54:33 ns516885 corosync[3429]: [SERV ] Service engine loaded: corosync configuration service [1]
Apr 16 22:54:33 ns516885 corosync[3429]: [QB ] server name: cfg
Apr 16 22:54:33 ns516885 corosync[3429]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Apr 16 22:54:33 ns516885 corosync[3429]: [QB ] server name: cpg
Apr 16 22:55:34 ns516885 corosync[3420]: Starting Corosync Cluster Engine (corosync): [FAILED]
Apr 16 22:55:34 ns516885 systemd[1]: corosync.service: control process exited, code=exited status=1
Apr 16 22:55:34 ns516885 systemd[1]: Failed to start Corosync Cluster Engine.
Apr 16 22:55:34 ns516885 systemd[1]: Unit corosync.service entered failed state.

========================================================

dietmar · Apr 17, 2017

masterdaweb said:
I'm trying to add a node to a cluster, but it always fails. I'm using 2 dedicated servers from OVH.

Does multicast between those nodes work? AFAIK this needs OVH vRack.
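
One way to answer this question is to run omping, a multicast test tool, on both nodes at the same time. The sketch below is a minimal wrapper, not a definitive procedure: the function name multicast_test is hypothetical, the flags are the ones commonly used for a short burst test, and it assumes omping is installed on every node.

```shell
#!/bin/sh
# multicast_test NODE...: send a short burst of test packets to all listed nodes.
# Assumes omping is installed; it must be started on every node at about the same time.
multicast_test() {
  # -c 10000: number of packets, -i 0.001: interval in seconds,
  # -F: do not wait for all nodes to join, -q: print only the summary
  omping -c 10000 -i 0.001 -F -q "$@"
}

# Run the test only when node addresses are passed on the command line.
if [ "$#" -gt 0 ]; then
  multicast_test "$@"
fi
```

Start it on both nodes within a few seconds of each other, passing the addresses of all nodes (158.69.240.50 plus the second node's address, which the thread does not give). If the multicast summary lines report 100% loss while the unicast lines succeed, multicast is most likely blocked between the hosts.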

masterdaweb · Apr 17, 2017

dietmar said:
Does multicast between those nodes work? AFAIK this needs OVH vRack.

Is it possible to make it work using unicast? I don't have a vRack.

How can I set it up without using multicast? Thanks.

dietmar · Apr 17, 2017

masterdaweb said:
How can I set it up without using multicast? Thanks.

https://pve.proxmox.com/wiki/Multic....29_instead_of_multicast.2C_if_all_else_fails

masterdaweb · Apr 18, 2017

dietmar said:
https://pve.proxmox.com/wiki/Multic....29_instead_of_multicast.2C_if_all_else_fails

I did, but it is still not working.
The node appears, but it stays down.

Error message below:

=================================

Code:

root@ns516885:~# pvecm add 158.69.240.50 -force
can't create shared ssh key database '/etc/pve/priv/authorized_keys'
copy corosync auth key
stopping pve-cluster service
backup old database
Job for corosync.service failed. See 'systemctl status corosync.service' and 'journalctl -xn' for details.
waiting for quorum...

====================================

Code:

root@ns516885:~# systemctl status corosync.service
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/lib/systemd/system/corosync.service; enabled)
   Active: failed (Result: exit-code) since Mon 2017-04-17 21:05:14 EDT; 1min 25s ago
  Process: 7251 ExecStart=/usr/share/corosync/corosync start (code=exited, status=1/FAILURE)

Apr 17 21:04:13 ns516885 corosync[7261]: [QB    ] server name: cmap
Apr 17 21:04:13 ns516885 corosync[7261]: [SERV  ] Service engine loaded: corosync configuration service [1]
Apr 17 21:04:13 ns516885 corosync[7261]: [QB    ] server name: cfg
Apr 17 21:04:13 ns516885 corosync[7261]: [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Apr 17 21:04:13 ns516885 corosync[7261]: [QB    ] server name: cpg
Apr 17 21:04:13 ns516885 corosync[7261]: [SERV  ] Service engine loaded: corosync profile loading service [4]
Apr 17 21:05:14 ns516885 corosync[7251]: Starting Corosync Cluster Engine (corosync): [FAILED]
Apr 17 21:05:14 ns516885 systemd[1]: corosync.service: control process exited, code=exited status=1
Apr 17 21:05:14 ns516885 systemd[1]: Failed to start Corosync Cluster Engine.
Apr 17 21:05:14 ns516885 systemd[1]: Unit corosync.service entered failed state.

================================

corosync.conf

Code:

logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: ns524364
    nodeid: 1
    quorum_votes: 1
    ring0_addr: ns524364
  }

  node {
    name: ns516885
    nodeid: 2
    quorum_votes: 1
    ring0_addr: ns516885
  }

}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: mamae
  config_version: 16
  ip_version: ipv4
  secauth: on
  transport: udpu
  version: 2
  interface {
    bindnetaddr: 0.0.0.0
    ringnumber: 0
  }

}
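
Because ring0_addr uses host names in this config, corosync depends on how each node resolves ns524364 and ns516885. One thing worth checking, although the thread does not confirm it as the cause, is whether either name resolves to a loopback address (for example through a 127.0.1.1 entry in /etc/hosts) or does not resolve at all. A minimal sketch, assuming a glibc system where getent supports the ahostsv4 database; the helper names is_loopback and check_ring_addr are hypothetical:

```shell
#!/bin/sh
# is_loopback ADDR: succeed if ADDR is an IPv4 loopback address (127.0.0.0/8).
is_loopback() {
  case "$1" in
    127.*) return 0 ;;
    *)     return 1 ;;
  esac
}

# check_ring_addr NAME: resolve NAME to its first IPv4 address and report the result.
check_ring_addr() {
  addr=$(getent ahostsv4 "$1" | awk '{ print $1; exit }')
  if [ -z "$addr" ]; then
    echo "$1: does not resolve"
  elif is_loopback "$addr"; then
    echo "$1: resolves to loopback address $addr"
  else
    echo "$1: resolves to $addr"
  fi
}

# Check every host name passed on the command line.
for name in "$@"; do
  check_ring_addr "$name"
done
```

Run it on both nodes with the two node names as arguments. Each name should resolve to the address the other node can actually reach, and the answer should be the same on both nodes.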

dietmar · Apr 18, 2017

bindnetaddr looks wrong to me?
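
For context, bindnetaddr is normally set to the network address of the interface corosync should bind to, that is, the node's IP address with the host bits cleared. The sketch below shows that calculation only; the /24 prefix is an assumption, since the thread never states the real netmask, and the function names are hypothetical.

```shell
#!/bin/sh
# ip_to_int A.B.C.D: print the IPv4 address as a 32-bit integer.
ip_to_int() (
  IFS=.
  set -- $1          # split the dotted quad into $1..$4
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# int_to_ip N: print a 32-bit integer as a dotted quad.
int_to_ip() {
  echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# network_addr IP PREFIX: clear the host bits of IP for the given prefix length.
network_addr() (
  mask=$(( (0xFFFFFFFF << (32 - $2)) & 0xFFFFFFFF ))
  int_to_ip $(( $(ip_to_int "$1") & mask ))
)

# Assumed /24 netmask; replace it with the prefix length of the real interface.
network_addr 158.69.240.50 24
```

With the assumed /24 this prints 158.69.240.0. The real prefix length can be read from the interface configuration on the node.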

masterdaweb · Apr 18, 2017

dietmar said:
bindnetaddr looks wrong to me?

I already tried the cluster IP, but I got the same error listed above.

I then tried 0.0.0.0 because the unicast tutorial says to use it if the nodes are not in the same subnet.

I really wish it would work. I've tried everything.
