[SOLVED] Can't join cluster

zarlo5899

I have a node that can't join my cluster. All servers are with the same host and share a private LAN.

I don't quite know what info is needed to help, so I will give as much as I can think of.

The error I get is:

Code:
Task viewer: Join Cluster

Etablishing API connection with host '64.140.150.229'
Login succeeded.
Request addition of this node
Join request OK, finishing setup locally
stopping pve-cluster service
backup old database to '/var/lib/pve-cluster/backup/config-1540959378.sql.gz'
Job for corosync.service failed because the control process exited with error code.
TASK ERROR: starting pve-cluster failed: See "systemctl status corosync.service" and "journalctl -xe" for details.

Code:
systemctl status corosync.service
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Tue 2018-10-30 19:21:53 HDT; 6min ago
     Docs: man:corosync
           man:corosync.conf
           man:corosync_overview
  Process: 2514 ExecStart=/usr/sbin/corosync -f $COROSYNC_OPTIONS (code=exited, status=20)
 Main PID: 2514 (code=exited, status=20)
      CPU: 111ms

Oct 30 19:21:53 mainus3 corosync[2514]: info    [WD    ] no resources configured.
Oct 30 19:21:53 mainus3 corosync[2514]: notice  [SERV  ] Service engine loaded: corosync watchdog service [7]
Oct 30 19:21:53 mainus3 corosync[2514]: notice  [QUORUM] Using quorum provider corosync_votequorum
Oct 30 19:21:53 mainus3 corosync[2514]: crit    [QUORUM] Quorum provider: corosync_votequorum failed to initialize.
Oct 30 19:21:53 mainus3 corosync[2514]: error   [SERV  ] Service engine 'corosync_quorum' failed to load for reason 'configuration error: nodelist or quorum.expect
Oct 30 19:21:53 mainus3 corosync[2514]: error   [MAIN  ] Corosync Cluster Engine exiting with status 20 at service.c:356.
Oct 30 19:21:53 mainus3 systemd[1]: corosync.service: Main process exited, code=exited, status=20/n/a
Oct 30 19:21:53 mainus3 systemd[1]: Failed to start Corosync Cluster Engine.
Oct 30 19:21:53 mainus3 systemd[1]: corosync.service: Unit entered failed state.
Oct 30 19:21:53 mainus3 systemd[1]: corosync.service: Failed with result 'exit-code'.

Code:
root@mainus2:~# cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: mainus1
    nodeid: 2
    quorum_votes: 1
    ring0_addr: publicIP1
  }
  node {
    name: mainus2
    nodeid: 1
    quorum_votes: 1
    ring0_addr: publicIP2
  }
  node {
    name: mainus3
    nodeid: 3
    quorum_votes: 1
    ring0_addr: publicIP3
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: USHE000
  config_version: 9
  interface {
    bindnetaddr: publicIP2
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  version: 2
}

Code:
root@mainus3:~# pveversion -v
proxmox-ve: 5.2-2 (running kernel: 4.15.17-1-pve)
pve-manager: 5.2-1 (running version: 5.2-1/0fcd7879)
pve-kernel-4.15: 5.2-1
pve-kernel-4.15.17-1-pve: 4.15.17-9
corosync: 2.4.2-pve5
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.0-8
libpve-apiclient-perl: 2.0-4
libpve-common-perl: 5.0-31
libpve-guest-common-perl: 2.0-16
libpve-http-server-perl: 2.0-8
libpve-storage-perl: 5.0-23
libqb0: 1.0.1-1
lvm2: 2.02.168-pve6
lxc-pve: 3.0.0-3
lxcfs: 3.0.0-1
novnc-pve: 0.6-4
proxmox-widget-toolkit: 1.0-18
pve-cluster: 5.0-27
pve-container: 2.0-23
pve-docs: 5.2-3
pve-firewall: 3.0-8
pve-firmware: 2.0-4
pve-ha-manager: 2.0-5
pve-i18n: 1.0-5
pve-libspice-server1: 0.12.8-3
pve-qemu-kvm: 2.11.1-5
pve-xtermjs: 1.0-5
qemu-server: 5.0-26
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.8-pve1~bpo9
 
Now I can't log in to the web panel for the node I'm trying to add.

Now the UI does not load.

(I have not edited anything.)
 
Hi,

failed to load for reason 'configuration error:
Your corosync.conf is misconfigured; this is why the service can't start.
It is not possible to debug an edited config.
If you masked the IPs because they are public, then this is indeed a problem.
Corosync should have a dedicated network that supports multicast.
Do all nodes have the same pveversion level?
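For the version question, a rough way to compare nodes is to save the `pveversion -v` output from each one and diff the package versions. The sketch below is only an illustration: the helper names `parse_pveversion` and `diff_versions` are made up for this post, the mainus3 lines are copied from the output above, and the second node's output is invented, since it was not posted.

```python
def parse_pveversion(text):
    """Turn `pveversion -v` output into a {package: version} dict."""
    versions = {}
    for line in text.splitlines():
        # Split on the first colon only; the version part may contain more.
        name, sep, version = line.partition(":")
        if sep and name.strip():
            versions[name.strip()] = version.strip()
    return versions


def diff_versions(a, b):
    """Return {package: (version_a, version_b)} for every mismatch."""
    return {
        pkg: (a.get(pkg), b.get(pkg))
        for pkg in sorted(set(a) | set(b))
        if a.get(pkg) != b.get(pkg)
    }


# Two lines copied from the mainus3 output in the first post.
mainus3 = """corosync: 2.4.2-pve5
pve-cluster: 5.0-27"""

# HYPOTHETICAL output for another node, invented for illustration only.
other_node = """corosync: 2.4.2-pve5
pve-cluster: 5.0-30"""

mismatches = diff_versions(parse_pveversion(mainus3), parse_pveversion(other_node))
for pkg, (v_main, v_other) in mismatches.items():
    print(f"{pkg}: mainus3={v_main} other={v_other}")
```

An empty result would mean the two nodes report the same package levels; a package missing on one side shows up with `None` for that node.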
 
The IPs are only masked for this post.
So it could be that publicIP3 is not in the same subnet as publicIP1 and publicIP2.
And yes.

The first 2 nodes work; it is just the 3rd one that can't join.
 
so it could be that publicIP3 is not in the same subnet as publicIP1 and publicIP2
This can't work. As I wrote, you need multicast, and multicast normally works only within the same subnet.
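A quick way to check the subnet question is Python's standard `ipaddress` module. This is just a sketch: the real addresses are masked in this thread, so the ones below are placeholders from the RFC 5737 documentation ranges, the /24 prefix length is an assumption, and `same_subnet` is a hypothetical helper name. Substitute the real `ring0_addr` values and your actual netmask.

```python
import ipaddress


def same_subnet(addresses, prefix_len):
    """True if every address lies in the same network of the given prefix length."""
    networks = {
        ipaddress.ip_interface(f"{addr}/{prefix_len}").network
        for addr in addresses
    }
    return len(networks) == 1


# PLACEHOLDER addresses (RFC 5737 documentation ranges), not the real node IPs.
ring0 = {
    "mainus1": "192.0.2.10",
    "mainus2": "192.0.2.11",
    "mainus3": "198.51.100.7",
}

# Assumed /24 netmask; use the one configured on the corosync interface.
print(same_subnet(ring0.values(), 24))                            # False
print(same_subnet([ring0["mainus1"], ring0["mainus2"]], 24))      # True
```

With these placeholder values the third node falls outside the network of the first two, which is the situation described above. Being in the same subnet is only a precondition; it does not prove that the hosting provider's network actually passes multicast.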
 
Hi,
In my experience you can't change the IPs effectively; you need to reinstall Proxmox with the new IPs entered. Otherwise you get odd errors after a reboot or during operation.
Yours
Interfering Andy
 
