Hello,
I have a 4-node cluster and I need to add a new node.
Code:
proxmox-ve: 5.2-2 (running kernel: 4.15.17-2-pve)
pve-manager: 5.2-1 (running version: 5.2-1/0fcd7879)
pve-kernel-4.15: 5.2-2
pve-kernel-4.15.17-2-pve: 4.15.17-10
pve-kernel-4.15.17-1-pve: 4.15.17-9
corosync: 2.4.2-pve5
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.0-8
libpve-apiclient-perl: 2.0-4
libpve-common-perl: 5.0-31
libpve-guest-common-perl: 2.0-16
libpve-http-server-perl: 2.0-8
libpve-storage-perl: 5.0-23
libqb0: 1.0.1-1
lvm2: 2.02.168-pve6
lxc-pve: 3.0.0-3
lxcfs: 3.0.0-1
novnc-pve: 0.6-4
proxmox-widget-toolkit: 1.0-18
pve-cluster: 5.0-27
pve-container: 2.0-23
pve-docs: 5.2-4
pve-firewall: 3.0-9
pve-firmware: 2.0-4
pve-ha-manager: 2.0-5
pve-i18n: 1.0-5
pve-libspice-server1: 0.12.8-3
pve-qemu-kvm: 2.11.1-5
pve-xtermjs: 1.0-5
qemu-server: 5.0-26
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.9-pve1~bpo9
After adding the node:
Code:
pvecm add 172.16.0.2 -ring0_addr node5-corosync -use_ssh
corosync logged the following:
Code:
Jun 5 14:08:11 node5 corosync[28501]: [MAIN ] Corosync Cluster Engine ('2.4.2-dirty'): started and ready to provide service.
Jun 5 14:08:11 node5 corosync[28501]: notice [MAIN ] Corosync Cluster Engine ('2.4.2-dirty'): started and ready to provide service.
Jun 5 14:08:11 node5 corosync[28501]: info [MAIN ] Corosync built-in features: dbus rdma monitoring watchdog augeas systemd upstart xmlconf qdevices qnetd snmp pie relro bindnow
Jun 5 14:08:11 node5 corosync[28501]: [MAIN ] Corosync built-in features: dbus rdma monitoring watchdog augeas systemd upstart xmlconf qdevices qnetd snmp pie relro bindnow
Jun 5 14:08:11 node5 corosync[28501]: notice [TOTEM ] Initializing transport (UDP/IP Multicast).
Jun 5 14:08:11 node5 corosync[28501]: notice [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
Jun 5 14:08:11 node5 corosync[28501]: [TOTEM ] Initializing transport (UDP/IP Multicast).
Jun 5 14:08:11 node5 corosync[28501]: [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
Jun 5 14:08:11 node5 corosync[28501]: notice [TOTEM ] The network interface [172.16.0.6] is now up.
Jun 5 14:08:11 node5 corosync[28501]: [TOTEM ] The network interface [172.16.0.6] is now up.
Jun 5 14:08:11 node5 corosync[28501]: notice [SERV ] Service engine loaded: corosync configuration map access [0]
Jun 5 14:08:11 node5 corosync[28501]: info [QB ] server name: cmap
Jun 5 14:08:11 node5 corosync[28501]: notice [SERV ] Service engine loaded: corosync configuration service [1]
Jun 5 14:08:11 node5 corosync[28501]: info [QB ] server name: cfg
Jun 5 14:08:11 node5 corosync[28501]: notice [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Jun 5 14:08:11 node5 corosync[28501]: info [QB ] server name: cpg
Jun 5 14:08:11 node5 corosync[28501]: notice [SERV ] Service engine loaded: corosync profile loading service [4]
Jun 5 14:08:11 node5 corosync[28501]: [SERV ] Service engine loaded: corosync configuration map access [0]
Jun 5 14:08:11 node5 corosync[28501]: notice [SERV ] Service engine loaded: corosync resource monitoring service [6]
Jun 5 14:08:11 node5 corosync[28501]: warning [WD ] Watchdog /dev/watchdog exists but couldn't be opened.
Jun 5 14:08:11 node5 corosync[28501]: warning [WD ] resource load_15min missing a recovery key.
Jun 5 14:08:11 node5 corosync[28501]: warning [WD ] resource memory_used missing a recovery key.
Jun 5 14:08:11 node5 corosync[28501]: info [WD ] no resources configured.
Jun 5 14:08:11 node5 corosync[28501]: notice [SERV ] Service engine loaded: corosync watchdog service [7]
Jun 5 14:08:11 node5 corosync[28501]: notice [QUORUM] Using quorum provider corosync_votequorum
Jun 5 14:08:11 node5 corosync[28501]: notice [SERV ] Service engine loaded: corosync vote quorum service v1.0 [5]
Jun 5 14:08:11 node5 corosync[28501]: info [QB ] server name: votequorum
Jun 5 14:08:11 node5 corosync[28501]: notice [SERV ] Service engine loaded: corosync cluster quorum service v0.1 [3]
Jun 5 14:08:11 node5 corosync[28501]: info [QB ] server name: quorum
Jun 5 14:08:11 node5 corosync[28501]: notice [TOTEM ] A new membership (172.16.0.6:12756) was formed. Members joined: 5
Jun 5 14:08:11 node5 corosync[28501]: warning [CPG ] downlist left_list: 0 received
Jun 5 14:08:11 node5 corosync[28501]: [QB ] server name: cmap
Jun 5 14:08:11 node5 systemd[1]: Started Corosync Cluster Engine.
Jun 5 14:08:11 node5 corosync[28501]: notice [QUORUM] Members[1]: 5
Jun 5 14:08:11 node5 corosync[28501]: notice [MAIN ] Completed service synchronization, ready to provide service.
Jun 5 14:08:11 node5 corosync[28501]: [SERV ] Service engine loaded: corosync configuration service [1]
Jun 5 14:08:11 node5 corosync[28501]: [QB ] server name: cfg
Jun 5 14:08:11 node5 corosync[28501]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Jun 5 14:08:11 node5 corosync[28501]: [QB ] server name: cpg
Jun 5 14:08:11 node5 corosync[28501]: [SERV ] Service engine loaded: corosync profile loading service [4]
Jun 5 14:08:11 node5 corosync[28501]: [SERV ] Service engine loaded: corosync resource monitoring service [6]
Jun 5 14:08:11 node5 corosync[28501]: [WD ] Watchdog /dev/watchdog exists but couldn't be opened.
Jun 5 14:08:11 node5 corosync[28501]: [WD ] resource load_15min missing a recovery key.
Jun 5 14:08:11 node5 corosync[28501]: [WD ] resource memory_used missing a recovery key.
Jun 5 14:08:11 node5 corosync[28501]: [WD ] no resources configured.
Jun 5 14:08:11 node5 corosync[28501]: [SERV ] Service engine loaded: corosync watchdog service [7]
Jun 5 14:08:11 node5 corosync[28501]: [QUORUM] Using quorum provider corosync_votequorum
Jun 5 14:08:11 node5 corosync[28501]: [SERV ] Service engine loaded: corosync vote quorum service v1.0 [5]
Jun 5 14:08:11 node5 corosync[28501]: [QB ] server name: votequorum
Jun 5 14:08:11 node5 corosync[28501]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1 [3]
Jun 5 14:08:11 node5 corosync[28501]: [QB ] server name: quorum
Jun 5 14:08:11 node5 corosync[28501]: [TOTEM ] A new membership (172.16.0.6:12756) was formed. Members joined: 5
Jun 5 14:08:11 node5 corosync[28501]: [CPG ] downlist left_list: 0 received
Jun 5 14:08:11 node5 corosync[28501]: [QUORUM] Members[1]: 5
Jun 5 14:08:11 node5 corosync[28501]: [MAIN ] Completed service synchronization, ready to provide service.
And "Expected votes" in cluster was raised to 5.
However, new node still shows "waiting for quorum..." and configs in /etc/pve is not synced from the cluster.
Than new node5 is showing just local node with expected 5 votes and all others nodes can't see it through pvecm status.
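For reference, this is roughly how I compared the cluster view on node5 and on one of the existing nodes (standard pvecm/corosync commands, output omitted here):
Code:
# on node5: only the local node is listed, expected votes = 5, no quorum
pvecm status
corosync-quorumtool -s

# on an existing node (node1..node4): node5 does not appear in the member list
pvecm status
pvecm nodes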
But it doesn't look like a network issue: the omping test was successful (command sketched after the log below), I checked the hostnames, etc. The only error I found in syslog is:
Code:
Jun 5 14:08:07 node5 pvesr[28252]: trying to aquire cfs lock 'file-replication_cfg' ...
Jun 5 14:08:08 node5 pvesr[28252]: trying to aquire cfs lock 'file-replication_cfg' ...
Jun 5 14:08:09 node5 pvesr[28252]: error with cfs lock 'file-replication_cfg': no quorum!
Jun 5 14:08:09 node5 systemd[1]: pvesr.service: Main process exited, code=exited, status=13/n/a
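For completeness, the multicast test mentioned above was along these lines (the usual omping check from the Proxmox docs, started in parallel on all five nodes; the hostnames are placeholders for my corosync ring0 names):
Code:
# short flood test (~10 s), should report 0% loss for unicast and multicast
omping -c 10000 -i 0.001 -F -q node1-corosync node2-corosync node3-corosync node4-corosync node5-corosync

# longer test (~10 minutes) to catch IGMP snooping/querier timeouts
omping -c 600 -i 1 -q node1-corosync node2-corosync node3-corosync node4-corosync node5-corosync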
Any ideas on how to solve this?