New node join failed and nothing can be operated after logging into the GUI

stats

I tried to add a 4th node to the cluster (vgpm04/172.19.0.14) from the GUI, but the join failed partway through.
Currently, no operations work after logging into the GUI.

We stopped the pve-cluster service and the corosync service on vgpm04, but this did not improve the situation.
Below are the command results and the syslog output from each server.

How can I get back to normal?

Code:
root@vgpm03:~# pvecm nodes

Membership information
----------------------
    Nodeid      Votes Name
         1          1 vgpm01
         2          1 vgpm02
         3          1 vgpm03 (local)

Code:
root@vgpm03:~# pvecm status
Cluster information
-------------------
Name:             vgcluster01
Config Version:   4
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Sat May  4 19:55:40 2024
Quorum provider:  corosync_votequorum
Nodes:            3
Node ID:          0x00000003
Ring ID:          1.2643
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   4
Highest expected: 4
Total votes:      3
Quorum:           3 
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 172.19.0.11
0x00000002          1 172.19.0.12
0x00000003          1 172.19.0.13 (local)
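
For reference, the "Quorum: 3" figure in the output above is consistent with the usual strict-majority rule applied to "Expected votes: 4": half the expected votes, rounded down, plus one. A minimal sketch of that arithmetic follows; the function name `quorum_for` is made up for illustration and is not part of `pvecm` or corosync.

```shell
# Hypothetical helper: strict-majority quorum for a given number of expected votes.
# Uses integer division, so 4 expected votes -> 4 / 2 + 1 = 3.
quorum_for() {
    expected=$1
    echo $(( expected / 2 + 1 ))
}

# With the 4 expected votes shown above, 3 votes are needed to stay quorate.
quorum_for 4
```

This is why the three original nodes still report "Quorate: Yes" with 3 total votes, while vgpm04 on its own (1 vote) reports "no quorum".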

Code:
--- vgpm01 syslog output ----
May 11 19:49:27 vgpm01 corosync[2086]:   [QUORUM] Sync members[3]: 1 2 3
May 11 19:49:27 vgpm01 corosync[2086]:   [TOTEM ] A new membership (1.ff37f) was formed. Members
May 11 19:49:28 vgpm01 pmxcfs[2045]: [status] notice: cpg_send_message retry 40
May 11 19:49:29 vgpm01 pmxcfs[2045]: [status] notice: cpg_send_message retry 50
May 11 19:49:30 vgpm01 pmxcfs[2045]: [status] notice: cpg_send_message retry 60
May 11 19:49:30 vgpm01 corosync[2086]:   [TOTEM ] Token has not been received in 3226 ms
May 11 19:49:31 vgpm01 pmxcfs[2045]: [status] notice: cpg_send_message retry 70
May 11 19:49:32 vgpm01 pmxcfs[2045]: [status] notice: cpg_send_message retry 80
May 11 19:49:33 vgpm01 pmxcfs[2045]: [status] notice: cpg_send_message retry 90
May 11 19:49:34 vgpm01 corosync[2086]:   [TOTEM ] Token has not been received in 6927 ms
May 11 19:49:34 vgpm01 pmxcfs[2045]: [status] notice: cpg_send_message retry 100
May 11 19:49:34 vgpm01 pmxcfs[2045]: [status] notice: cpg_send_message retried 100 times
May 11 19:49:34 vgpm01 pmxcfs[2045]: [status] crit: cpg_send_message failed: 6
May 11 19:49:35 vgpm01 pmxcfs[2045]: [status] notice: cpg_send_message retry 10
May 11 19:49:36 vgpm01 pmxcfs[2045]: [status] notice: cpg_send_message retry 20
May 11 19:49:37 vgpm01 pmxcfs[2045]: [status] notice: cpg_send_message retry 30
May 11 19:49:37 vgpm01 corosync[2086]:   [QUORUM] Sync members[3]: 1 2 3

Code:
--- vgpm02 syslog output ----
May 11 19:54:34 vgpm02 corosync[2197]:   [QUORUM] Sync members[3]: 1 2 3
May 11 19:54:34 vgpm02 corosync[2197]:   [QUORUM] Sync joined[1]: 1
May 11 19:54:34 vgpm02 corosync[2197]:   [QUORUM] Sync left[1]: 1
May 11 19:54:34 vgpm02 corosync[2197]:   [TOTEM ] A new membership (1.ff5c7) was formed. Members joined: 1
May 11 19:54:34 vgpm02 pmxcfs[2110]: [status] notice: cpg_send_message retry 10
May 11 19:54:35 vgpm02 pmxcfs[2110]: [status] notice: cpg_send_message retry 20
May 11 19:54:36 vgpm02 pmxcfs[2110]: [status] notice: cpg_send_message retry 30
May 11 19:54:37 vgpm02 pmxcfs[2110]: [status] notice: cpg_send_message retry 40
May 11 19:54:37 vgpm02 corosync[2197]:   [TOTEM ] Token has not been received in 3226 ms
May 11 19:54:38 vgpm02 pmxcfs[2110]: [status] notice: cpg_send_message retry 50
May 11 19:54:39 vgpm02 pmxcfs[2110]: [status] notice: cpg_send_message retry 60
May 11 19:54:40 vgpm02 pmxcfs[2110]: [status] notice: cpg_send_message retry 70
May 11 19:54:41 vgpm02 corosync[2197]:   [TOTEM ] Token has not been received in 6926 ms
May 11 19:54:41 vgpm02 pmxcfs[2110]: [status] notice: cpg_send_message retry 80
May 11 19:54:42 vgpm02 pmxcfs[2110]: [status] notice: cpg_send_message retry 90
May 11 19:54:43 vgpm02 pmxcfs[2110]: [status] notice: cpg_send_message retry 100
May 11 19:54:43 vgpm02 pmxcfs[2110]: [status] notice: cpg_send_message retried 100 times
May 11 19:54:43 vgpm02 pmxcfs[2110]: [status] crit: cpg_send_message failed: 6
May 11 19:54:44 vgpm02 pmxcfs[2110]: [status] notice: cpg_send_message retry 10

Code:
--- vgpm03 syslog output ----
May 11 19:53:35 vgpm03 corosync[2124]:   [TOTEM ] A new membership (1.ff55f) was formed. Members
May 11 19:53:38 vgpm03 corosync[2124]:   [TOTEM ] Token has not been received in 2738 ms
May 11 19:53:42 vgpm03 corosync[2124]:   [TOTEM ] Token has not been received in 6438 ms
May 11 19:53:45 vgpm03 corosync[2124]:   [QUORUM] Sync members[3]: 1 2 3
May 11 19:53:45 vgpm03 corosync[2124]:   [TOTEM ] A new membership (1.ff573) was formed. Members
May 11 19:53:48 vgpm03 corosync[2124]:   [TOTEM ] Token has not been received in 2738 ms
May 11 19:53:52 vgpm03 corosync[2124]:   [TOTEM ] Token has not been received in 6438 ms

Code:
--- vgpm04 syslog output ----
May 11 19:52:00 vgpm04 systemd[1]: Starting Proxmox VE replication runner...
May 11 19:52:00 vgpm04 pvesr[57417]: cfs-lock 'file-replication_cfg' error: no quorum!
May 11 19:52:00 vgpm04 systemd[1]: pvesr.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
May 11 19:52:00 vgpm04 systemd[1]: pvesr.service: Failed with result 'exit-code'.
May 11 19:52:00 vgpm04 systemd[1]: Failed to start Proxmox VE replication runner.
May 11 19:52:01 vgpm04 cron[1945]: (*system*vzdump) CAN'T OPEN SYMLINK (/etc/cron.d/vzdump)
May 11 19:52:02 vgpm04 corosync[31202]:   [QUORUM] Sync members[1]: 4
May 11 19:52:02 vgpm04 corosync[31202]:   [TOTEM ] A new membership (4.ff4ab) was formed. Members
May 11 19:52:02 vgpm04 corosync[31202]:   [QUORUM] Members[1]: 4
May 11 19:52:02 vgpm04 corosync[31202]:   [MAIN  ] Completed service synchronization, ready to provide service.
May 11 19:52:04 vgpm04 pveproxy[57414]: worker exit
May 11 19:52:04 vgpm04 pveproxy[57415]: worker exit
May 11 19:52:04 vgpm04 pveproxy[57416]: worker exit
May 11 19:52:05 vgpm04 pveproxy[1995]: worker 57414 finished
May 11 19:52:05 vgpm04 pveproxy[1995]: starting 1 worker(s)
May 11 19:52:05 vgpm04 pveproxy[1995]: worker 57426 started
May 11 19:52:05 vgpm04 pveproxy[1995]: worker 57415 finished
May 11 19:52:05 vgpm04 pveproxy[1995]: starting 1 worker(s)
May 11 19:52:05 vgpm04 pveproxy[1995]: worker 57416 finished
May 11 19:52:05 vgpm04 pveproxy[1995]: worker 57427 started
May 11 19:52:05 vgpm04 pveproxy[57426]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1917.
May 11 19:52:05 vgpm04 pveproxy[57427]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1917.
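
To compare how often each node reports a token timeout, the log excerpts above can be tallied per host. The following is only a sketch: the function name `token_timeouts_per_host` is made up for illustration, and it assumes the syslog layout shown above, where the host name is the fourth whitespace-separated field.

```shell
# Hypothetical helper: count "Token has not been received" lines per host.
# Reads syslog-style text on stdin, assuming the layout
# "Mon DD HH:MM:SS host process[pid]: message" (host is field 4).
token_timeouts_per_host() {
    awk '/Token has not been received/ { count[$4]++ }
         END { for (h in count) print h, count[h] }' | sort
}

# Example (not run here): feed it whatever log source is available, e.g.
#   token_timeouts_per_host < /var/log/syslog
```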
 
Hi,

Did you solve the issue? If not, could you please provide the output of the following commands from all nodes:
Bash:
cat /etc/pve/corosync.conf
cat /etc/pve/.members
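
If it is easier, both files can be gathered from every node in one pass. This is a minimal sketch, assuming root SSH access to each node still works and using the host names from this thread; the function name `collect_cluster_info` is made up for illustration. If SSH between nodes is not working, running the two `cat` commands locally on each node gives the same information.

```shell
# Hypothetical helper: print corosync.conf and .members from each node given as an argument.
# Assumes "ssh root@<node>" works without further options.
collect_cluster_info() {
    for node in "$@"; do
        echo "=== $node ==="
        # ';' rather than '&&' so the second file is still attempted if the first is missing
        ssh "root@$node" 'cat /etc/pve/corosync.conf; cat /etc/pve/.members'
    done
}

# Example (not run here):
#   collect_cluster_info vgpm01 vgpm02 vgpm03 vgpm04
```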