Node can't join existing Cluster

magic_tom02

New Member
Apr 2, 2021
Hey Proxmox Community,
I have been using my cluster for a few weeks and everything worked fine.
Last week I added a new node (cluster-server-4) without any problems.
Today I wanted to add another node (cluster-server-5). It joins the cluster but isn't available.
[Attached screenshot: 1617377818836.png]
I have already updated all nodes to the newest Proxmox version and rebooted them all, but I still get the same error.
The system log shows "corosync[1681]: [TOTEM ] Token has not been received", and I have no clue why the token can't be received. The new node is online and works fine standalone.
This is the complete system log of the main node:
Apr 02 15:46:20 cluster-server-1 sshd[19716]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=45.162.4.175
Apr 02 15:46:22 cluster-server-1 pvedaemon[1726]: <root@pam> successful auth for user 'root@pam'
Apr 02 15:46:22 cluster-server-1 pvedaemon[1726]: <root@pam> adding node cluster-server-5 to cluster
Apr 02 15:46:22 cluster-server-1 pmxcfs[1523]: [dcdb] notice: wrote new corosync config '/etc/corosync/corosync.conf' (version = 11)
Apr 02 15:46:22 cluster-server-1 corosync[1681]: [CFG ] Config reload requested by node 1
Apr 02 15:46:22 cluster-server-1 corosync[1681]: [TOTEM ] Configuring link 0
Apr 02 15:46:22 cluster-server-1 corosync[1681]: [TOTEM ] Configured link number 0: local addr: XXXX, port=5405
Apr 02 15:46:22 cluster-server-1 corosync[1681]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Apr 02 15:46:22 cluster-server-1 corosync[1681]: [KNET ] host: host: 5 has no active links
Apr 02 15:46:22 cluster-server-1 corosync[1681]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Apr 02 15:46:22 cluster-server-1 corosync[1681]: [KNET ] host: host: 5 has no active links
Apr 02 15:46:22 cluster-server-1 corosync[1681]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Apr 02 15:46:22 cluster-server-1 corosync[1681]: [KNET ] host: host: 5 has no active links
Apr 02 15:46:22 cluster-server-1 pmxcfs[1523]: [status] notice: update cluster info (cluster name lit-hosting, version = 11)
Apr 02 15:46:23 cluster-server-1 sshd[19716]: Failed password for invalid user ubuntu from 45.162.4.175 port 35046 ssh2
Apr 02 15:46:24 cluster-server-1 sshd[19716]: Received disconnect from 45.162.4.175 port 35046:11: Bye Bye [preauth]
Apr 02 15:46:24 cluster-server-1 sshd[19716]: Disconnected from invalid user ubuntu 45.162.4.175 port 35046 [preauth]
Apr 02 15:46:25 cluster-server-1 sshd[19714]: Connection closed by 120.53.233.146 port 54760 [preauth]
Apr 02 15:46:27 cluster-server-1 corosync[1681]: [KNET ] rx: host: 5 link: 0 is up
Apr 02 15:46:27 cluster-server-1 corosync[1681]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Apr 02 15:46:27 cluster-server-1 corosync[1681]: [KNET ] pmtud: PMTUD link change for host: 5 link: 0 from 469 to 1397
Apr 02 15:46:31 cluster-server-1 corosync[1681]: [TOTEM ] Token has not been received in 4625 ms
Apr 02 15:46:31 cluster-server-1 sshd[19739]: Invalid user ubuntu from 119.29.98.53 port 54596
Apr 02 15:46:31 cluster-server-1 sshd[19739]: pam_unix(sshd:auth): check pass; user unknown
Apr 02 15:46:31 cluster-server-1 sshd[19739]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=119.29.98.53
Apr 02 15:46:32 cluster-server-1 sshd[19739]: Failed password for invalid user ubuntu from 119.29.98.53 port 54596 ssh2
Apr 02 15:46:33 cluster-server-1 sshd[19760]: Invalid user user from 128.199.110.226 port 47986
Apr 02 15:46:33 cluster-server-1 sshd[19760]: pam_unix(sshd:auth): check pass; user unknown
Apr 02 15:46:33 cluster-server-1 sshd[19760]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=128.199.110.226
Apr 02 15:46:33 cluster-server-1 sshd[19739]: Received disconnect from 119.29.98.53 port 54596:11: Bye Bye [preauth]
Apr 02 15:46:33 cluster-server-1 sshd[19739]: Disconnected from invalid user ubuntu 119.29.98.53 port 54596 [preauth]
Apr 02 15:46:35 cluster-server-1 sshd[19760]: Failed password for invalid user user from 128.199.110.226 port 47986 ssh2
Apr 02 15:46:36 cluster-server-1 corosync[1681]: [TOTEM ] Token has not been received in 9577 ms
Apr 02 15:46:36 cluster-server-1 sshd[19760]: Received disconnect from 128.199.110.226 port 47986:11: Bye Bye [preauth]
Apr 02 15:46:36 cluster-server-1 sshd[19760]: Disconnected from invalid user user 128.199.110.226 port 47986 [preauth]
Apr 02 15:46:39 cluster-server-1 corosync[1681]: [QUORUM] Sync members[4]: 1 2 3 4
Apr 02 15:46:39 cluster-server-1 corosync[1681]: [TOTEM ] A new membership (1.8be) was formed. Members
Apr 02 15:46:40 cluster-server-1 sshd[19762]: Invalid user oracle from 81.69.251.177 port 41460
Apr 02 15:46:40 cluster-server-1 sshd[19762]: pam_unix(sshd:auth): check pass; user unknown
Apr 02 15:46:40 cluster-server-1 sshd[19762]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=81.69.251.177
Apr 02 15:46:41 cluster-server-1 pmxcfs[1523]: [status] notice: cpg_send_message retry 10
Apr 02 15:46:42 cluster-server-1 sshd[19762]: Failed password for invalid user oracle from 81.69.251.177 port 41460 ssh2
Apr 02 15:46:42 cluster-server-1 pmxcfs[1523]: [status] notice: cpg_send_message retry 20
Apr 02 15:46:43 cluster-server-1 corosync[1681]: [TOTEM ] Token has not been received in 3713 ms
Apr 02 15:46:43 cluster-server-1 sshd[19762]: Received disconnect from 81.69.251.177 port 41460:11: Bye Bye [preauth]
Apr 02 15:46:43 cluster-server-1 sshd[19762]: Disconnected from invalid user oracle 81.69.251.177 port 41460 [preauth]
Apr 02 15:46:43 cluster-server-1 pmxcfs[1523]: [status] notice: cpg_send_message retry 30
Apr 02 15:46:44 cluster-server-1 pmxcfs[1523]: [status] notice: cpg_send_message retry 40
Apr 02 15:46:45 cluster-server-1 pmxcfs[1523]: [status] notice: cpg_send_message retry 50
Apr 02 15:46:46 cluster-server-1 sshd[19781]: Invalid user vyatta from 124.28.218.130 port 25636
Apr 02 15:46:46 cluster-server-1 sshd[19781]: pam_unix(sshd:auth): check pass; user unknown
Apr 02 15:46:46 cluster-server-1 sshd[19781]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=124.28.218.130
Apr 02 15:46:46 cluster-server-1 pmxcfs[1523]: [status] notice: cpg_send_message retry 60
Apr 02 15:46:48 cluster-server-1 pmxcfs[1523]: [status] notice: cpg_send_message retry 70
Apr 02 15:46:48 cluster-server-1 corosync[1681]: [TOTEM ] Token has not been received in 8665 ms
Apr 02 15:46:48 cluster-server-1 sshd[19781]: Failed password for invalid user vyatta from 124.28.218.130 port 25636 ssh2
Apr 02 15:46:48 cluster-server-1 sshd[19781]: Received disconnect from 124.28.218.130 port 25636:11: Bye Bye [preauth]
Apr 02 15:46:48 cluster-server-1 sshd[19781]: Disconnected from invalid user vyatta 124.28.218.130 port 25636 [preauth]
Apr 02 15:46:49 cluster-server-1 pmxcfs[1523]: [status] notice: cpg_send_message retry 80
Apr 02 15:46:50 cluster-server-1 pmxcfs[1523]: [status] notice: cpg_send_message retry 90
Apr 02 15:46:51 cluster-server-1 pmxcfs[1523]: [status] notice: cpg_send_message retry 100
Apr 02 15:46:51 cluster-server-1 pmxcfs[1523]: [status] notice: cpg_send_message retried 100 times
Apr 02 15:46:51 cluster-server-1 pmxcfs[1523]: [status] crit: cpg_send_message failed: 6
Apr 02 15:46:51 cluster-server-1 corosync[1681]: [QUORUM] Sync members[4]: 1 2 3 4
Apr 02 15:46:51 cluster-server-1 corosync[1681]: [TOTEM ] A new membership (1.8ca) was formed. Members
Apr 02 15:46:52 cluster-server-1 pmxcfs[1523]: [status] notice: cpg_send_message retry 10
Apr 02 15:46:53 cluster-server-1 pmxcfs[1523]: [status] notice: cpg_send_message retry 20
Apr 02 15:46:54 cluster-server-1 pmxcfs[1523]: [status] notice: cpg_send_message retry 30
Apr 02 15:46:54 cluster-server-1 corosync[1681]: [TOTEM ] Token has not been received in 3713 ms

The other thing is that since I tried to join the node, the cluster server loads very slowly and shows this in the cluster section:
[Attached screenshot: 1617378282473.png]

I didn't find anything about this problem, so I hope someone has an idea.


Thanks for replying :)
Tom
 
Apr 02 15:46:22 cluster-server-1 corosync[1681]: [KNET ] host: host: 5 has no active links

Looks like Corosync cannot communicate with node 5.
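
The pasted log is full of unrelated sshd login attempts, which makes the Corosync messages hard to follow. As a rough sketch (not a Proxmox tool; the function names here are made up for illustration), something like this could pull out only the corosync lines and the reported token timeouts from a log in the same format as the one quoted above:

```python
import re

# Matches journal lines in the format pasted in this thread, e.g.
# "Apr 02 15:46:22 cluster-server-1 corosync[1681]: [KNET ] host: host: 5 has no active links"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"corosync\[\d+\]: \[(?P<tag>[A-Z]+)\s*\] (?P<msg>.*)$"
)
TOKEN_RE = re.compile(r"Token has not been received in (\d+) ms")


def corosync_events(lines):
    """Yield (timestamp, tag, message) for corosync lines; skip everything else."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            yield match.group("ts"), match.group("tag"), match.group("msg")


def token_timeouts_ms(lines):
    """Return the millisecond values from TOTEM 'Token has not been received' lines."""
    timeouts = []
    for _, tag, msg in corosync_events(lines):
        found = TOKEN_RE.search(msg)
        if tag == "TOTEM" and found:
            timeouts.append(int(found.group(1)))
    return timeouts


# Three lines copied from the log in the first post.
SAMPLE = [
    "Apr 02 15:46:22 cluster-server-1 corosync[1681]: [KNET ] host: host: 5 has no active links",
    "Apr 02 15:46:23 cluster-server-1 sshd[19716]: Failed password for invalid user ubuntu from 45.162.4.175 port 35046 ssh2",
    "Apr 02 15:46:31 cluster-server-1 corosync[1681]: [TOTEM ] Token has not been received in 4625 ms",
]

for ts, tag, msg in corosync_events(SAMPLE):
    print(ts, tag, msg)
print(token_timeouts_ms(SAMPLE))
```

To run it on a real log, read the lines from a saved file instead of SAMPLE. It only filters and does not diagnose anything.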

  • Are the IPs used for Corosync all in the right subnet for node 5 compared to the other nodes?
  • Can you ping node 5 in these subnets from the other nodes and vice versa?
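
For the subnet question, a quick way to compare addresses is Python's standard ipaddress module. This is only a sketch: the addresses and the /24 network below are invented (the real ones are masked as XXXX in the log), and the function name is hypothetical. The actual values would be the ring0_addr entries in /etc/corosync/corosync.conf.

```python
import ipaddress


def nodes_outside_subnet(subnet, node_addrs):
    """Return the nodes whose Corosync address is not inside the given subnet."""
    network = ipaddress.ip_network(subnet, strict=False)
    return {
        name: addr
        for name, addr in node_addrs.items()
        if ipaddress.ip_address(addr) not in network
    }


# Hypothetical addresses for illustration only; replace with the real link 0 addresses.
EXAMPLE_NODES = {
    "cluster-server-1": "10.0.0.1",
    "cluster-server-4": "10.0.0.4",
    "cluster-server-5": "10.0.1.5",
}

print(nodes_outside_subnet("10.0.0.0/24", EXAMPLE_NODES))
# prints {'cluster-server-5': '10.0.1.5'}
```

An empty result would mean all addresses share the subnet. The ping test in both directions is still needed, since a matching subnet does not prove the nodes can actually reach each other.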
 
