Cluster broken after add node failed

Arne Kröger

Member
Jan 31, 2019
Hello,

We have a 16-node cluster. While adding the last node there was a timeout; after that the cluster broke and the nodes can no longer see each other.

On every server, pvecm nodes lists only the local node, and corosync is using about 300% CPU.

pvecm status shows that the node is not quorate:

Code:
root@cluster-a1:~# pvecm status
Cluster information
-------------------
Name:             cluster-a
Config Version:   24
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Sat Jun 26 00:32:18 2021
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          0x00000001
Ring ID:          1.665
Quorate:          No

Votequorum information
----------------------
Expected votes:   16
Highest expected: 16
Total votes:      1
Quorum:           9 Activity blocked
Flags:           

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 XX.XXX.X.XXX (local)
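
For readers wondering where the "Quorum: 9" in the output comes from: with default votequorum settings the threshold is a simple majority of the expected votes. Below is a minimal sketch of that arithmetic, assuming one vote per node and no special options such as two_node; the function names are made up for illustration and are not part of pvecm or corosync.

```shell
#!/bin/sh
# quorum_threshold EXPECTED_VOTES: smallest vote count that is a strict majority.
quorum_threshold() {
    echo $(( $1 / 2 + 1 ))
}

# is_quorate TOTAL_VOTES EXPECTED_VOTES: succeeds when the majority is reached.
is_quorate() {
    [ "$1" -ge "$(quorum_threshold "$2")" ]
}

# Values taken from the pvecm status output above: 16 expected votes, 1 total vote.
quorum_threshold 16
if is_quorate 1 16; then
    echo "quorate"
else
    echo "not quorate"
fi
```

With 16 expected votes the threshold is 9, so a node that only sees its own single vote stays blocked until at least eight other nodes rejoin.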

The nodes can see each other in the network (ping).

I already restarted the pve-cluster and corosync services, but this did not help.

I can see the following entries flooding the syslog:

corosync[128496]: [KNET ] loopback: send local failed. error=Resource temporarily unavailable

Can you help us out here?

Thanks,
Arne
 
Please post the Corosync config, the output of journalctl -u pve-cluster, and the output of pveversion -v.

Have you checked the /etc/hosts file?
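
In case it helps anyone landing here later, the requested information can be gathered in one go. This is only a sketch: the helper names are invented for this example, /etc/pve/corosync.conf is assumed to be the cluster-wide config path (it may be unreadable if pmxcfs is not running), and the 200-line journal limit is an arbitrary choice.

```shell
#!/bin/sh
# run_section TITLE COMMAND...: print a header, then the command output,
# or a note when the command is missing or fails on this host.
run_section() {
    title=$1
    shift
    echo "===== $title ====="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || echo "(command failed with exit status $?)"
    else
        echo "(command not found: $1)"
    fi
    echo
}

# collect_cluster_diag: gather the items asked for in this thread.
collect_cluster_diag() {
    run_section "corosync config" cat /etc/pve/corosync.conf
    run_section "/etc/hosts" cat /etc/hosts
    run_section "pve-cluster journal" journalctl -u pve-cluster --no-pager -n 200
    run_section "package versions" pveversion -v
}
```

Running `collect_cluster_diag > diag.txt` on one of the affected nodes produces a single file that can be pasted into a reply.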
 
Hi Moayad,

We found the problem. The cluster which was marked as totem for corosync was out of disk space. The space had been used up by the mass of log entries caused by network problems during the node add.

After freeing some space and manually removing the broken node from the cluster, the nodes came back.
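
The post does not say which commands were used, so the following is only a sketch of how one might check for the same condition, assuming the root filesystem and /var/log are the places worth looking at. The function names and the 90% limit are illustrative choices.

```shell
#!/bin/sh
# fs_usage_pct PATH: print the used percentage of the filesystem holding PATH.
fs_usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# warn_if_full PATH LIMIT: warn when usage is at or above LIMIT percent.
warn_if_full() {
    pct=$(fs_usage_pct "$1")
    if [ "$pct" -ge "$2" ]; then
        echo "WARNING: $1 is ${pct}% full"
    else
        echo "OK: $1 is ${pct}% full"
    fi
}

# Check the locations that a log flood would typically fill (assumed paths).
for path in / /var/log; do
    if [ -d "$path" ]; then
        warn_if_full "$path" 90
    fi
done
```

If the systemd journal turns out to be the culprit, `journalctl --disk-usage` shows its size and `journalctl --vacuum-size=500M` trims it (the size is an example value). The usual command for removing a node is `pvecm delnode <nodename>`, run from a remaining node; whether that is what was done here is not stated.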

We're still checking a few additional things, but the cluster is back up.

Thanks for your help,
Arne
 