[SOLVED] Cluster Quorum issue

manuelkamp

May 10, 2022
Hi, I have a two-node cluster with PBS as the qdevice. Recently I had issues with the cluster crashing when restarting one node, so I looked at pvecm status and saw that one node has the status "NR". As I understand, this means it is not ready. But how do I fix that?

Code:
Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice

Membership information
----------------------
    Nodeid      Votes    Qdevice Name
0x00000001          1    A,V,NMW 10.0.1.2 (local)
0x00000002          1         NR 10.0.1.4
0x00000000          1            Qdevice
 
As I understand, this means it is not ready
AFAIK this means "Not registered". Link.

I believe this means your Qdevice has not been registered with that specific node.

Maybe the best approach would be to remove that Qdevice & then add it again, as detailed in the above link.

Can you SSH between all the nodes & Qdevice (every combination)?
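
To make "every combination" concrete, here is a minimal Python sketch that only lists the SSH checks to run, one per ordered pair of hosts. The first two addresses are the nodes from the output above; the third is a hypothetical placeholder for the Qdevice address, and the root user and the 5-second timeout are assumptions as well. It prints the commands and does not execute anything.

```python
from itertools import permutations


def ssh_check_commands(hosts, user="root", timeout=5):
    """Return (source_host, command) tuples covering every ordered host pair."""
    checks = []
    for src, dst in permutations(hosts, 2):
        # BatchMode makes ssh fail instead of prompting for a password,
        # so a missing key shows up as an error rather than a hang.
        cmd = f"ssh -o BatchMode=yes -o ConnectTimeout={timeout} {user}@{dst} true"
        checks.append((src, cmd))
    return checks


# 10.0.1.2 and 10.0.1.4 are the nodes from the pvecm output;
# 10.0.1.10 is a hypothetical Qdevice address, replace it with the real one.
HOSTS = ["10.0.1.2", "10.0.1.4", "10.0.1.10"]

for src, cmd in ssh_check_commands(HOSTS):
    print(f"run on {src}: {cmd}")
```

With three hosts this gives six checks. Any command that prompts or fails points at the pair whose keys are worth looking at first.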
 
I have connection between all devices, but please note that the "NR" is on a node, not the qdevice (pbs).
 
When you deploy a QDevice from a PVE cluster, a public key is installed on the QDevice so it can talk to every node in the cluster. For some reason, something failed when you deployed the QDevice. Make sure apt install corosync-qdevice has been run on all PVE nodes, then remove the QDevice with pvecm qdevice remove and add it back with pvecm qdevice setup <QDEVICE-IP>. Triple-check the output of this last command, as errors tend to get lost in its long output.
 
but please note that the "NR" is on a node, not the qdevice (pbs).
If you look at the output in your post, the column is titled "Qdevice": this info describes the Qdevice as seen from the specific node being listed. So as far as the node with the ID 0x00000002 is concerned, the Qdevice is "NR": not registered.
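
As a rough illustration of reading that column per node, the Python sketch below parses the membership table from the first post and lists the nodes whose Qdevice field is "NR". The function names are made up for this example. The "not registered" reading of NR is the one given in this thread; the meanings attached to the other flags are my reading of the corosync-quorumtool documentation and should be treated as an assumption.

```python
import re

# Qdevice column flags. "NR" follows the reading given in this thread; the
# other descriptions are an assumption based on the corosync-quorumtool docs.
QDEVICE_FLAGS = {
    "A": "alive", "NA": "not alive",
    "V": "vote", "NV": "no vote",
    "MW": "master wins", "NMW": "not master wins",
    "NR": "not registered",
}

# Membership section copied from the pvecm status output in the first post.
SAMPLE = """\
    Nodeid      Votes    Qdevice Name
0x00000001          1    A,V,NMW 10.0.1.2 (local)
0x00000002          1         NR 10.0.1.4
0x00000000          1            Qdevice
"""

ROW = re.compile(r"^\s*(0x[0-9a-fA-F]+)\s+(\d+)\s+(.*)$")


def parse_membership(status_text):
    """Parse membership rows into dicts with nodeid, votes, flags and name."""
    rows = []
    for line in status_text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header, separators and blank lines
        rest = match.group(3).split()
        flags = []
        # The flag field is optional (the Qdevice row itself has none), so
        # only treat the first token as flags if every part is a known flag.
        if rest and all(part in QDEVICE_FLAGS for part in rest[0].split(",")):
            flags = rest[0].split(",")
            rest = rest[1:]
        rows.append({
            "nodeid": match.group(1),
            "votes": int(match.group(2)),
            "flags": flags,
            "name": " ".join(rest),
        })
    return rows


def unregistered_nodes(rows):
    """Return the names of nodes that report the Qdevice as NR."""
    return [row["name"] for row in rows if "NR" in row["flags"]]


print(unregistered_nodes(parse_membership(SAMPLE)))  # prints ['10.0.1.4']
```

The same function could be fed the live output of pvecm status on each node, which makes it easy to see whether the flag changes after the Qdevice is removed and set up again.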
 
Removing and re-adding the qdevice did not solve the issue. The only thing that helped was installing Proxmox fresh on both nodes, rebuilding the cluster with the two nodes, and adding the qdevice again. (It was no problem at all, since all CT/VM data is in ZFS pools, which I just imported back again.) I think the reason behind all this was that the OS disk on my second node failed a few weeks ago, and I just inserted a new M.2 and installed Proxmox fresh on this node, using the same machine name as before.
 
Good, looks like you got it solved. Maybe mark this thread as solved. At the top of the thread, choose the Edit thread button, then from the (no prefix) dropdown choose Solved.
 
