pvecm quorum issue

Denary

New Member
Sep 21, 2024
Hi

I have a two-node cluster with a QDevice running on a Synology NAS VM. I've been replacing both nodes because both were starting to cause problems.

Old1 / Old2
New1 / New2

I disconnected "old1", added "new1" to the cluster, and checked that everything was working fine, but I forgot to check that corosync-qdevice was working on the new node. I also made the mistake of removing "old2" from the cluster before it was shut down, so it had not failed the quorum checks before it was removed.

Now I can't add "new2" because the cluster is not quorate. I can't remove the QDevice, nor can I add it back with the force option ("pvecm qdevice setup x.x.x.x --force"). I've tried "pvecm expected 1", but it doesn't actually change the expected votes, and the cluster stays in a locked-down state.

Is there another way to force the cluster to release its lock? The cluster only consists of one node right now, so I'm not concerned about split brain.

Both new nodes are on 8.2.7
Old nodes were on 8.2.4
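
For readers following along, the "not quorate" state comes down to simple vote arithmetic. The sketch below assumes the usual majority rule (more than half of the expected votes must be present), with one vote per node and one for the QDevice. The function names are made up for illustration, and the actual expected-vote count of this cluster was never posted.

```shell
#!/bin/sh
# Illustrative only: quorum_threshold and is_quorate are hypothetical helpers,
# not part of pvecm or corosync.

# Majority quorum: strictly more than half of the expected votes.
quorum_threshold() {
    expected_votes=$1
    echo $(( expected_votes / 2 + 1 ))
}

# $1 = votes currently present, $2 = expected votes
is_quorate() {
    [ "$1" -ge "$(quorum_threshold "$2")" ]
}

# Two nodes plus a QDevice: 3 expected votes, so 2 are needed.
quorum_threshold 3

# A single node on its own while 3 votes are still expected.
if is_quorate 1 3; then
    echo "quorate"
else
    echo "not quorate"
fi
```

Under these assumptions, a lone node without a working QDevice vote stays below the threshold until the expected votes actually drop to 1.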
 
Last edited:
So now you have two new nodes, new1 and new2, but you can't add new2. So let's maybe get the QD working for new1 first?

Can you post your /etc/corosync/corosync.conf (not the /etc/pve one) from both nodes?

Can you post (from new1):
Code:
corosync-cfgtool -s
corosync-qdevice-tool -s
pvecm status
 
It's been a few hours.

Because all VMs and CTs are running on iSCSI and "new1" was in a locked state with all VMs down, I was able to just copy the .conf files of all VMs and CTs onto the new box and test-start them there without any changes.

I've now wiped and reinstalled Proxmox on "new1" and joined it to a new cluster created on "new2", and all is working again. A bit extreme, but avoiding data loss was the main concern.

Thanks for the response though! Ideally I'll just not get in this situation again...

No worries! Good that you have it working then!

(I can only guess, but I think you basically did not have the QD set up on new1 when you added it, or the corosync-qdevice daemon was not installed/running there.)
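
For anyone who lands here with the same symptoms, a quick way to test that guess is to ask systemd whether the daemon is active on the node. This is only a sketch: it assumes the service unit is named corosync-qdevice (as on a stock Debian/Proxmox install), and check_qdevice_daemon is a made-up helper name.

```shell
#!/bin/sh
# Illustrative only: check_qdevice_daemon is a hypothetical helper.
# Assumes the systemd unit is called "corosync-qdevice".
check_qdevice_daemon() {
    if ! command -v systemctl >/dev/null 2>&1; then
        echo "systemctl not available on this host"
        return 2
    fi
    if systemctl is-active --quiet corosync-qdevice; then
        echo "corosync-qdevice is running"
    else
        echo "corosync-qdevice is not running (or not installed)"
        return 1
    fi
}

# Run the check; "|| true" keeps a strict shell from aborting on a non-zero result.
check_qdevice_daemon || true
```

If it reports that the daemon is not running, installing the corosync-qdevice package on that node and re-checking "pvecm status" would be the obvious next step before trying to add or remove the QDevice again.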