Suggesting a patch for pvecm? (adding qdevice when nodes are odd, but votes are even)

bjodah

New Member
Feb 25, 2024
Hi all,

First off: I'm new to Proxmox, and I only use this in a "homelab" setting. Proxmox (the software), this forum and its members are true gems, so big thanks all around!

I just finished setting up a cluster with 3 nodes, where 2 of the nodes are expected to be on and off intermittently (for energy considerations). I settled on giving the always-on node 2 votes and the remaining two nodes one vote each. This gives me an even number of votes, which is why I added a dedicated QDevice (on some surplus low-power hardware).
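To make the vote arithmetic explicit, here is a minimal Python sketch. It assumes the usual majority rule (quorum is a strict majority of the expected votes, i.e. total // 2 + 1). The function names and node labels are made up for illustration and are not part of pvecm or corosync.

```python
def quorum_threshold(total_votes: int) -> int:
    """Smallest number of votes that is a strict majority of total_votes."""
    return total_votes // 2 + 1


def summarize(node_votes: dict[str, int], qdevice_votes: int = 0) -> dict[str, int]:
    """Total votes and quorum threshold for a set of nodes plus an optional QDevice."""
    total = sum(node_votes.values()) + qdevice_votes
    return {"total": total, "quorum": quorum_threshold(total)}


# Hypothetical labels for the three nodes described above.
nodes = {"always-on": 2, "part-time-a": 1, "part-time-b": 1}

without_qdevice = summarize(nodes)
with_qdevice = summarize(nodes, qdevice_votes=1)

print(without_qdevice)  # {'total': 4, 'quorum': 3}
print(with_qdevice)     # {'total': 5, 'quorum': 3}
```

With 4 votes the always-on node alone (2 votes) cannot reach the quorum of 3. With the QDevice's extra vote, the always-on node plus the QDevice hold exactly 3 of 5 votes.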

However, I ran into the following problem when I tried to set this up:
1. An odd number of *nodes* requires me to add the `--force` flag: `pvecm qdevice setup --force true 172.16.x.y`
2. This in turn selects the lms algorithm: https://github.com/proxmox/pve-clus...396b2a9b72a288771d8/src/PVE/CLI/pvecm.pm#L134
3. which in turn omits the "votes:" directive (I think) from the device block: https://github.com/proxmox/pve-clus...a288771d8/src/PVE/CLI/pvecm.pm#L233C6-L233C60

So when pvecm tries to restart the corosync-qdevice service on the nodes, it fails, because I had previously configured one of the nodes with more than one vote. In that case an explicit "votes:" parameter is expected in the device block, as per the error message in the logs of the failed service.

I simply added the missing `votes:` field to quorum.device and scp'ed the patched /etc/corosync/corosync.conf file to the nodes. After this I could edit /etc/pve/corosync.conf to match my updates, and then restart the corosync-qdevice service on all my nodes.

This let me achieve the (albeit unconventional) setup I was looking for:
Code:
Cluster information
-------------------
Name:             kluster
Config Version:   11
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Sun Feb 25 13:51:57 2024
Quorum provider:  corosync_votequorum
Nodes:            3
Node ID:          0x00000001
Ring ID:          1.1f
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   5
Highest expected: 5
Total votes:      5
Quorum:           3 
Flags:            Quorate Qdevice

Membership information
----------------------
    Nodeid      Votes    Qdevice Name
0x00000001          2    A,V,NMW 172.16.42.2 (local)
0x00000002          1    A,V,NMW 172.16.42.4
0x00000003          1    A,V,NMW 172.16.42.5
0x00000000          1            Qdevice

Is there a reason why the "votes:" field is omitted for "lms"? I guess last man standing works with a dynamic number of nodes. But then the question is why "lms" is chosen when the `--force` flag is given. The fifty-fifty split "ffsplit" should perhaps still be an option? I also have the feeling that requiring the `--force` flag for an odd number of nodes is less precise than requiring it for an odd number of votes. Then again, I'm very new to this field, so please forgive my ignorance if this sounds like misguided ramblings more than anything else.
 
I am also going to reply here (apart from [1]), since I think you overlook that lms gives the QD votes equal to members - 1; hence the other requirements. But I think the general topic from [1] is relevant: why one is "forced" into lms is beyond me.
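A minimal Python sketch of that arithmetic, taking the statement above at face value (with lms and no explicit "votes:", the QDevice contributes members - 1 votes) and assuming every node has exactly one vote. The function name is hypothetical and nothing here is taken from the corosync source.

```python
def lms_vote_layout(members: int) -> dict[str, int]:
    """Vote layout for `members` one-vote nodes plus an lms QDevice.

    Assumption (from the post above): the QDevice gets members - 1 votes.
    """
    qdevice_votes = members - 1
    total = members + qdevice_votes
    quorum = total // 2 + 1  # strict majority of all votes
    return {
        "qdevice_votes": qdevice_votes,
        "total": total,
        "quorum": quorum,
        # Votes held by a single surviving node together with the QDevice.
        "last_node_plus_qdevice": 1 + qdevice_votes,
    }


layout = lms_vote_layout(3)
print(layout)
# {'qdevice_votes': 2, 'total': 5, 'quorum': 3, 'last_node_plus_qdevice': 3}
```

Under these assumptions a single surviving node plus the QDevice always lands exactly on the quorum threshold. That bookkeeping no longer holds once a node carries more than one vote, which may be why an explicit value is demanded in that case.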

EDIT: Do not use more than 1 vote per node. For what you want, the proper way to do it is to use a tie breaker.

More importantly, it's NOT DOCUMENTED at all!

[1] https://forum.proxmox.com/threads/q...from-perfectly-safe-setup.139809/#post-638311
 
Oh yeah, and two pieces of advice:

1. Your title "Suggesting a patch for pvecm? (adding qdevice when nodes are odd, but votes are even)" will get a response:
=> Use Bugzilla, where all such bugs/feature requests sleep like Snow White;

2. Your intro:
I only use this is a "homelab" setting. Proxmox (the software)
=> implies that the ROI of staff taking an interest in this has dropped below zero.

And I suspect there's the Catch 22 issue as always:

PVE does not encourage this setup, hence they do not support it, hence they hardcoded something even someone who did that might have forgotten why it was originally so, hence it's not really documented, hence no one else can append this to the documentation, hence no one runs this in production, hence there's no ROI to support it, hence ... read from the beginning.


But! They are nice people. I mean it! :D
 
