QDevice w/ odd # of nodes - docs discourage a perfectly safe setup?

esi_y
Nov 29, 2023
There's a great deal of misleading argumentation to be found in the current official PVE docs on QDevices [1], and then some conscious effort to take it even further.

Under "Supported setups" [1] the following is advised (emphasis mine):

We support QDevices for clusters with an even number of nodes and recommend it for 2 node clusters, if they should provide higher availability. For clusters with an odd node count, we currently discourage the use of QDevices.

It continues with an absurd piece of reasoning (focusing only on the odd, i.e. "discouraged", setup), namely that in such a case the "QDevice provides (N-1) votes" and therefore the "QDevice acts almost as a single point of failure" (emphasis mine).

This is, of course, blatantly wrong: it mixes up the even/odd node scenarios with the QD algorithms (ffsplit and lms) - and ffsplit is the default [2]. In ffsplit, the QD definitely does not have N-1 votes; it has exactly 1 vote, whether the number of nodes is even or odd - except, as it turns out, when it comes to PVE.
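
To make the arithmetic concrete, here is a minimal sketch of my own (not from the docs), following the vote definitions from the corosync-qdevice man page [2]:
Code:
# Sketch of QDevice vote arithmetic per corosync-qdevice(8) [2]:
# ffsplit always casts exactly 1 vote; lms casts N-1 votes.

def qdevice_votes(n_nodes: int, algorithm: str) -> int:
    if algorithm == "ffsplit":
        return 1            # one vote, regardless of node count
    if algorithm == "lms":
        return n_nodes - 1  # "last man standing"
    raise ValueError(f"unknown algorithm: {algorithm}")

for n in (2, 3, 4, 5):
    for algo in ("ffsplit", "lms"):
        total = n + qdevice_votes(n, algo)
        quorum = total // 2 + 1
        print(f"{n} nodes + QD ({algo}): {total} total votes, quorum {quorum}")

For a 3-node cluster under ffsplit this gives 4 total votes with a quorum of 3: lose the QD alone and the 3 nodes remain quorate, so it is hardly a single point of failure.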

What PVE does is, first, create the impression that a QD cannot be installed for an odd number of nodes by throwing [3] a tantrum:
Code:
Clusters with an odd node count are not officially supported!

What it does not advise is that there is a --force switch, though that one at least is (somewhat) in the docs [4]:
Code:
--force <boolean>
Do not throw error on possible dangerous operations.

So it is unclear how this is a dangerous operation, but when executed with --force, PVE explicitly changes the algo to lms (without advising of it this time - nor is it mentioned anywhere in the docs). In that case the QD would indeed get N-1 votes, except this has nothing to do with an odd number of nodes, let alone with the default setup. Ironically, there's no manual switch to tell pvecm which algo one wants.
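
For completeness, a sketch of the invocation this boils down to (1.2.3.4 stands in for the QD host, matching the config further below):
Code:
# Per [4], setup on an odd-node cluster only proceeds with --force; the
# setup script then (silently) configures lms rather than the ffsplit default.
pvecm qdevice setup 1.2.3.4 --force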

And finally, going back to the docs to top it off:

The fact that all but one node plus QDevice may fail sounds promising at first, but this may result in a mass recovery of HA services, which could overload the single remaining node. Furthermore, a Ceph server will stop providing services if only ((N-1)/2) nodes or less remain online.

First of all, HA has nothing to do with corosync (there are probably more users running clusters without HA than with it), and the fact that the HA stack is immature when it comes to CRS is not the fault of corosync. Even where HA is used (on however many remaining nodes), it is clearly known to the user - from observing the behaviour of HA even in the case of a single node failure - that it may overload the surviving nodes; and this piece of advice bears no relation to lms, let alone to the QD as such, completely irrespective of whether the cluster has an odd or even number of nodes.

The Ceph part can of course be accounted for by setting expected to 2, but again, this is out of scope and has nothing to do with a QD setup on an odd number of nodes - it has to do with lms, which was brought in by the PVE script in the first place.
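
Presumably "setting expected to 2" refers to corosync's expected votes, which pvecm can set on a surviving node (my reading, for reference):
Code:
# Tell corosync a new value of expected votes (see pvecm(1) [4]):
pvecm expected 2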

I do note that one is free to decide for themselves, per the conclusion of the same paragraph of the docs [1]; however, currently literally everything is done to prevent the default QD setup for an odd number of nodes - you cannot get it unless you manually override what the PVE scripting did, so as to end up with:
Code:
quorum {
  device {
    model: net
    net {
      algorithm: ffsplit
      host: 1.2.3.4
      tls: on
    }
    votes: 1
  }
  provider: corosync_votequorum
}
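
Under the usual caveats (my sketch, not from the docs): edits to /etc/pve/corosync.conf need the config_version bumped, after which the result can be verified with either of:
Code:
# Check that the QD is present with exactly 1 vote under ffsplit:
pvecm status
corosync-quorumtool -s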

I really do not understand what the purpose of all these concerted efforts was, especially as all of it is completely undocumented.

[1] https://pve.proxmox.com/wiki/Cluster_Manager#_corosync_external_vote_support
[2] https://manpages.debian.org/testing/corosync-qdevice/corosync-qdevice.8.en.html
[3] https://github.com/proxmox/pve-clus...4c05a11b0f864f5b9dc/src/PVE/CLI/pvecm.pm#L136
[4] https://pve.proxmox.com/pve-docs/pvecm.1.html
 
Thanks @bjodah! I actually posted this to potentially solicit some reaction from staff. From discussions in my other threads I realised the QD was something they did not really pay attention to in terms of support. I left it here more as a reference for someone ... like you. :)

I noted that even Jan Friesse mentions in some earlier materials that ffsplit only makes sense for an even number of members in a cluster. I think he disregarded the potential network-split issue (which it solves for such clusters too). I can only guess this is why the PVE staff went with lms as the default for their "borderline supported" setup. Right, @fabian? :D
 
