In my home lab I run three NUCs as a 24x7 HCI cluster, and on demand I add workstations (which normally run other operating systems or are simply turned off) as extra compute nodes for experimentation, typically for VMs with GPU pass-through for CUDA.
Been doing that for years with RHV/oVirt, tried it a bit with XCP-ng (not so promising) and am now trying the same with Proxmox.
With oVirt I ran across an issue that I'd rather not repeat with Proxmox, so before I go further I'd like some feedback:
The main difference between the permanent nodes and the on-demand ones is that only the permanent nodes contribute to the [shared] storage; the ad-hoc compute nodes are meant to stay pretty much stateless, so they can be turned off (or used for something else). Since these on-demand nodes contribute no Gluster bricks or Ceph OSDs, their running state should not have any impact on storage consistency or state.
If you have six nodes in total and only three contribute HCI storage, the three on-demand nodes should not be involved in quorum counting... unless you deliberately involve them as tie-breakers.
With RHV/oVirt that didn't always work as expected: I've seen quorum-loss reports when nodes that contributed no bricks to a given Gluster volume were shut down, even if they only held bricks for other volumes. But at least it worked when they contributed no Gluster bricks at all.
With Proxmox, what has me concerned is that each node gets a "vote" in corosync, on which the Proxmox cluster file system (pmxcfs) apparently depends.
Does that mean I'll get into trouble once I add more on-demand non-storage nodes than HCI nodes? Or will things already go south if my (currently) two ad-hoc nodes are shut down and I reboot one of my three-replica Ceph nodes, because then only 2 out of 5 nodes remain for corosync quorum (below the majority of 3), even though 2 out of 3 storage nodes still satisfy Ceph's monitor quorum?
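Just to sanity-check my own arithmetic, here is how I read the counting rules (a rough sketch assuming plain corosync votequorum with one vote per node, none of the two_node/last_man_standing options, and a simple three-monitor Ceph setup):

# corosync votequorum needs strictly more than half of the expected votes
def majority(total_votes: int) -> int:
    return total_votes // 2 + 1

expected_votes = 5      # 3 permanent NUCs + 2 ad-hoc workstations
online_votes = 2        # both workstations off, one NUC rebooting
print(online_votes >= majority(expected_votes))   # False -> corosync/pmxcfs quorum lost

ceph_monitors = 3       # one monitor per permanent NUC
monitors_up = 2
print(monitors_up >= majority(ceph_monitors))     # True  -> Ceph monitor quorum still holds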
And how much trouble would that be? Would I just be unable to create, launch or migrate VMs until corosync quorum is back?
I guess currently Proxmox treats all nodes the same at the corosync level, which is why you can manage the cluster just the same from every node, as long as you have a majority.
If you wanted to support a majority of on-demand nodes, you'd have to differentiate between votes from permanent nodes and on-demand nodes and treat the cluster file system as read-only on the on-demand nodes. They'd then have to proxy their own operations via one of the permanent nodes.
I guess I could add a little Proxmox VM (or container?) on each one of the permanent nodes to give them double votes for a similar effect...
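Or, if I'm reading the votequorum documentation right, I could skip the dummy VMs and just raise quorum_votes for the permanent nodes in corosync.conf, along the lines of this excerpt (names and addresses are placeholders, and I haven't actually tried this on my cluster):

nodelist {
  node {
    name: nuc1
    nodeid: 1
    quorum_votes: 2        # permanent HCI node, double weight
    ring0_addr: 192.168.1.11
  }
  node {
    name: ws1
    nodeid: 4
    quorum_votes: 1        # on-demand workstation, single vote
    ring0_addr: 192.168.1.21
  }
}

With 2 votes on each of the three NUCs and 1 on each workstation, the NUCs alone would hold 6 of 8 votes and stay quorate with both workstations off, although losing a NUC on top of that would still drop things below a majority.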
I quite like being able to boot both my CUDA workstations and my kid's gaming rigs (which are often my older CUDA workstations) off one of those super-fast Kingston DataTraveler USB sticks with a basic Proxmox install, and then have them run VMs from the HCI cluster for machine-learning workloads...
But I can see that running into trouble as I add the next one. And while dropping a node in RHV/oVirt is a simple GUI action (it can be re-inserted just as easily), in Proxmox that doesn't seem to be the case.
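From what I've read so far, node removal in Proxmox is a CLI affair and effectively one-way, roughly along these lines (hostnames are placeholders, and I haven't tried the re-join part myself):

# on one of the remaining cluster members:
pvecm delnode ws1

# getting the workstation back in later apparently means a fresh join
# (after cleaning it up or reinstalling), run on the workstation itself:
pvecm add nuc1

That looks a lot heavier than the GUI remove/re-add in oVirt, which is exactly why I'm asking before I go further.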