Limit hosts on which a VM can run

cyruspy

Would it be possible to limit any given VM to a subset of nodes in a cluster?

Use case:
- I have a 4-node cluster, which runs Ceph as the storage layer
- I need to run VMs equivalent to 1 physical node with Windows Server
- I will license 2 nodes for Windows Server (node1 & node2)
- I want to make sure the VM never touches node3 & node4.
 
> - I have a 4-node cluster, which runs Ceph as the storage layer
> - I need to run VMs equivalent to 1 physical node with Windows Server
> - I will license 2 nodes for Windows Server (node1 & node2)
Maybe don't cluster Proxmox and only cluster Ceph? Then use two separate single-node Proxmox hosts, one for each Windows VM, and a separate Proxmox cluster (with a QDevice) for the other VMs.
You can use Proxmox Datacenter Manager to manage the three separate setups from one place.
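The QDevice part of that layout is quick to set up, for what it's worth. A rough sketch, assuming an external host reachable at 10.0.0.50 (a placeholder address):

    # on the external QDevice host
    apt install corosync-qnetd

    # on every node of the Proxmox cluster
    apt install corosync-qdevice

    # then, from one cluster node, register the QDevice
    pvecm qdevice setup 10.0.0.50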
 
> Maybe don't cluster Proxmox and only cluster Ceph? Then use two separate single-node Proxmox hosts, one for each Windows VM, and a separate Proxmox cluster (with a QDevice) for the other VMs.
> You can use Proxmox Datacenter Manager to manage the three separate setups from one place.
Sounds more complex than it needs to be...

I can get away with just limiting where the Windows Server VMs are instantiated; I'd hate to screw up a perfectly functional PVE cluster that's running other, non-Windows-Server workloads.
 
Manual operator mistake is the risk: this would require a mandatory VM-to-host affinity.
Maybe this? Add a newly named storage backend available only on nodes 1 & 2, then add a (dummy) disk from that storage to the VM. Migration to other nodes that don't have that storage backend will then fail.
 
I got around to testing this today. If a VM in a group is running and manually migrated to a server outside of the group, it will immediately migrate off that server, within a few seconds. (This can be tested with an empty VM; no disk or OS needed.)

I didn't try it while stopped this time, but I did before, and as I recall a stopped VM does stay on the "wrong" server, which means it's when HA sees the VM started that it moves it off.

And for Windows SPLA purposes, AFAIK if the VM is (stays) off no license is required for the month.
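For anyone wanting to reproduce the test: the behaviour above comes from a restricted HA group. A rough sketch, assuming node names node1/node2 and a test VM with ID 100 (all placeholders):

    # restricted group: HA only ever places members on the listed nodes
    ha-manager groupadd win-lic --nodes node1,node2 --restricted 1

    # put the (empty) test VM under HA control, pinned to that group
    ha-manager add vm:100 --group win-lic --state started

With --restricted set, a member VM found running on a node outside the group gets migrated back as soon as HA notices it.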
 
> Maybe this? Add a newly named storage backend available only on nodes 1 & 2, then add a (dummy) disk from that storage to the VM. Migration to other nodes that don't have that storage backend will then fail.
Could I maybe add another pool on the same Ceph backend?

This would solve the "must not touch an unlicensed node" requirement.
 
> I got around to testing this today. If a VM in a group is running and manually migrated to a server outside of the group, it will immediately migrate off that server, within a few seconds. (This can be tested with an empty VM; no disk or OS needed.)
>
> I didn't try it while stopped this time, but I did before, and as I recall a stopped VM does stay on the "wrong" server, which means it's when HA sees the VM started that it moves it off.
>
> And for Windows SPLA purposes, AFAIK if the VM is (stays) off no license is required for the month.
The issue is it must not be started on an unlicensed node.

But interesting nonetheless; it could cover a "should not" scenario without the "license disk" trick.
 
I don't use Ceph myself - but probably create a new Ceph pool with a specific name, let's call it vm-node-lock. Then edit /etc/pve/storage.cfg and, under the vm-node-lock entry, add nodes pve1,pve2, then save & exit. (I don't know the names of your nodes 1 & 2, but you get the idea.) This should limit that pool to those nodes; check in the GUI under the various nodes. Then add a disk of any insignificant size, stored on the vm-node-lock storage, to the VM. You should be good.

Good luck. Report back if successful.

EDIT: Node selection can probably be done in the GUI too.
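Spelled out, that would look roughly like this (pool, storage, and node names are all illustrative):

    # create the new Ceph pool and add it as a storage in one go
    pveceph pool create vm-node-lock --add_storages 1

    # /etc/pve/storage.cfg -- then restrict the resulting entry:
    rbd: vm-node-lock
            content images
            pool vm-node-lock
            nodes pve1,pve2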
 
@gfngfn256 That does work; I didn't realize Datacenter > Storage allows picking nodes like that. I was looking at the forest, I guess, and not the trees...

This can all be done in the GUI.

To recap:
  • create new RBD storage in the existing pool
    • limit to the desired nodes
  • create a VM with a 1 GB disk in the "node lock" pool
I don't know what will happen if you tell your HA Group to use only nodes 1,3,5 and your new pool is only on 1,3. Guessing HA will at some point try and fail to migrate to node 5.

Disk doesn't need to be formatted, it works with a VM with no OS.

This also works with ZFS, though it of course copies the 1 GB disk to the other node.

Edit: didn't try an "unreferenced" disk
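For reference, the same recap from the CLI; storage name, pool name, node names, and the VMID are placeholders, and --pool points at the already-existing Ceph pool so nothing new is created on the Ceph side:

    # RBD storage reusing the existing pool, visible only on the licensed nodes
    pvesm add rbd vm-node-lock --pool existing-pool --content images --nodes node1,node2

    # attach a small "lock" disk from that storage (the 1 allocates a new 1 GiB volume)
    qm set 100 --scsi1 vm-node-lock:1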
 
If an administrator can manually move it outside an HA pool, they can also manually ‘fix’ your hardware assignment. Most licensing models do not care about which two hosts, just that there are no more than 2. E.g. if a node dies, or even if the licensed parts get a warranty replacement, and you have hardware-locked the license (those are OEM licenses in Microsoft parlance, and not entirely kosher for individual purchase), you would need to buy a “new one” rather than just transferring it.
 
Microsoft SPLA licensing unfortunately is 'on any core on which it might possibly run', so one is supposed to license all cores in a cluster. Or, "just buy Datacenter", which has unlimited VM counts.

So yes, in case of a multiple-server outage we could (would have to) fix it or unlock it or whatever to get it online, and then pay for the additional cores that month.

Agree, OEM licensing isn't valid for this purpose (cluster hosting).
 
Be careful: a hack solution to limit cluster nodes may also not be valid for that licensing model. That is exactly what got some providers in trouble; you need to license any system in your datacenter/cluster that it could possibly run on, even if that host is idle. The limit is not drawn through some self-defined software loophole; the limit is defined as the physical systems your company is deploying VMs to and what Microsoft may reasonably assume it could run on. The intention is that you can flexibly license and resell the license; I don’t know if you can use it internally if you’re not reselling.
 
That's basically what I had understood. The end of that one sentence is the relevant part to this thread I think: "License the hardware per processor for the host fabric in which an instance can run."
 
For Azure they define fabric as "a platform that not only hosts distributed applications, but also helps manage their lifecycle independently of the hosting VM". There is a legal definition you can find that gets pretty close to that description as well.

If the third node participates in any way in the lifecycle (which I would suggest is as simple as participating in the quorum for the other nodes), then you've got to license it.
 
> That's basically what I had understood. The end of that one sentence is the relevant part to this thread I think: "License the hardware per processor for the host fabric in which an instance can run."
Yup, any host that the VM touches should be properly licensed. I haven't seen a contract which states I must pay for processors on nodes I'm not using (that's with regular enterprise contracts; I have no experience as a service provider).

I could always move a VM to any physical server lying around.
 
https://blogs.vmware.com/cloud/2020/09/29/microsoft-spla-licensing-release-update/

http://cc.cnetcontent.com/vcs/hp/in...EAAE25A089367509064BDFDAF253_source.pdf#page2

Clearly says it’s for the cluster or “host fabric”, which includes any system participating in providing network and storage; depending on the licensing, you need to license sufficient for both the number of hosts and potentially cores.
Would love to see a similar document mentioning newer versions. Licensing rules change from time to time.