3-node Proxmox/Ceph cluster - how to automatically distribute VMs among nodes?

victorhooi

Apr 3, 2018
We have set up a 3-node Proxmox cluster, with Ceph storage on the same nodes.

The plan is to use this as an HA cluster.

My question is around distributing VMs among the three nodes.

What is the correct way of setting this up in Proxmox, such that new VMs spun up on any node are automatically set as HA? And also that VMs are redistributed among the nodes as appropriate based on load?

Also - if all our VMs are being spun up from templates, what is the correct way of distributing these? Is it fine to simply copy the same templates to all three nodes, using local (LVM) storage? Or should we be looking at something like CephFS?
 
What is the correct way of setting this up in Proxmox, such that new VMs spun up on any node are automatically set as HA? And also that VMs are redistributed among the nodes as appropriate based on load?

No. I've been waiting for this feature for about 2 years now :)

Also - if all our VMs are being spun up from templates, what is the correct way of distributing these? Is it fine to simply copy the same templates to all three nodes, using local (LVM) storage? Or should we be looking at something like CephFS?

I use shared NFS storage for templates and ISOs.
 
Wait - are you saying that users need to manually distribute our VMs across the nodes?

As in, they need to check each cluster member individually, look at the system statistics, and work out which one to spin up a new VM on? That seems...odd?

and got it - shared NFS works. As in, are you saying a separate NFS server, or do you mean somehow re-shared from the same Proxmox host?
 
No, HA on Proxmox only does "VM ping > VM dead > move to the next available node based on HA priority"; it does not place VMs based on load... it'd be very neat tho!

(You could actually do that with crontab and the PVE shell, but it's a bit... hackish.)
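
For illustration, the cron-driven hack mentioned above could look roughly like the sketch below. It is only an assumption of how such a script might be shaped: the `pick_migration` helper, the input dictionaries, and the 0.3 gap threshold are hypothetical. The one real command it leans on is `qm migrate <vmid> <target> --online`, and the sketch only builds that command instead of running it.

```python
from typing import Dict, List, Optional


def pick_migration(node_cpu: Dict[str, float],
                   vms_by_node: Dict[str, List[int]],
                   min_gap: float = 0.3) -> Optional[List[str]]:
    """Build a `qm migrate` command moving one VM from the busiest node
    to the quietest node, or return None if nothing should move.

    node_cpu maps node name -> CPU utilisation between 0.0 and 1.0.
    This input shape is hypothetical; gather the numbers however you like.
    """
    if len(node_cpu) < 2:
        return None
    busiest = max(node_cpu, key=node_cpu.get)
    quietest = min(node_cpu, key=node_cpu.get)
    # Only act on a large difference, so VMs do not bounce back and forth.
    if node_cpu[busiest] - node_cpu[quietest] < min_gap:
        return None
    candidates = vms_by_node.get(busiest, [])
    if not candidates:
        return None
    vmid = candidates[0]  # naive choice: first VM listed on the busy node
    return ["qm", "migrate", str(vmid), quietest, "--online"]
```

A cron job would call this every few minutes and hand the result to something like `subprocess.run` on the node that currently hosts the VM. Picking the first VM is deliberately naive; a real script would probably weigh VM size and recent migrations.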
 
Wait - are you saying that users need to manually distribute our VMs across the nodes?

As in, they need to check each cluster member individually, look at the system statistics, and work out which one to spin up a new VM on?

Exactly.

That seems...odd?

It is ...

and got it - shared NFS works. As in, are you saying a separate NFS server, or do you mean somehow re-shared from the same Proxmox host?

I use Qnap or Synology devices for this.
 
This is what VMware DRS does. I use the Proxmox API to kind of do this.
All the VM records are kept in an external system; when it adds a new VM, it works out the best host to use.
Every so often it checks where the VMs are, in case someone migrated one manually or a failure occurred.
I can also check and move some to try to balance load, but I'm scared of that, so I run that process manually for now.
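
The "work out the best host" step could be sketched as below. This is not the poster's actual system: the `best_host` name, the input dictionary shape, and the choice of free memory as the only criterion are all assumptions made for the example.

```python
from typing import Dict


def best_host(nodes: Dict[str, Dict[str, float]], vm_mem_gib: float) -> str:
    """Pick the node that would have the most free memory after placing the VM.

    nodes maps node name -> {"mem_total_gib": ..., "mem_used_gib": ...}.
    These keys are hypothetical, not a Proxmox API format.
    """
    free_after = {
        name: stats["mem_total_gib"] - stats["mem_used_gib"] - vm_mem_gib
        for name, stats in nodes.items()
    }
    # Drop nodes where the VM would not fit at all.
    fitting = {name: free for name, free in free_after.items() if free >= 0}
    if not fitting:
        raise ValueError("no node has enough free memory for this VM")
    return max(fitting, key=fitting.get)
```

Memory is used here because it is usually the hard limit when placing a VM; CPU load or VM count per node could be folded in the same way.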
 
Hi

You will need to add shared storage using something like NFS or iSCSI plus shared networking. We use Open vSwitch to distribute the networking so any VM can run on any Node.

I wrote a DRS package which monitors the load on each Node and then migrates VMs off the busy node onto the quieter nodes.

You can check the DRS package at https://gitlab.com/tokalanz

Hope this helps.
 
Hi tokala

Just had a look at your mini projects; they look pretty interesting.
Are you officially contributing to Proxmox, or are these just independent projects?

Cheers
G
 
Hi

I have been working on the project for a while as an independent project. I haven't thought about contributing to Proxmox until you asked.
 
@tokala Thanks for your project. I'm currently looking to implement this in Proxmox (but in Perl, as Proxmox is Perl code).

I had a quick look at your code. Do you handle nodes with different kinds of CPU (frequencies or number of cores)?
I wonder how to manage that. (Maybe use BogoMIPS as a reference?)
 
Hi

I run my DRS application off the Proxmox nodes, so if the node controlling DRS is down it still continues to work. I use a separate management VM, which does a few things including running DRS. I am also working on a containerised version to make it easier to run.

The nodes I use are all blades, so the CPUs are the same.
You could use the load average as a base, which would take into account CPU/memory etc., but I would suggest using a wider band for deciding whether a node is working hard or slacking off.

/nodes/<node>/status/loadavg

or you could use cpuinfo to work out a factor for the CPU of each node

/nodes/<node>/status/cpuinfo
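
One way to combine the two suggestions is to divide the load average by the CPU count and classify nodes with a wide band. The sketch below assumes the node status response carries a `loadavg` list and a `cpuinfo` object with a `cpus` count; the function names and the 0.8 / 0.4 thresholds are made up for the example.

```python
from typing import Any, Dict


def load_per_cpu(status: Dict[str, Any]) -> float:
    """1-minute load average divided by the number of logical CPUs."""
    one_minute = float(status["loadavg"][0])  # values may arrive as strings
    cpus = int(status["cpuinfo"]["cpus"])
    return one_minute / cpus


def classify(status: Dict[str, Any],
             busy_above: float = 0.8,
             idle_below: float = 0.4) -> str:
    """Label a node 'busy', 'idle' or 'ok'.

    The gap between the two thresholds is the wide band: nodes inside it
    are left alone, which avoids constant back-and-forth migrations.
    """
    load = load_per_cpu(status)
    if load > busy_above:
        return "busy"
    if load < idle_below:
        return "idle"
    return "ok"
```

Dividing by the CPU count makes nodes with different core counts comparable. It does not account for differences in clock speed; a per-node weighting factor would be needed for that.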
 
@tokala Do you think this could be integrated upstream to Proxmox?

Also - you mentioned above about "shared networking" - what does that mean?

We use Open vSwitch to distribute the networking so any VM can run on any Node.
 
@victorhooi What do you mean by integrated upstream into Proxmox?

We use Open vSwitch because we have many VLANs the VMs could be connected to. All VLANs are trunked to Proxmox, and all nodes are connected together using Open vSwitch as a virtual distributed switch.
All VMs use vmbr0, and the VLAN is then specified in the Proxmox VM configuration. When a VM is migrated to another node, the networking does not need to change, because the VLAN is already present on the destination node.
 
This is great!

Looks like I got it working on my cluster, thanks for sharing this!
 
I'm also working on a DRS feature, currently in beta. I'll open-source it soon. (Full Perl, using the Proxmox libraries, with some advanced algorithms.)


Interesting! Let me know if you want someone to test.
I have a cluster with 1.5 TB of RAM and 48 CPU cores running only non-production workloads.
 
@spirit - Any word on open-sourcing your DRS extension for Proxmox? It sounds exciting, would love to see it, even if it's in the early stages.
 
