Hi,
I run a four-node Proxmox cluster as follows:
Code:
Node 1
    VM: router
    UPS client
    VM: jump host
    UPS server
Node 2
    VM: fileserver1
    VM: fileserver2
    UPS client
Node 3
    Staging/development node - some test VMs generally not running
    UPS client
Node 4
    Redundant hardware - occasionally used for testing purposes - usually offline due to significant power draw
    UPS client
This isn't a traditional high-availability setup. I don't need my VMs to be able to run anywhere or fail over automatically, but I can, say, move my router to Node 2 if I need to take Node 1 down for maintenance. (Automatic failover wouldn't even be helpful with the router, as I can't run two sets of cables to the uplink modems.) Proxmox's backup, migration and replication features are also useful to me.
All of these machines are powered by the UPS, which runs at about 40% load, and provides an adequate window for shutdown in the event of power loss.
The problem is what happens after that. Node 1 is of course the last one standing, and any or all of the other machines and VMs may go down before it. However, Node 1 cannot restart the router until quorum is established. The router is also the DHCP server, so the network is pretty much non-functional until it comes back.
My questions therefore are:
- Is there any way to override the need for quorum before starting a VM? It is extremely unlikely that this VM will be present elsewhere on the network, as I move it around manually using Proxmox's migration facilities as required.
- In theory, could I give Node 1 more than half the votes in the quorum? Even if I could, the other nodes would then be unable to start VMs without Node 1 being up, which again is not something I necessarily want, particularly as I could be left in a situation where Node 1 is down for maintenance and I need to restart the router or Node 2 (where the router would normally be while I am maintaining Node 1).
- Is it possible to configure the cluster so that any one server can form quorum on its own? Presumably this would make split-brain situations highly likely, but as I understand it these would be resolved by randomly picking one of the nodes as authoritative. What sort of problems might I encounter in practice running a configuration like this?
- Is there any other solution that would let me bring the network back online without the need for quorum? Is there some better configuration for me than a cluster if I am genuinely not fussed about automatic HA, and really just want to be able to move VMs from server to server from time to time?
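To make the vote arithmetic behind the second and third questions concrete, here is a minimal Python sketch. It assumes the usual majority rule (quorum = floor(total votes / 2) + 1). The node names, the vote weights and the helper functions are hypothetical illustrations, not Proxmox APIs or my actual configuration.

```python
def quorum_threshold(votes):
    """Votes needed for quorum under a simple majority rule (assumption)."""
    return sum(votes.values()) // 2 + 1


def is_quorate(votes, online):
    """True if the nodes in `online` together hold enough votes for quorum."""
    return sum(votes[node] for node in online) >= quorum_threshold(votes)


# Hypothetical layout 1: every node has one vote.
equal = {"node1": 1, "node2": 1, "node3": 1, "node4": 1}

# Hypothetical layout 2: node1 holds more than half of all votes.
weighted = {"node1": 4, "node2": 1, "node3": 1, "node4": 1}

print(quorum_threshold(equal))                            # 3
print(is_quorate(equal, {"node1"}))                       # False
print(quorum_threshold(weighted))                         # 4
print(is_quorate(weighted, {"node1"}))                    # True
print(is_quorate(weighted, {"node2", "node3", "node4"}))  # False
```

With equal votes, Node 1 alone is never quorate, which is the situation I am in after a power loss. With the weighted layout Node 1 is quorate on its own, but the other three nodes together are not, which is the drawback described in the second question.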