[SOLVED] High availability with 2 VMs

nett_hier
New Member
Mar 27, 2023
I want a highly available OPNsense setup in my Proxmox cluster. OPNsense has native HA features, but they require some manual tweaking to set up and I don't want to run a separate OPNsense VM for each node in my cluster.
So I had the idea of just setting up two VMs on separate nodes and have those configured with OPNsense's own HA, meaning if the primary one goes down the fallback automatically starts working and so on.
I now want to integrate this with Proxmox HA such that should a node with one of those VMs go down, that VM will be restarted on a different node, i.e. regular HA.
However, I don't want Proxmox to start the new instance on the same cluster node as where the other OPNsense VM is currently running for obvious reasons.
Is there any way to achieve this? I.e. configure HA for 2 separate VMs such that they avoid each starting on the same node?
 
https://pve.proxmox.com/wiki/High_Availability

Code:
Groups
The HA group configuration file /etc/pve/ha/groups.cfg is used to define groups of cluster
nodes. A resource can be restricted to run only on the members of such group. A group
configuration looks like this:

group: <group>
       nodes <node_list>
       <property> <value>


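As a concrete sketch of that syntax (node and group names here are made up, not from the docs), two restricted groups pinning each firewall VM to a disjoint set of nodes might look like:

```
# /etc/pve/ha/groups.cfg — hypothetical example, names invented
group: opnsense-a
       nodes pve1,pve2
       restricted 0

group: opnsense-b
       nodes pve3,pve4
       restricted 0
```

With `restricted 0`, a VM prefers its group's nodes but may still run elsewhere if the whole group is down, which is exactly the edge case discussed below.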

 
Could you maybe elaborate a bit on the solution you're proposing?
If I set up a single group containing the whole cluster and assign both VMs to it, I would assume that the VMs would still happily start on the same node.
If I set up two disjoint groups with 'restricted 0' and assign each one VM, it would kind of work until one of the groups is completely offline, in which case the VM would have to switch to the other group and would possibly start on a node where the other VM is currently running.

EDIT: Just found this line in the docs you linked: "If there are more nodes in the highest priority class, the services will get distributed to those nodes.". So does this mean that the first assumption I made is incorrect? Will VMs assigned to the same group generally prefer to spread?
 
You build a group of nodes 1+3 and a second group of nodes 2+3, then add each VM to one of the groups. You can do it in the GUI as well.
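For a 3-node cluster, that suggestion could be sketched like this (node and group names are hypothetical; priorities make node 3 the shared fallback):

```
# /etc/pve/ha/groups.cfg — hypothetical 3-node example
group: ha-opnsense-a
       nodes pve1:2,pve3:1

group: ha-opnsense-b
       nodes pve2:2,pve3:1
```

Each VM is then assigned to its group, e.g. `ha-manager add vm:100 --group ha-opnsense-a`. A VM only falls back to the lower-priority node 3 when its preferred node is down.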
 
How would that work with more than 3 nodes though? Wouldn't the issue I described in the second case with two disjoint groups occur?
 
What about node 1+3+5+7+... for OPNsense VM A and node 2+4+6+8+... for OPNsense VM B? Then a lot of nodes would need to fail at the same time.
 
Hm, thinking about it, the cluster quorum would fail before that issue could ever happen lol.
Since it's always an odd number of nodes, I'd set it up like @floh8 described, i.e. with (n-1)/2 nodes in each group and the odd one out shared between both groups with a low priority.
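For a 5-node cluster, that layout could be sketched as follows (again with invented node names): each group gets two dedicated nodes, and the fifth node is shared at low priority.

```
# /etc/pve/ha/groups.cfg — hypothetical 5-node example
group: ha-opnsense-a
       nodes pve1:2,pve2:2,pve5:1

group: ha-opnsense-b
       nodes pve3:2,pve4:2,pve5:1
```

Both VMs would only ever meet on pve5 if both groups' dedicated nodes failed simultaneously, by which point the cluster would have lost quorum (fewer than 3 of 5 nodes) anyway.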
 
Just keep in mind that double clustering without each one aware of what the other one is doing can lead to unpredictable results.

If node1/vm1 fails and starts transferring to node2/vm1, but app-level HA moves the resources to node3/vm2, will vm1 (which may still hold state saying it's primary) and vm2 (which has moved on transaction-wise) be able to duke it out? Maybe.
A lot in this HA scenario will depend on timing/timeouts.
I am not familiar with what OPNsense does for HA, but adding some sort of quorum on a 3rd node may be required for stable operations.


 
Thanks for the heads up.
I think OPNsense should be able to handle it, but I'll experiment with failure scenarios and see what happens.
 
Normally you don't need such a double HA solution. Application HA is already better and should be enough. A node failure is very rare, and when it really happens, as a good admin you have to make sure that node goes live again within a short time. I even think that in small environments it's enough to use the virtualization HA and save the expense of a second VM.
 
Eh, you're probably right, especially at the current cluster size. But since Proxmox HA seems simple enough to set up I'd at least like to try if it works cleanly, in case I ever scale to a point in the future where I need to be able to tolerate two node failures.
 
Normally you don't need such a double HA solution. Application HA is already better and should be enough. A node failure is very rare, and when it really happens, as a good admin you have to make sure that node goes live again within a short time. I even think that in small environments it's enough to use the virtualization HA and save the expense of a second VM.
Regarding your edit, by virtualization HA you mean Proxmox HA? The issue with that would be that in the event of a failover the in-memory state is lost, meaning all active connections get dropped, which I don't want.
OPNsense's built-in HA includes pfsync, meaning the firewall state table stays synchronized and a failover is seamless for established connections.
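OPNsense configures this through its GUI, but under the hood it is standard FreeBSD CARP plus pfsync. A hedged sketch of the raw FreeBSD equivalent (interface names, addresses, and the passphrase are invented examples, not what OPNsense actually generates):

```
# Hypothetical raw FreeBSD equivalent of an OPNsense HA setup.
# em0 = LAN, em2 = dedicated sync link; all values are examples.

# CARP shared virtual IP on the LAN (advskew 0 => this box is preferred primary)
ifconfig em0 vhid 1 advskew 0 pass examplesecret alias 192.168.1.1/24

# pfsync: replicate the pf state table over the dedicated sync link,
# so established connections survive a CARP failover
ifconfig pfsync0 syncdev em2 up
```

The backup firewall runs the same CARP config with a higher advskew, so it only takes over the virtual IP when the primary stops advertising.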
 
