HA light, or HA for the poor

TErxleben

Renowned Member
Oct 20, 2008
I have networks in which I'd like to make individual VMs (e.g., pihole/ssh/guacamole) highly available. Each network contains several PVE hosts. Does anyone have an idea, or has anyone already done this: how can I accomplish it most elegantly without tying myself to a PVE cluster? This matters especially because some of my networks have only two PVE hosts. So far, I've been pulling backup files daily from a central storage location to every possible PVE target host. In the event of a failure, I manually restore a backup and start the VM. I'd like to automate this last step, without the overhead that a cluster causes.
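To make the "automate this last step" idea concrete, here is a minimal sketch of what a watchdog on the standby host could do: count consecutive failed health checks and, past a threshold, run `qmrestore` followed by `qm start`. The function names, the threshold, the VMID, the storage name and the archive path are all hypothetical; only the `qmrestore` and `qm start` commands themselves come from PVE. It is a sketch under those assumptions, not a tested failover tool.

```python
import subprocess
from typing import Callable, List


def restore_commands(archive: str, vmid: int, storage: str) -> List[List[str]]:
    """Build the two PVE CLI calls: restore the archive, then start the VM."""
    return [
        ["qmrestore", archive, str(vmid), "--storage", storage],
        ["qm", "start", str(vmid)],
    ]


def is_reachable(ip: str) -> bool:
    """Send one ICMP echo with a 2 s timeout (Linux ping flags)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def consecutive_failures(history: List[bool]) -> int:
    """Count failed checks at the end of the check history."""
    count = 0
    for ok in reversed(history):
        if ok:
            break
        count += 1
    return count


def failover_if_needed(
    history: List[bool],
    threshold: int,
    archive: str,
    vmid: int,
    storage: str,
    run: Callable = subprocess.run,
) -> bool:
    """Restore and start the VM locally once enough checks failed in a row."""
    if consecutive_failures(history) < threshold:
        return False
    for cmd in restore_commands(archive, vmid, storage):
        run(cmd, check=True)  # raises CalledProcessError if a step fails
    return True
```

A cron job or systemd timer could append the result of `is_reachable()` to the history and call `failover_if_needed()` on each run. One caveat: with only two hosts, a network split can leave both sides believing the other is dead, so both might start the VM. The sketch does nothing to prevent that.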
 
Your approach sounds reasonable and resilient. Every "central" component you add also adds another point of failure (PoF). You could "tweak" your CMO with some (!) decentralized shared storage (again: another PoF), but then it becomes a game of probabilities. I really like your CMO.
 
make individual VMs (e.g., pihole/ssh/guacamole) highly available.
Ignore PVE and examine every service individually.

I have three piholes. One of them is the hot one, reachable via a "floating IP address". If it dies, another pihole takes over that IP address. No client will notice that it is now talking to a different instance. I use "keepalived" for this.

(( No, I won't publish my specific solution, as it is below my own quality standards... although it works... for me. But there are (better!) descriptions out there on the wild Internet. ))
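For readers who have not seen keepalived before, the following is a generic sketch, explicitly not the setup described above. It renders a minimal `vrrp_instance` block for each node. The instance name `PIHOLE`, the interface `eth0`, the router ID, the priorities and the address `192.168.1.53/24` are placeholders chosen for illustration. The node with the highest priority holds the floating IP; the others take over when its advertisements stop.

```python
def render_keepalived(
    interface: str,
    router_id: int,
    priority: int,
    virtual_ip: str,
    primary: bool,
) -> str:
    """Render a minimal keepalived vrrp_instance block for one node."""
    if not 1 <= router_id <= 255:
        raise ValueError("virtual_router_id must be between 1 and 255")
    # Conservative range: 255 is reserved in VRRP for the address owner.
    if not 1 <= priority <= 254:
        raise ValueError("priority must be between 1 and 254")
    state = "MASTER" if primary else "BACKUP"
    lines = [
        "vrrp_instance PIHOLE {",
        f"    state {state}",
        f"    interface {interface}",
        f"    virtual_router_id {router_id}",  # must match on all nodes
        f"    priority {priority}",  # highest priority wins the election
        "    advert_int 1",
        "    virtual_ipaddress {",
        f"        {virtual_ip}",
        "    }",
        "}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Three nodes sharing one floating IP; only the priorities differ.
    for node_priority in (150, 100, 50):
        print(render_keepalived("eth0", 51, node_priority,
                                "192.168.1.53/24", node_priority == 150))
```

The output of each call would go into `/etc/keepalived/keepalived.conf` on the respective node. Real setups usually add a health check so that the IP also moves when the DNS service dies while the machine itself stays up.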

The same goes for Guacamole: you have one Guacamole instance? Why not duplicate it manually, use the primary one, and switch to the secondary if the first one is down?

Not sure what "ssh" means for you. If it is an entry point into your LAN (like a jump host), then the same logic applies: why not run an SSH server on multiple independent machines? Without a floating IP if you (your users?) are clever enough to switch to a secondary instance when the primary fails, or with a floating IP if that might be a problem.
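The variant without a floating IP can also be scripted on the client side. The sketch below tries a list of jump hosts in order and returns the first one that accepts a TCP connection on its SSH port. The function name and the host names in the usage comment are hypothetical, and the check only confirms that the port is open, not that a login would succeed.

```python
import socket
from typing import Iterable, Optional, Tuple


def first_reachable(
    hosts: Iterable[Tuple[str, int]],
    timeout: float = 2.0,
) -> Optional[Tuple[str, int]]:
    """Return the first (host, port) that accepts a TCP connection."""
    for host, port in hosts:
        try:
            # Connect and immediately close; we only test reachability.
            with socket.create_connection((host, port), timeout=timeout):
                return (host, port)
        except OSError:
            continue  # refused, timed out or unresolvable: try the next one
    return None


# Usage (hypothetical host names):
# target = first_reachable([("jump1.example.lan", 22), ("jump2.example.lan", 22)])
```

The order of the list encodes the preference: the primary goes first, and the secondaries are tried only when it does not answer.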

Of course, a generic solution like PVE+Cluster+HA is easier to maintain in the long run :-)
 
Dear Udo,
That's what I'm currently working on: providing small but important VMs redundantly on existing PVE hosts, so that they remain available even if the Proxmox cluster functionality fails. It starts very rudimentarily, with ensuring external SSH access via failover, for example by maintaining a minimal SSH VM on several PVE hosts. At the moment I'm solving this with a Raspberry Pi that provides this access, but if that thing crashes, SSH access is lost. At the same time, I have at least two PVE hosts with which I could create minimal redundancy. But if the cluster doesn't work, SSH won't work either. In short, I want to make an SSH VM available in parallel on all hosts without having to rely on a functioning PVE cluster.
 
The only central point is the data hub, where important VMs are backed up and retrieved. If this fails (PoF), a timely response is possible, as the relevant VMs have long since been copied to the corresponding hosts.
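Since the copies already sit on each host, a restore script first has to pick the newest archive per VM. A minimal sketch, assuming the default vzdump file naming for QEMU guests (`vzdump-qemu-<vmid>-<YYYY_MM_DD-HH_MM_SS>.vma` plus an optional compression suffix); the function name is hypothetical:

```python
import re
from typing import Dict, Iterable, Tuple

# Matches e.g. vzdump-qemu-105-2024_01_31-02_00_04.vma.zst
BACKUP_NAME = re.compile(
    r"^vzdump-qemu-(\d+)-(\d{4}_\d{2}_\d{2}-\d{2}_\d{2}_\d{2})\.vma(?:\.\w+)?$"
)


def latest_backups(filenames: Iterable[str]) -> Dict[int, str]:
    """Map each VMID to the file name of its most recent backup archive."""
    newest: Dict[int, Tuple[str, str]] = {}
    for name in filenames:
        match = BACKUP_NAME.match(name)
        if match is None:
            continue  # skip logs, notes and unrelated files
        vmid, stamp = int(match.group(1)), match.group(2)
        # The timestamp format sorts chronologically as a plain string.
        if vmid not in newest or stamp > newest[vmid][0]:
            newest[vmid] = (stamp, name)
    return {vmid: name for vmid, (_, name) in newest.items()}
```

Fed with the result of `os.listdir()` on the local dump directory, this yields the archive to restore for each VMID, independent of whether the data hub is still reachable.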