[SOLVED] High Availability Across Different Hardware Configurations

LunarMagic

New Member
Mar 14, 2024
I'm interested in configuring high availability, but I couldn't find information about how it works with different hardware configurations. I have a really beefy main server, and the other servers that would make up the 3-node HA cluster aren't as powerful.

128 cores, 4 TB RAM
44 cores, 1 TB RAM
88 cores, 3 TB RAM

The 128-core machine has more storage than either of the other two individually, but the other two nodes together have more than it does. When VMs are moved to other nodes in the cluster, will they be spread out so that they can run optimally, or do all of the nodes in the cluster have to have basically the same specs and storage?

Thank you in advance!
 
Information is missing, for example what storage you intend to use to make HA possible.
With ZFS, every node that is part of the HA group needs to hold a copy of all the VMs you want to be highly available.
With Ceph, it will usually store 3 copies on 3 different nodes if possible, and you want lots of free space, so that when one node fails there is enough room on the two remaining nodes to compensate for the loss of 1/3 of your nodes: everything that was stored on the failed node then has to be stored on the remaining two.
With NFS/iSCSI you usually have a dedicated storage server with all your storage.
 
Last edited:
Hello, so it's ZFS for all of the pools. The

44 cores, 1 TB RAM
88 cores, 3 TB RAM

machines combined have more hard drive space than the 128-core / 4 TB RAM machine. So based on what you are saying, every machine must have enough space to hold an individual copy of every virtual machine? That would mean, at minimum, every machine in the cluster must have enough space to keep all the VMs on it. I just want to confirm, because it sounds like I need to purchase a lot more storage.
 
Basically you need to replicate the ZFS storage, but it's not going to be real HA because of the lack of shared storage, just a failover to the last replication point.
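
If you go the replication route, this is roughly what it looks like on the CLI. A minimal sketch, assuming a VM with ID 100 and a second node named "pve2"; the VM ID, node name and schedule are placeholders:

Code:
# Hypothetical example: replicate VM 100's disks to node "pve2"
# every 15 minutes ("100-0" is the replication job ID).
pvesr create-local-job 100-0 pve2 --schedule '*/15'

# Check the state and age of the last successful replication:
pvesr status

On an HA failover the VM restarts from the last replica, so with this schedule up to 15 minutes of writes can be lost.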
 
Oh interesting, so is "real HA" where I have one computer that has a whole bunch of drives and all of the other computers use it as a network drive?
 
Yes, it would be using a shared storage medium for proper HA, which is real time.
Your options are:
  • an AoE (ATA over Ethernet) disk (old tech now; any system could use it, but it's effectively Linux-only these days)
  • NBD (never used this, so I can't say how well it works)
  • a SAN with FC or iSCSI, with a cluster file system on top (multiple options for that; you can do iSCSI with NAS units, for example)
  • NVMe-oF
  • an NFS share from a big pool of storage, mounted on all nodes (note that the NFS server is a single point of failure)
  • a DRBD disk setup, a cool tech for RAID across machines
  • a Ceph disk setup, even better than DRBD, and it can use erasure coding so it doesn't need multiple full replicas like the ZFS setup stated above

With shared storage, an HA event recovers to the point of the last transaction written to disk, so while there will be some loss, it is tiny to none.

Without it, you are really just setting an automated RPO and RTO, as you're jumping back to the last replication recovery point.
 
Last edited:
I think an NFS share seems like it could work for me. So does the storage share have to be one of my nodes, or can it be something that's not a node at all? Like, could this be an NFS share from a Windows Server or a Synology?
 
It needs to be outside the nodes: a Synology / TrueNAS / Linux box that is not part of the cluster, etc.

You would then map it into the cluster in the storage tab and mark it as shared and accessible on all nodes. Then anything on that storage location can safely be used with HA, and VM migrations will be fast and online, since only RAM is moved, with no need to move storage.
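
For reference, the same thing can be done on the CLI. A minimal sketch, assuming a NAS at 192.168.1.50 exporting /volume1/pve; the storage name, IP and export path are placeholders:

Code:
# Hypothetical example: add an NFS export as cluster-wide storage for
# VM disks and container volumes. NFS storage in PVE is treated as
# shared automatically, so it is usable from every node.
pvesm add nfs nas-share --server 192.168.1.50 --export /volume1/pve --content images,rootdir

# Verify the storage is active on the node:
pvesm status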
 
Last edited:
So just to make sure I understand 100%, true high availability is something I can run like this:

I have all of my drives on a Windows server and give the nodes access to its storage via an NFS share.

I can then have my 3 Proxmox nodes empty besides their Proxmox install. That would mean I could leave everything else empty, since only the drives on the Windows server matter.

The way it would work is that the nodes would spread out the resources needed for the virtual machines, which would only be their CPU power and their RAM. Then if one of them goes down, the other nodes make up the difference and split the work the VMs need, and it doesn't require any downtime since they are all accessing the same storage.

So as long as at least 2 of the nodes can handle the work the virtual machines need in terms of CPU power and memory, the whole system can function properly.

I just want to make sure I understand this correctly. I run 100s of Linux virtual machines, so I want to be sure this is accurate.
 
Sure, you understand this correctly. But: can your Windows server handle such a disk load coming from all the virtual machines? And if the machine with the disks gets damaged, how will you deal with that?
 
I would have them in separate RAIDs, unless I can figure out getting ZFS on Windows Server. I'd have 30 total drives separated into 3 RAID sets of 10 drives each (or pools, if there were ZFS), with 2 drives in each as redundancy. That's at least my idea; does that seem practical?
 
So just to make sure I understand 100%, true high availability is something I can run like this:

I just want to make sure I understand this correctly. I run 100s of Linux virtual machines, so I want to be sure this is accurate.

Sorry, you gotta be joking.... NFS server from a Windows host? Just why? There are dozens of better options for an NFS server.

Besides that, the shared storage server becomes a single point of failure: you lose your NFS server, you lose all VMs on all PVE hosts (and on Windows, count on some downtime every month to install updates and reboot). I don't even want to ask about the network infrastructure: that must be redundant too.

DRBD hasn't been supported on Proxmox since v6, IIRC.

If you really have a production environment with hundreds of VMs and really require HA, recheck storage replication [1] and Ceph [2]. I suggest that you create a test cluster using virtual machines and get a glimpse of how things work; see the sketch below. Also, you may find it helpful to seek support and consultancy services from a Proxmox Partner.

EDIT: forgot the links
[1] https://pve.proxmox.com/wiki/Storage_Replication
[2] https://pve.proxmox.com/pve-docs/chapter-pveceph.html
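
Building that test cluster takes only a few commands. A minimal sketch, assuming three test VMs with the first one reachable at 192.168.1.101; the cluster name and IP are placeholders:

Code:
# Hypothetical example. On the first test VM, create the cluster:
pvecm create testcluster

# On each additional test VM, join via the first node's IP:
pvecm add 192.168.1.101

# Verify membership and quorum from any node:
pvecm status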
 
Last edited:
So how would I be able to have HA without the virtual machine having to go down and come back up? Is it basically one or the other, where it's a single point of failure for maximum uptime vs. more redundancy, but VMs would have to go down and then come back up?
 
So how would I be able to have HA without the virtual machine having to go down and come back up?
The term "HA" is not always used with the same meaning, the definition is not always clear.
  • as long as all your nodes are up and running this does work. It is called Live-Migration and this has nothing to do with HA
  • if one node goes into maintenance mode and/or is cleanly shutdown this will work the same way; the processes in all VMs are not interrupted but frozen on one machine, transported and then unfrozen on the other node
  • if a node fails hard, e.g. by power fail or by someone pulling out all cables all VMs will vanish in the same millisecond. They cannot get live-migrated anymore because there is no source to migrate from - they are dead, Jim!
    In this case the VMs tagged "HA" are started on another node automatically after one or two minutes...
All migration requires storage to be present on the target node. The classic way is to use shared storage. Another, simpler approach is to use ZFS-replication. This has drawbacks: data modified since the last replication point in time is lost.
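
Tagging a VM as "HA" is done with the ha-manager. A minimal sketch, assuming a VM with ID 100; the VM ID, group name and node names are placeholders (recent PVE releases are replacing HA groups with affinity rules, so check the docs for your version):

Code:
# Hypothetical example: put VM 100 under HA management so it is
# restarted on a surviving node after a hard node failure.
ha-manager add vm:100 --state started

# Optionally prefer certain nodes (higher number = higher priority):
ha-manager groupadd prefer-big --nodes "pve1:2,pve2:1,pve3:1"
ha-manager set vm:100 --group prefer-big

# Watch the HA stack's view of resources and nodes:
ha-manager status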

Best regards
 
Thank you to you both. I guess what I'm trying to do is achieve as much uptime as possible, so that if a node goes down the virtual machines don't go down. It looks like that whole idea comes with cons, since, like you all said, it's a single point of failure with everything stored on the same machine if it's an NTS store.

Since that's the case, based on what you said UdoB, it's better to have Live Migration with a node instead, where all of the VMs are stored on the cluster, and if one node goes down the other ones are able to boot from the time the 3 nodes last synced. It may take a minute or 2 for the VMs to come back up, but it protects the data much better than a single point of failure.

Am I understanding it correctly? Also, thank you to everyone in the thread; this has helped me learn a lot.
 
Thank you to you both. I guess what I'm trying to do is achieve as much uptime as possible, so that if a node goes down the virtual machines don't go down.
PVE does not have a functional equivalent of VMware Fault Tolerance today.
If the PVE node goes down, the VM will be hard reset/restarted, because there is no way to sync/recover the VM's memory state.

single point of failure with everything stored on the same machine if it's an NTS store.
You mean NFS store. If one is looking for truly no single point of failure, then you will have a PVE cluster, each node containing multiple NICs, connected to redundant switches, connected to redundant ports on the NFS server, where the NFS server is an HA cluster itself.
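
On the PVE side, NIC redundancy is usually a Linux bond under the VM bridge. A minimal sketch of /etc/network/interfaces, assuming two uplinks eno1/eno2 going to two different switches; the interface names and addresses are placeholders:

Code:
# Hypothetical fragment: active-backup bond, so either switch can die.
auto bond0
iface bond0 inet manual
    bond-slaves eno1 eno2
    bond-miimon 100
    bond-mode active-backup

# VM bridge on top of the bond:
auto vmbr0
iface vmbr0 inet static
    address 192.168.1.10/24
    gateway 192.168.1.1
    bridge-ports bond0
    bridge-stp off
    bridge-fd 0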
Live Migration with a node instead, where all of the VMs are stored on the cluster, and if one node goes down the other ones are able to boot from the time the 3 nodes last synced. It may take a minute or 2 for the VMs to come back up, but it protects the data much better than a single point of failure.
This description is a little hard to parse. But I think what you described here is HA, not Live Migration. They are two different things.


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
so that if a node goes down the virtual machines don't go down.
This is just not possible. One specific VM runs on one specific node. If that node is killed hard, then that VM (and every other process on it) is gone.

(This is true for all "normal" virtualization techniques we are discussing here: running on very normal and boring hardware with Linux as the base OS. Perhaps you can buy a solution with a six-digit price tag...)

Live migration is what you use for normal operation. HA kicks in when a node dies hard.

Best regards
 
Ohh, thank you, that makes a lot of sense. I think I was confused about what was said earlier. So basically what I should do is make sure that between 2 of the nodes there is enough storage and processing power, so that if the 3rd node fails, they can share the job of running its VMs when they boot back up. Is this right?
 
@LunarMagic, so in your environment, forget the Windows box. Get 3 SAS cards, put them in your servers, and just use Ceph with erasure coding; the automatic self-healing will keep you going. That will allow you to survive a single node failure without storage being a single point of failure. You will still need redundant switching and network bonds to be totally safe, and you would also need a UPS and 3-phase power to do it right, but if you don't have 3-phase, don't worry. Something like the sketch below would get you started.
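
A minimal sketch of that Ceph setup, assuming a recent PVE version (erasure-coded pools need PVE 7.2 or later); the subnet and device names are placeholders:

Code:
# Hypothetical example, run on each of the 3 nodes unless noted.
pveceph install                        # install the Ceph packages
pveceph init --network 10.10.10.0/24   # once, on a dedicated Ceph network
pveceph mon create                     # one monitor per node
pveceph osd create /dev/sdb            # one OSD per data disk

# k=2,m=1 is the smallest EC profile that fits 3 hosts: it survives
# the loss of any one node while storing ~1.5x the data instead of
# the 3x that 3-way replication needs.
pveceph pool create ecpool --erasure-coding k=2,m=1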
 
Last edited:
