Suggestions for a low-cost HA production setup in a small company

Skimming through that page, I am surprised there is no example using Linux kernel bridges for STP meshing.
There is an RSTP setup [1].

Running a meshing or routing daemon just adds another point of failure.
Maybe, but it does allow using both links simultaneously, whereas with RSTP only one link is in use and the other is only a fallback.

Either way, that page requires an additional high-speed NIC on each node for the connections to the neighboring nodes.
Which you should have anyway, connected to two switches with MLAG/stacking to avoid the network being an SPOF. But yes, you would need 4 NICs per host: two for the mesh + two for the "LAN".

[1] https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server#RSTP_Loop_Setup
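For what it's worth, a minimal sketch of how a kernel-bridge mesh port could look with ifupdown2. The NIC names and the address are placeholders, and note that a plain kernel bridge only offers classic STP; RSTP needs something like Open vSwitch or a userspace daemon such as mstpd:

Code:
# hypothetical /etc/network/interfaces fragment on one mesh node
auto vmbr1
iface vmbr1 inet static
    address 10.15.15.1/24
    bridge-ports enp1s0f0 enp1s0f1   # the two direct links to the neighbor nodes
    bridge-stp on                    # classic kernel STP breaks the loop
    bridge-fd 4                      # forward delay (802.1D minimum is 4 s)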
 
In your case, the best solution to cut costs while keeping the system easy to manage is solution B:

2 PVE nodes with ZFS replication + QDevice

Notes:
  • There are many tutorials for this setup on the web.
  • You can run the corosync-qdevice on your PBS (example commands after this list).
  • Stay away from Ceph if you cannot afford to run 4 nodes.
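For reference, a rough sketch of those commands, assuming the external vote provider is a Debian-based host reachable at the placeholder address 192.0.2.10:

Code:
# on the external host providing the third vote
apt install corosync-qnetd

# on both PVE nodes
apt install corosync-qdevice

# on one PVE node: register the QDevice, then check the votes
pvecm qdevice setup 192.0.2.10
pvecm status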
 
Given the limited budget, Option B (2 nodes + QDevice with ZFS replication) is probably your low-cost choice. But you need to design the QDevice connectivity to the two nodes so that no single failed network device can stop the two nodes and the QDevice from communicating with each other!
If you want higher availability, you may consider Option C (3 nodes with Ceph), which provides better VM data consistency than Option B, because you don't have to tolerate the data gap that ZFS replication leaves between replication intervals.
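To illustrate the connectivity point: one part of it is giving corosync itself two links over separate switches, so that a single failed switch does not break node-to-node communication. A hypothetical /etc/pve/corosync.conf fragment (names and addresses are made up, and the path to the QNetd host still needs its own redundancy):

Code:
nodelist {
  node {
    name: pve1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.10.11   # link 0 via switch A
    ring1_addr: 192.168.20.11   # link 1 via switch B
  }
  node {
    name: pve2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 192.168.10.12
    ring1_addr: 192.168.20.12
  }
}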
 
  • You can run the corosync-qdevice on your PBS

But then the backup server can be reached via SSH (without entering a password) from the cluster nodes. It's never a good thing that your backup can be accessed without authentication from the hosts you want to back up.

There is however a good way to work around this, described by @aaron here:

2x PVE nodes with local ZFS storage (same name)
1x PBS + PVE side by side bare metal.

The 2x PVE nodes are clustered. To be able to use HA I make sure that all the VMs have replication enabled. For mail servers and other VMs where any data loss is painful, I replicate with the shortest possible interval of 1 minute. Other VMs, like a DNS server, are replicated at longer intervals.

On the PBS server I have one LXC container running which is providing the external part of the QDevice, so that the 2x PVE nodes get their 3rd vote and can handle HA and downtimes of one node.



Imho the Proxmox VE documentation should be updated so that such a setup (maybe with a VM for the qdevice to get even stricter isolation) is documented as best practice for small clusters.
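To make the quoted setup a bit more concrete, a hedged sketch of the two moving parts; container ID, template name, storage, guest ID, node name and addresses are all made up, and a small VM would work just as well:

Code:
# on the PVE instance running beside PBS: a small container for the QNetd daemon
pct create 200 local:vztmpl/debian-12-standard_12.7-1_amd64.tar.zst \
    --hostname qdevice --memory 512 \
    --net0 name=eth0,bridge=vmbr0,ip=192.0.2.10/24,gw=192.0.2.1
pct start 200
pct exec 200 -- apt update
pct exec 200 -- apt install -y corosync-qnetd

# on the 2-node cluster: 1-minute replication for a critical guest (VM 100)
# and the QDevice registration for the third vote
pvesr create-local-job 100-0 pve2 --schedule "*/1"
pvecm qdevice setup 192.0.2.10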
 
ZFS replication isn't a replacement for proper shared storage, as it is asynchronous. A small cluster should be 2 real nodes, 1 quorum device, and external shared storage (block or file).
That wasn't my point. My point was that it's not a good idea to allow SSH login as root on the backup server from the cluster. But if you add the PBS as a qdevice, this is what you will get. Thus I think that Aaron's setup (having both PBS and PVE bare metal on the backup host and using a VM or LXC as the qdevice) should be considered best practice if one wants to utilise the backup server as a qdevice.

The chosen storage and its pros and cons have nothing to do with it.
 

Yes, there are enough users running with three nodes and being happy.

From my own experience in my homelab (last year) I learned that I want to be able to lose one node and still have Ceph's self-healing capabilities do their job. To return to "everything's green" you need three nodes alive, so with only three nodes a single failed node leaves Ceph degraded until it comes back.

Again: I am not a Ceph expert and I tend to be a little paranoid. Some other findings are written up there: https://forum.proxmox.com/threads/fabu-can-i-use-ceph-in-a-_very_-small-cluster.159671/ -- with the result that I dropped Ceph and went back to ZFS only...

As always: YMMV :-)
 