Is clustering production-ready on Proxmox VE 8.2.2?

GP123

I have 8 Proxmox VE 8.2.2 hosts. Each host is built the exact same way, running the same hardware. Besides the pro of everything being accessible from one clean GUI and VM moves being a breeze (compared to my current PBS backup and restore method), what other potential pros would I gain from clustering? What cons are there?

  • Scaling stability - If I add more Proxmox hosts in the future, is there a risk of losing GUI access or, worse, of some negative effect on my VMs if something goes wrong?
  • If something does go wrong and I lose quorum, is it possible to recover without needing to rely on backups? Are there any success stories out there of this?
  • I don't have shared storage, I use ZFS.
  • I am not planning on using Ceph directly with Proxmox.
 
I don't have shared storage, I use ZFS.
Managing ZFS replication with multiple nodes is a PITA, and the pain scales with the number of nodes and machines. Just go with shared storage. This will yield a much better cluster experience. For me, a "real" cluster requires (dedicated or distributed) shared storage; everything else is just headaches and a disaster waiting to happen. Otherwise, you will only have a single pane of glass over a bunch of nodes that do not perform like a "real" cluster.

You will also have data loss on each VM that runs on a node that crashes or whose ZFS pool is corrupted by hardware failure or human error. ZFS replication is asynchronous, so the probability of data loss is non-zero. If you can't tolerate that, it's not the right tool for the job.
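To put a number on that asynchronous window, here is a minimal sketch under a simplified model that is my assumption, not something stated in the thread: a snapshot is taken every `interval` and needs `transfer_time` to arrive on the target node. The function name and the 15-minute and 2-minute figures are purely illustrative.

```python
from datetime import timedelta


def worst_case_loss(interval: timedelta, transfer_time: timedelta) -> timedelta:
    """Upper bound on lost writes under the simplified model (assumption).

    A snapshot is taken every `interval` and takes `transfer_time` to reach
    the target. If the source node dies just before a transfer completes,
    everything written since the previous snapshot is lost.
    """
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    if transfer_time < timedelta(0):
        raise ValueError("transfer_time must not be negative")
    return interval + transfer_time


if __name__ == "__main__":
    # Illustrative values only: replicate every 15 minutes, 2 minutes per transfer.
    print(worst_case_loss(timedelta(minutes=15), timedelta(minutes=2)))
```

If the result is longer than the amount of work you can afford to redo, asynchronous replication alone probably does not meet the requirement.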

If something does go wrong and I lose quorum, is it possible to recover without needing to rely on backups?
Sure. If you have the data on only 3 nodes and those 3 fail hard, you cannot recover the data from the disks; it's gone and you need to restore from backup. How probable that is, I don't know; it depends on a lot of factors.
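For the quorum part of the question, the arithmetic is simple majority voting. The sketch below assumes one vote per node and no external vote (no QDevice); the function names are hypothetical helpers, not part of any Proxmox tool.

```python
def votes_needed(total_votes: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    if total_votes < 1:
        raise ValueError("total_votes must be at least 1")
    return total_votes // 2 + 1


def tolerated_failures(total_votes: int) -> int:
    """How many voters can be lost while the rest still hold a majority."""
    return total_votes - votes_needed(total_votes)


if __name__ == "__main__":
    # Assumption: every node carries exactly one vote.
    for nodes in (3, 8, 9):
        print(nodes, votes_needed(nodes), tolerated_failures(nodes))
```

Under these assumptions an 8-node cluster needs 5 votes and survives 3 node failures, the same as a 7-node cluster. Note that losing quorum is a membership condition and is separate from losing the data on the disks.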


And to answer your title question:

Not with ZFS replication on multiple nodes. I would not consider that production-ready, because it is not bulletproof and maybe never will be. There is so much configuration involved that you can make a lot of errors as an administrator, and that for EACH VM. With a single shared store, there is no configuration involved; once set up, everything works just like it should.
 
Don't use 8.2.2. You will have problems installing another host with this version. Use the latest PVE version, currently 8.4, or go directly to 9.
 
Managing ZFS replication with multiple nodes is a PITA and the pain scales with the number of nodes and machines.
Yes, you are (of course) correct. I do admit that although I use ZFS exclusively, the few clusters I have are small.

I have 8 Proxmox VE 8.2.2 hosts.
With that number of nodes, ZFS replication (over all nodes) would be a bad choice. And partial replication, let's say in two groups of four nodes, introduces artificial borders which will lead to confusion sooner or later.
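A rough way to see how the effort scales is to count replication jobs, assuming one job per VM per target node. The helper name and the figure of 40 VMs are hypothetical and only serve as an example.

```python
from typing import Optional


def replication_jobs(vms: int, nodes: int, group_size: Optional[int] = None) -> int:
    """Count replication jobs, assuming one job per VM per target node.

    With group_size=None every VM replicates to all other nodes.
    Otherwise each VM replicates only to the other members of its group.
    """
    if vms < 0 or nodes < 1:
        raise ValueError("vms must be >= 0 and nodes >= 1")
    if group_size is None:
        targets = nodes - 1
    else:
        if not 1 <= group_size <= nodes:
            raise ValueError("group_size must be between 1 and nodes")
        targets = group_size - 1
    return vms * targets


if __name__ == "__main__":
    # Hypothetical example: 40 VMs spread over the 8 hosts.
    print(replication_jobs(40, 8))                # replicate to every other node
    print(replication_jobs(40, 8, group_size=4))  # two groups of four
```

Grouping cuts the job count considerably, but every VM is then tied to its group, which is the artificial border mentioned above.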

On the other hand, reliable, redundant (network + storage) distributed storage does not come for free. From my point of view the entry hurdle is much higher, both in complexity and in cost. Probably the actual use case and the tolerance for losing the data of the last replication interval must be considered...
 