[SOLVED] HA with ZFS

kenneth_vkd

Hi
We are looking into hardening our infrastructure to better handle outages due to network or hardware failures.
We use OVH dedicated physical servers for our infrastructure and currently have a 3-node PVE cluster. Each node is configured with 2x 4 TB NVMe drives; one of the nodes is restricted primarily to Windows VMs and therefore has only 1 CPU, to keep licensing costs down given the low volume of clients requesting services that have to run on Windows.
All servers have access to the OVH vRack system.

Currently all servers have their disks configured as a ZFS mirror, but we are looking into configuring HA in Proxmox. We can, however, see that HA works best (only?) with shared storage (a network share) or distributed storage (Ceph).
As our monitoring of storage health is currently based on the emails generated by zfs-zed, we would prefer not to redo our tooling and were therefore thinking of alternatives for handling this.

We therefore have the following questions, which we hope someone can help us answer:
- Can you implement HA by enabling replication of the required VMs between nodes in the same HA group and still benefit from online/live migration, or do we need to implement something like Ceph?
- If we have to implement Ceph, can it be done on top of the ZFS pools so we can keep our current storage monitoring tools, or would we need to bring up Ceph directly on the bare disks and implement new tooling for storage health monitoring?
- If it is necessary to bring up Ceph to get working HA, can we bring up new nodes in our existing cluster, deploy Ceph only across the new nodes, and then migrate VMs from the current non-HA setup in the same cluster?
 
Hi,
We therefore have the following questions, which we hope someone can help us answer:
- Can you implement HA by enabling replication of the required VMs between nodes in the same HA group and still benefit from online/live migration, or do we need to implement something like Ceph?
Shared storage is better, but HA (and online migration with replicated disks) also works with replicated ZFS nowadays (qemu-server >= 6.1-9, pve-ha-manager >= 3.1-1). In case of a node failure, the data since the last replication will be lost, so it's best to choose a tight enough replication schedule.
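As a rough sketch of what that can look like on the CLI (VM ID 100, the target node name "pve2" and the 5-minute schedule are placeholders for this example; the same can be configured in the GUI under Datacenter -> Replication):

Code:
# Replicate the disks of VM 100 to node pve2 every 5 minutes (job ID format is <vmid>-<jobnum>)
pvesr create-local-job 100-0 pve2 --schedule "*/5" --comment "HA replication for VM 100"
# Show the state of the replication jobs on this node
pvesr status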

- If we have to implement Ceph, can it be done on top of the ZFS pools so we can keep our current storage monitoring tools, or would we need to bring up Ceph directly on the bare disks and implement new tooling for storage health monitoring?
No, Ceph needs control over its disks.
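For illustration, handing a raw disk to Ceph as an OSD looks roughly like this on a PVE node (the device name is a placeholder, and the disk has to be wiped of its ZFS data first, so this is destructive):

Code:
# Remove any leftover partition/ZFS signatures, then create an OSD on the bare disk
ceph-volume lvm zap /dev/nvme1n1 --destroy
pveceph osd create /dev/nvme1n1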

- If it is necessary to bring up Ceph to get working HA, can we bring up new nodes in our existing cluster, deploy Ceph only across the new nodes, and then migrate VMs from the current non-HA setup in the same cluster?
Yes, you can configure HA groups to ensure that VMs are only migrated to nodes where the shared storage is actually available.
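A minimal sketch of such a group (the group name, node names and VM ID are made up for the example):

Code:
# Create an HA group restricted to the Ceph-backed nodes
ha-manager groupadd ceph-nodes --nodes "pve4,pve5" --restricted 1
# Put VM 100 under HA management and pin it to that group
ha-manager add vm:100 --group ceph-nodes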
 
Thank you for the reply
Are there any recommended tools that can help monitor disk health when Ceph has control over the disks?
With zfs-zed, our system administrators get a notification when a disk failure is detected.
 
See here and here. I'm not sure there's anything for email notifications out of the box, but a simple script checking the cluster health might do the job.
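For example, something along these lines run from cron could mimic the zed emails (just a sketch; it assumes a working mail setup on the node, and the address is a placeholder):

Code:
#!/bin/bash
# Mail the admins whenever Ceph reports anything other than HEALTH_OK (e.g. a failed OSD/disk)
STATUS=$(ceph health 2>&1)
if [ "$STATUS" != "HEALTH_OK" ]; then
    echo "$STATUS" | mail -s "Ceph health warning on $(hostname)" admins@example.com
fi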
 
Thank you for the replies
We will be looking into how we can use Ceph. Some initial testing shows that it gives us the necessary HA, so we just need to figure out disk monitoring.
 
