Combine local storage and network share in HA Proxmox Cluster

Glacial1222

Aug 17, 2024
Hi,

I've started setting up a Proxmox cluster to improve redundancy and uptime in my home lab. I am currently using ZFS replication at 1-minute intervals, which is nice, but I don't like that a failure can lose up to a minute of data.

I'm wondering if it's possible to have a combined solution involving network shared storage with failover to local storage using ZFS replication. If I understand correctly, network shared storage allows much quicker migration without data loss in case of a node failure, but it acts as a single point of failure. If the shared storage goes down, I'd like to fail over to local storage on a given node. If possible, the network storage could be replicated at a 1-minute interval to this local storage, again giving at worst 1 minute of data loss.

Such an approach is still not ideal, but it feels like an improvement on my current setup. Can anyone comment on whether this approach is possible and advise on how I can set it up? I'm also open to other alternatives. I have considered Ceph or LINSTOR, but this cluster consists of a number of mini PCs, so I do not expect either to perform well, and I expect they would consume a lot of network resources compared to my proposed solution.

Any advice and suggestions are much appreciated, thanks!
 
That's a lot to unpack. Here are a couple scenarios.

Storage Based

NAS/SAN Shared Storage: Multiple hosts are connected to shared storage and run the VMs from there. If a host goes down, the VMs come up on another host, because they actually run from the NAS. If the NAS fails, or if enough SAN nodes fail, the data is gone.

Ceph Shared Storage: Multiple hosts create Ceph shared storage together and run the VMs from there. If a host goes down, the VMs come up on another host, up to the redundancy limit of the cluster. If enough hosts fail, the data is gone.
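
As a rough illustration of that "redundancy limit", here is a minimal sketch. It assumes a replicated Ceph pool that places one replica per host and uses the size and min_size pool settings. The values 3 and 2 are only an assumed example (nothing in this thread states them), and the function name is hypothetical.

```python
def ceph_failure_tolerance(size: int, min_size: int) -> dict:
    """Simultaneous host failures a replicated pool can absorb.

    Assumes one replica per host and no time for recovery between failures.
    """
    if not 1 <= min_size <= size:
        raise ValueError("expected 1 <= min_size <= size")
    return {
        # The pool keeps serving I/O while at least min_size replicas remain.
        "failures_with_io": size - min_size,
        # No data is lost as long as at least one replica remains.
        "failures_without_data_loss": size - 1,
    }


# Assumed example values: 3 replicas, at least 2 needed for I/O.
print(ceph_failure_tolerance(size=3, min_size=2))
```

Under these assumptions one host can fail with VMs still running, a second failure blocks I/O, and only the loss of all hosts holding a replica loses data.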


Data Protection

Proxmox Non-Shared Storage Replication: Proxmox has a native in-GUI replication feature for use with non-shared storage that can push a copy of your machine to another host in the cluster. It can do so on whatever schedule you choose, as often as every minute.
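
To put a number on the data loss a replication schedule implies, here is a minimal sketch. It assumes each run snapshots the guest at the start of the job and that the copy on the target only becomes usable once the transfer completes. The function name and the 5-second transfer time are hypothetical.

```python
from datetime import timedelta


def worst_case_loss(interval: timedelta, transfer_time: timedelta) -> timedelta:
    """Worst-case age of the newest usable copy on the target host.

    If the source fails just before a run finishes transferring, the target
    still holds the snapshot from the previous run, which is one interval
    plus one transfer time old.
    """
    if interval <= timedelta(0) or transfer_time < timedelta(0):
        raise ValueError("interval must be positive, transfer_time non-negative")
    return interval + transfer_time


# Assumed example: a one-minute schedule where each run takes about 5 seconds.
print(worst_case_loss(timedelta(minutes=1), timedelta(seconds=5)))
```

So with a one-minute schedule the worst case is slightly more than a minute, and it grows if a replication run takes longer than usual.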

And then there's backups. You know. Backups.


So take your pick. You want storage-based redundancy first, and then a data protection method to add on top of it.
 
Replication between local ZFS and network storage is not possible. You can do storage migration between them, but that is not what you want. ZFS with replication is the most resource-efficient option for storage failover in Proxmox and is used this way in many production environments.
 
I am most interested in protecting my data, and I am more willing to accept periods of lost data due to downtime as long as I maintain a reliable central source of truth. For example, I do not want to run into conflicting database states where the data stored on a VM is reverted to a minute ago while my backup server already holds newer data from that VM, so the two no longer match.

In that sense ZFS replication seems like the right way forward. I would not host the central databases on these VMs, but would treat the database on my backup server as the true state of things. If data does not make it to the backup server due to some failure, it is treated as lost (and efforts should be made to minimise these downtimes). If a VM somehow lacks old data that the backup database has, it can be copied over to the VM.
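
The rule described above can be sketched roughly as follows. This is a minimal illustration only: it assumes records can be compared by a unique key, the plain dictionaries stand in for whatever database is actually used, and the function and variable names are hypothetical.

```python
def reconcile(backup: dict, vm: dict) -> tuple[dict, dict]:
    """Reconcile a VM's records against the backup server.

    The backup server is treated as the source of truth:
    - records the VM is missing (or holds in an older form) are copied over;
    - records that exist only on the VM never reached the backup server
      and are reported as lost.
    """
    to_copy = {k: v for k, v in backup.items() if k not in vm or vm[k] != v}
    lost = {k: v for k, v in vm.items() if k not in backup}
    return to_copy, lost


# Assumed example: the VM was rolled back and is missing record 3.
backup_db = {1: "a", 2: "b", 3: "c"}
vm_db = {1: "a", 2: "b"}
to_copy, lost = reconcile(backup_db, vm_db)
vm_db.update(to_copy)  # the VM now matches the backup server again
print(to_copy, lost)
```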

It sounds like what I really need is to investigate database management strategies for real-time data sources and find tools that handle these sorts of data protection cases. Do you happen to have experience with or suggestions for this? Thanks for your insights!
 
"Replication between local ZFS and network storage is not possible. You can do storage migration between them, but that is not what you want. ZFS with replication is the most resource-efficient option for storage failover in Proxmox and is used this way in many production environments."
Replication between local ZFS (in PVE) and network storage, whether on another PVE node or on another OS that in this case also uses ZFS, is indeed possible.
The setup determines the possible use cases. VMs running on PVE-local ZFS go down when that PVE node is rebooted. When a VM is defined to use a NAS (with ZFS underneath), you get two cases: a) the PVE node reboots and the VM can run on another PVE node, and b) the NAS reboots and the VM keeps running, with I/O simply stalling during that time.
 
