Proxmox HA Cluster Disaster Recovery

fxandrei

Hi there.
Has anyone tried to make a Proxmox HA cluster replicate to another cluster?
What would be the best solution for that?
I would have a 3-node HA cluster with shared NFS storage, and an identical cluster in another location.
What would be the best way to have the first cluster "replicate" to the second? (The clusters would "see" each other through a VPN.)
 
In theory this should work out of the box with ZFS:

Asynchronous replication based on ZFS is integrated in PVE, so it would help if your NFS export sits on a ZFS pool/filesystem. You can then replicate the ZFS dataset to the off-site system (regularly, with pve-zsync). Once the storage is synchronized, you need to synchronize the VM config files to the other cluster manually.
 
Hmm.
I was thinking of using FreeNAS as storage and sharing datasets over NFS with the Proxmox cluster.
So you are saying that I can sync the storage mounted on the cluster to another storage (on another cluster) with pve-zsync?
 
I was under the impression that you can use pve-zsync only for ZFS filesystems.
In my config I have an NFS share mounted on a cluster. That means I have a directory, mounted on all of the nodes, that points to the storage.
For the nodes this is just a normal directory.
So let's say I have another cluster with its own storage mounted.
How would I use pve-zsync to replicate it?
 
Like @LnxBil wrote:
so it would be good if your NFS lies on a ZFS pool/filesystem. You can then replicate the ZFS to the off-site system (regularly with pve-zsync)
If your NFS storage uses ZFS: yes. Otherwise: no.
For other filesystems you can use tools like rsync (or maybe your storage has its own replication mechanism built in).

Jonas
 
Yes, my NFS uses ZFS (it's on FreeNAS).
What I'm not sure about is this.
I have clusters ClusterA and ClusterB, each with its own storage (StorageA and StorageB), and nodes NodeA1, A2, A3 and NodeB1, B2, B3.
So I would run pve-zsync on NodeAx. This would sync the directory mounted there (from StorageA) to the directory on NodeBx (mounted from StorageB). So StorageA does not need to directly "see" StorageB?
So the syncing is done without the two storage systems seeing each other?
So how would that work? How does the tool make the snapshots? And where does it make them?
 
Hi,
So I would run pve-zsync on NodeAx. This would sync the directory mounted there (from StorageA) to the directory on NodeBx (mounted from StorageB). So StorageA does not need to directly "see" StorageB?
IMHO: Yes. As I read it, the PVE host does the heavy lifting.
So the syncing is done without the two storage systems seeing each other?
IMHO: Yes. As I read it, the PVE host does the heavy lifting.
So how would that work? How does the tool make the snapshots? And where does it make them?
ZFS :). Snapshots will be made on the ZFS volume.

Jonas
 

So the snapshots are made on StorageA (that's where the ZFS volume/dataset is stored), triggered by pve-zsync running on NodeAx?

That's nice :).
 
One last thing.
When I used snapshot replication on FreeNAS, I saw that I had to restore the snapshots on StorageB manually.
Is it the same if I use pve-zsync? As in, I will see the snapshots on StorageB, but if I want to use them on ClusterB, I will have to restore the dataset manually on StorageB and then remount the NFS share on ClusterB?
 
This is just ordinary zfs send | zfs receive, so it's completely FreeNAS-internal. You do not need Proxmox VE for this; there may be a FreeNAS guide that covers it.
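
For anyone following along, here is a minimal sketch of what one incremental replication step could look like. It is a small Python helper that only builds the command line and does not run anything. The dataset name tank/vmstore, the snapshot names and the host storage-b are made-up placeholders, and piping through ssh is an assumption about how the two boxes reach each other.

```python
import shlex


def incremental_send_cmd(dataset, prev_snap, new_snap, remote_host, remote_dataset):
    """Build the shell pipeline for one incremental zfs send/receive step.

    All names passed in are hypothetical placeholders for illustration.
    """
    # Send only the changes between the previous and the new snapshot.
    send = ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    # Receive on the remote side; -F rolls the target back to its latest
    # snapshot first, discarding any changes made there in the meantime.
    recv = ["ssh", remote_host, "zfs", "receive", "-F", remote_dataset]
    return " | ".join(shlex.join(part) for part in (send, recv))


if __name__ == "__main__":
    print(incremental_send_cmd("tank/vmstore", "rep-1", "rep-2",
                               "storage-b", "tank/vmstore"))
```

The very first transfer would be a full send without -i; after that, each run only needs the newest snapshot that both sides already have.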
 
Hmmm.
So you are saying I could do replication (sending and receiving snapshots) at the storage level, and it would be the same thing?
Because that was the whole point. You can easily send snapshots of datasets from one FreeNAS server to another.
The problem is that when you want to use the datasets on the other server (StorageB), you need to clone the last snapshot of every dataset, create the shares, and mount them on the other cluster (ClusterB).
I was hoping that pve-zsync could work directly with the mounted storage on each cluster and send the storage contents from StorageA to StorageB.
 

Yes, that's what I'm saying, and pve-zsync does exactly that, but automatically. You do not need to clone before using the snapshot; that's the whole point of ZFS. You do, however, need to roll back if you want to sync again. You can also use it to switch to another server and switch back, but both servers need to have the same snapshots. Besides that, it is really easy.

You definitely want this in your setup, because your FreeNAS is the single point of failure, so having a time delta of 15 minutes (or less) is a very good way to get some pseudo HA.

Unfortunately, there are not many HA-ZFS solutions out there that offer real dual-controller ZFS. I saw some articles in the BSD magazine about that, but it looked quite "ghetto" and was more of a proof of concept. It has little to do with the state-of-the-art dual-controller SANs that have been available to buy for at least a decade.
 
OK, thanks for the info.
I will try to build this setup and see how well it behaves.
Thanks.
 
It will take a while until I finish this.
But I'm trying to get this clear in my head.
The only difference between FreeNAS replication and pve-zsync is that pve-zsync automatically restores the snapshots and is run from the nodes instead of directly from the storage?
 
The snapshots do not need to be restored; they are just transferred with the zfs send/receive mechanism.

If you replicate directly at the storage level, you have to sync your VM configuration manually, but the storage approach can transfer recursively, which pve-zsync currently does not support. Please read https://pve.proxmox.com/wiki/PVE-zsync
 
I have no experience with ZFS on PVE, nor with pve-zsync, but I suspect there is a misunderstanding here. AFAIK pve-zsync only works when you're using ZFS storage in your PVE node(s). TS (fxandrei) doesn't use local ZFS storage; he uses FreeNAS (as ZFS storage) and exports it over NFS to hold his VM images as .qcow2 or .raw files. His PVE nodes connect to this NFS share. So PVE doesn't know about the ZFS functionality; all it sees is a network share.
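
To illustrate the distinction, here is a rough sketch that reads a storage definition in the style of /etc/pve/storage.cfg and lists which storages PVE itself treats as ZFS. The sample text, storage IDs, addresses and paths are all invented for the example; only the general "type: id" layout with indented properties is taken from how that file normally looks.

```python
# Hypothetical storage definitions, invented for this example.
SAMPLE = """\
nfs: freenas-vm
    path /mnt/pve/freenas-vm
    server 10.0.0.10
    export /mnt/tank/vmstore
    content images

zfspool: local-zfs
    pool rpool/data
    content images,rootdir
"""


def storage_types(cfg_text):
    """Map each storage ID to its type, reading only the 'type: id' header lines."""
    types = {}
    for line in cfg_text.splitlines():
        # Header lines start in column 0; property lines are indented.
        if line and not line[0].isspace() and ":" in line:
            stype, _, sid = line.partition(":")
            types[sid.strip()] = stype.strip()
    return types


def zfs_backed_storages(cfg_text):
    """Return the IDs of storages that PVE itself knows to be ZFS pools."""
    return sorted(sid for sid, stype in storage_types(cfg_text).items()
                  if stype == "zfspool")


if __name__ == "__main__":
    print(storage_types(SAMPLE))
    print(zfs_backed_storages(SAMPLE))
```

In this made-up config the FreeNAS export shows up as type nfs, so from the node's point of view there is nothing ZFS-specific to snapshot or send, even though the NAS behind it uses ZFS.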

Am I missing something?
 
Oh yes, you're right. Then there is only the manual way (fully manual or with znapzend), plus synchronizing the <vmid>.conf files with rsync.
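
As a rough sketch of the config part, the helper below builds an rsync command that copies only the .conf files from one directory to a staging directory on a node of the other cluster. It only constructs the argument list and does not run it. The source path /etc/pve/qemu-server, the host root@node-b1 and the staging directory are assumptions for illustration; copying into a staging directory rather than straight into the other cluster's config is a deliberate choice so that VMIDs can be checked for clashes first.

```python
import shlex


def conf_sync_argv(src_dir, dest_host, dest_dir):
    """Build an rsync argument list that copies only top-level *.conf files.

    src_dir, dest_host and dest_dir are placeholders chosen by the caller.
    """
    return [
        "rsync", "-a",
        # Include the VM config files, exclude everything else.
        "--include=*.conf", "--exclude=*",
        src_dir.rstrip("/") + "/",
        f"{dest_host}:{dest_dir.rstrip('/')}/",
    ]


if __name__ == "__main__":
    argv = conf_sync_argv("/etc/pve/qemu-server", "root@node-b1",
                          "/root/cluster-a-conf")
    # shlex.join quotes the wildcard arguments so the line is safe to paste.
    print(shlex.join(argv))
```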
 
You did not miss a thing :)

So I guess the only way is to do dataset replication (sending snapshots) at the storage level, between the two FreeNAS servers, and, when needed, manually restore (clone) the snapshots to a dataset, share them over NFS, and mount them on the new cluster.

Right?
 
manually restore (clone) the snapshots to a dataset, share them with nfs, and mount them on the new cluster.

For the third time: no, you do not need to restore anything. Once the receive has finished, the dataset/filesystem on the target is in sync with the source snapshot, and both hold the same data. If the first machine crashes hard, you can directly mount the NFS export from the second server (I assume you have already set that up) and use the disks. There is no difference from the first server in setup or usage. This is "real" replication.

You only have a problem once you have repaired your first box and want to reintegrate it. Then you have to roll the first box back to a snapshot that both machines share (there was write activity on it after that snapshot was taken) and use that snapshot as the base for a replication in the opposite direction, sending a newer snapshot from the second machine.
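
The reintegration step can be sketched roughly as follows. The helper picks the newest snapshot that exists on both machines, then lists the commands for rolling back the repaired box and sending the changes back from the second machine. It only produces command strings and runs nothing. The dataset, snapshot and host names are invented, and the snapshot lists are assumed to be ordered oldest to newest.

```python
def latest_common_snapshot(primary_snaps, secondary_snaps):
    """Return the newest snapshot name present on both sides, or None."""
    on_secondary = set(secondary_snaps)
    for name in reversed(primary_snaps):
        if name in on_secondary:
            return name
    return None


def failback_plan(dataset, primary_snaps, secondary_snaps, primary_host):
    """List (where_to_run, command) steps to resync the repaired primary."""
    common = latest_common_snapshot(primary_snaps, secondary_snaps)
    if common is None:
        raise ValueError("no common snapshot, a full send would be needed")
    # Discard everything written on the primary after the common snapshot.
    steps = [("primary", f"zfs rollback -r {dataset}@{common}")]
    newest = secondary_snaps[-1]
    if newest != common:
        # Send the changes made on the secondary since the common snapshot.
        steps.append((
            "secondary",
            f"zfs send -i {dataset}@{common} {dataset}@{newest}"
            f" | ssh {primary_host} zfs receive {dataset}",
        ))
    return steps


if __name__ == "__main__":
    plan = failback_plan(
        "tank/vmstore",
        ["rep-1", "rep-2", "local-after-crash"],  # repaired first box
        ["rep-1", "rep-2", "failover-1"],         # second box, now live
        "storage-a",
    )
    for where, cmd in plan:
        print(f"[{where}] {cmd}")
```

If the two sides share no snapshot at all, an incremental send is not possible, which is why the sketch raises an error in that case instead of guessing.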

Please play with this setup to get to know it well.

One piece of general advice for anyone:
If you do not want such a "manual" or complicated setup, please consider buying a "real" SAN with multiple controllers and redundant data and I/O paths.
 