[SOLVED] Is it possible to make failover with 2 nodes and zfs

fxandrei

So I have 2 servers with Proxmox 5.4 installed, with ZFS disks, configured in a cluster.
They are identical.
One very important thing is that I have no shared storage. So the VMs run locally (that is, they have local disks, so no live migration).

So I have some VMs on node1.
From what I know, I can replicate the disks from node1 to node2 (because of ZFS), and if node1 fails I can manually restart them on node2.
Is there a way to have automatic failover?

So for me this scenario would work:
- have the disks replicated from node1 to node2
- if node1 fails, the VMs should start on node2 with the replicated disks (even if they are not the latest disks)
- manually revert back to node1 when possible

Has anyone ever had this scenario?
 
That's possible, but not in a 2-node cluster. In a 2-node cluster, if one node fails, the other is not operable either (quorum is lost). If you add a 3rd node, the requested scenario will work, but HA has to be configured.
 
I kept getting asked this, as we only have two machines. The short answer, as above, is no. You need an odd number of votes, so in theory you could have one node with 2 votes and one with 1, but this means it will never really fail over (unless you edit the votes after the host goes down).

It's easier to have a small 3rd node (a NUC or other small PC), just powerful enough to run Proxmox but not host any machines. That would give you the 3rd vote and enable you to have failover.
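
To make the vote arithmetic above concrete, here is a rough sketch. It assumes the usual majority rule (quorum needs more than half of all expected votes); the node names and the "witness" label for the small 3rd node are only illustrative.

```python
def quorum_threshold(votes):
    """Votes needed for quorum: a strict majority of all expected votes."""
    return sum(votes.values()) // 2 + 1


def is_quorate(votes, alive):
    """True if the surviving nodes together hold a majority of the votes."""
    surviving = sum(v for node, v in votes.items() if node in alive)
    return surviving >= quorum_threshold(votes)


# Illustrative vote layouts (assumed names, not taken from a real cluster).
two_equal = {"node1": 1, "node2": 1}
two_weighted = {"node1": 2, "node2": 1}
three_nodes = {"node1": 1, "node2": 1, "witness": 1}

print(is_quorate(two_equal, {"node2"}))               # False: 1 of 2 votes
print(is_quorate(two_weighted, {"node2"}))            # False: 1 of 3 votes
print(is_quorate(two_weighted, {"node1"}))            # True: 2 of 3 votes
print(is_quorate(three_nodes, {"node2", "witness"}))  # True: 2 of 3 votes
```

With weighted votes, only the node holding 2 votes survives the loss of its peer, so failover can only ever go in one direction. With a 3rd vote, either physical node can fail and the rest of the cluster stays quorate.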
 
This will "fix" just the votes problem (maybe I can have a VM with Proxmox on another server).
But what about the fact that I don't have shared storage?
 
Could you not use ZFS send in Proxmox and send snapshots every 5 minutes?
 
Well, this is not clear to me.
If I add another node to the cluster, I can then make an HA group consisting of the "real" nodes, right?
Then I replicate the disks from nodeA to nodeB (the real nodes).
If nodeA fails, will the VMs get restarted on nodeB?
 
So this seems to be working.
I made another VM on another server that was in the same network as the 2 real nodes.
I added this VM to the cluster, and then made an HA group consisting of the 2 real nodes.
ZFS replication was on for each VM.

I then moved one VM to node2, and went ahead and force-reset that node.
I saw that the VM that was on node2 migrated to node1 and started.

Another thing I observed is that if you migrate a VM to another node, its replication setting changes so that it replicates to the node it came from.

This is pretty nice.
An HA cluster with 2 nodes and no shared storage.

One thing to mention is that you cannot live migrate a VM from the GUI, plus this quote from the Proxmox wiki:
"recovery works, but there may be some data loss between the last synced time and the time a node failed."
So if the VMs in use change a lot, you should probably try to use shared storage.
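
As a small worked example of that data-loss window (the timestamps are made up for illustration, not measured on a real cluster): whatever was written after the last completed replication run is missing from the copy on the surviving node.

```python
from datetime import datetime, timedelta


def lost_window(last_sync, failure):
    """Time span of writes missing on the replica after a failover."""
    return max(failure - last_sync, timedelta(0))


# Hypothetical times: last replication finished at 12:00, node died at 12:04.
last_sync = datetime(2019, 6, 1, 12, 0)
failure = datetime(2019, 6, 1, 12, 4)

print(lost_window(last_sync, failure))  # 0:04:00
```

So the replication interval is roughly the upper bound on how much recent data a failover can cost (plus however long a sync run takes).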


Other than that, I think this setup is pretty solid.
 
I only have 2 real hosts with 1 hard drive each. Is it possible to somehow create an HA cluster? (I have a third physical machine where I can create the virtual machine, but it is for personal use.)

I know it will always be advisable to have more nodes. I need to set up HA for a VM with 2 physical hosts (these are the resources the company gave me). If I manage to make this work, I can show the system running and request more resources for the project (more nodes), which is why it is so important to achieve it in some way.
 
If you cannot create the third virtual node anywhere (like I did), I'm not aware of any solution to this.
 
Thanks for your answers

Now I have Proxmox on 2 physical hosts (500 GB each), and I have created one more Proxmox node in a virtual machine (50 GB). This virtual machine is on a third physical host.

I have included them in a cluster. My questions now are:

1.- How can I have shared storage between the nodes if each node has only one hard disk?

2.- The virtual node will only play a supporting role in the cluster. If a node fails, I just want the virtual machine to move between nodes 1 and 2 (physical). I know that if one node fails the cluster will still work, but what happens if I lose two nodes?
 
1. You don't need shared storage to have failover. You just need to use ZFS as your filesystem, and then configure replication on the VM (have it replicate to the other node in the cluster).

2. I'm not sure what you are asking here. Which nodes are you talking about? The 2 physical nodes?
Your VMs can still run as long as one of the physical nodes is alive, although you will probably need to start them manually if the 3rd node is already dead.
The important thing is to have ZFS replication enabled on every VM you have.
 
Thanks for the support.

Then I am getting close to having a functional cluster.

One more question, and I apologize for my ignorance.

How is ZFS replication enabled? Is there a manual on how to do this?

Regards.
 
You just select "Datacenter" and then select "Replication".
There you need to add an entry for each VM, and specify the target node and how often it should replicate.
So let's say you have vm1 on node1.
You set its ID, and node2 as its target.
If you migrate that VM manually to node2, the target will automatically change to node1.
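
The same job can probably also be created from a node's shell with the `pvesr` tool instead of the GUI. The sketch below only builds the command line and does not run anything; the VM ID 100, the job number 0 and the `*/5` schedule (every 5 minutes) are example values I picked, so check `pvesr help create-local-job` on your version before relying on it.

```python
import shlex


def replication_job_command(vmid, target_node, job_num=0, schedule="*/15"):
    """Build a `pvesr create-local-job` command line as an argument list.

    The job ID has the form <vmid>-<job number>; the values passed in
    below are examples, not taken from a real cluster.
    """
    job_id = f"{vmid}-{job_num}"
    return ["pvesr", "create-local-job", job_id, target_node,
            "--schedule", schedule]


# Example: replicate VM 100 from the node it runs on to node2 every 5 minutes.
cmd = replication_job_command(100, "node2", schedule="*/5")
print(shlex.join(cmd))  # shell-quoted form, to paste on the node running the VM
```

Run the printed command on the node that currently hosts the VM, since the job is "local" to the source node.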
 
Following the steps and selecting "Replication" generates the following error message:

missing replicate feature on volume 'local-lvm: vm-100-disk-0' (500)
 
So is your filesystem ZFS?
When you install the nodes you should not use ext4. You need to use ZFS.

If you go to a node and select Disks, then ZFS, you should see a ZFS pool.
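
Another way to check is to look at the storage type behind the storage ID named in the error. Below is a rough sketch that reads text in the format of /etc/pve/storage.cfg; the sample is what I would expect from a default ext4/LVM install and is only illustrative. As far as I know, replication only works on storages of type `zfspool`.

```python
def storage_types(cfg_text):
    """Map storage ID -> storage type from storage.cfg-style text."""
    types = {}
    for line in cfg_text.splitlines():
        # Section headers are unindented "<type>: <id>" lines;
        # the indented lines below them are properties of that storage.
        if not line or line[0].isspace() or line.startswith("#"):
            continue
        if ":" in line:
            stype, sid = line.split(":", 1)
            types[sid.strip()] = stype.strip()
    return types


# Illustrative sample, assumed to resemble a default ext4/LVM install.
sample = """\
dir: local
\tpath /var/lib/vz
\tcontent iso,vztmpl,backup

lvmthin: local-lvm
\tthinpool data
\tvgname pve
\tcontent rootdir,images
"""

types = storage_types(sample)
print(types["local-lvm"])               # lvmthin
print(types["local-lvm"] == "zfspool")  # False, so no replication here
```

On a node installed with ZFS you would instead expect an entry of type `zfspool`, and the VM disks need to live on that storage.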
 
Excellent, thanks! :)

fxandrei, your support has helped me a lot!

In that case I will proceed to reinstall with ZFS. One last question: when installing again, choosing ZFS lets us pick different types of RAID. If I only have 1 hard disk on each node, should I configure RAID0, or should I choose RAID1 even though there are no more disks?
 
