HA with ZFS chooses wrong node

I have been testing the (very exciting) PVE 6.3 capability to do live migration with ZFS replication. I assumed that on an automatic migration for reboot (the cluster Options setting to migrate guests when a node reboots), the cluster would pick the PVE host that replication is configured to target. However, this does not seem to be the case. The migration picked the first node in the HA group list, and that node is not the ZFS replication target.

I also tested with a simulated PVE host failure (power down). The VM migrated to another host, not the ZFS replication target, and was unable to start.

Manual migration works fine. Is there a way to make this process aware of the ZFS replication target?
 
How many nodes do you have in the cluster? If you have 3 nodes, for example, you can set up two replication jobs so that the VM is replicated to both of the other nodes.

Otherwise, you will have to define HA Groups and limit the hosts to the ones involved in the replication.
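
To make the mismatch concrete, here is a minimal sketch in Python of the check being described: which members of an HA group hold no replicated copy of the guest's disks. This is not Proxmox code and does not call any Proxmox API. The function name and the node names (pve1, pve2, pve3) are made up for illustration.

```python
def nodes_without_replica(ha_group_nodes, source_node, replication_targets):
    """Return the HA group members that hold no copy of the guest's disks.

    Illustrative only: the node that runs the guest and every node that is
    the target of a replication job are assumed to have the data.
    """
    have_data = {source_node, *replication_targets}
    return [node for node in ha_group_nodes if node not in have_data]


# Hypothetical three-node lab cluster: the VM runs on pve1 and is
# replicated to pve2 only, but the HA group lists all three nodes.
group = ["pve3", "pve1", "pve2"]
print(nodes_without_replica(group, "pve1", ["pve2"]))  # ['pve3']
```

An empty result means every node in the group can start the VM from local data. Either suggestion above gets there: add a second replication job to pve3, or remove pve3 from the HA group.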
 
Three nodes in my lab cluster. I've defined a specific HA group for just this VM, and tested both reboot and power off. This worked as expected, except that I had to wait for the ZFS replication to complete before migrating; otherwise the migration would fail. I think this is a reasonable expectation - thank you.
 
That is still in there? Thanks for the hint; that is no longer valid. I just tested it to be sure: live migration works, and so does recovery.

I sent a patch for the documentation to change this section.