ZFS drive failed, HA didn't migrate

N0_Klu3

Hi there,

I have a 3-node PVE cluster with a single ZFS drive in each node.
I set up replication to run every 2 hours between all 3 nodes.

Today the ZFS drive on node1 died; instead of the CTs/VMs migrating to the other nodes, they all just failed.

What is the best way to get them back up and running? Their storage is available on the other 2 nodes, but I cannot migrate them.

Yes, the storage might be an hour or so behind, but I can live with that.

Unless I'm missing something, what's the point of replication if HA doesn't kick in?
Or why doesn't it at least allow me to migrate/start them on another node?

Alternate question: would it be better to put boot and storage on one ZFS mirror rather than a separate boot drive and separate ZFS storage?
Next question after that: DRAM-less SSDs for ZFS or not?
 
Unless I'm missing something, what's the point of replication if HA doesn't kick in?
HA will only migrate resources when a node is no longer available. The mechanism behind it is "corosync", and it does not watch storage. This leads to the problematic behavior you observed: when "only" the storage fails, the affected VMs do not migrate.
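For illustration (not from your setup, just a stock PVE install): you can see what the HA stack actually tracks versus the storage health you have to check yourself, per node:

~# pvecm status        # corosync / quorum view: which nodes are cluster members
~# ha-manager status   # state of HA-managed resources per node
~# zpool status -x     # ZFS pool health, which HA does not monitor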

Alternate question: would it be better to put boot and storage on one ZFS mirror rather than a separate boot drive and separate ZFS storage?
Yes! Always!

ZFS has the capability of "self-healing". For this to work, some kind of redundancy has to be present. While RAIDZ1/2/3 exist, mirrors are the recommended choice in a virtualization context.

Personally, I have dropped the classic idea of separating OS and data onto different drives and put both on one redundant pool, at least in my homelab and because of "MiniPC" hardware...
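As a rough sketch of what that looks like (device and pool names below are placeholders, not from this thread; on a real PVE install the mirrored root pool is normally created by the installer when you choose ZFS RAID1):

~# zpool create -o ashift=12 tank mirror /dev/nvme0n1 /dev/nvme1n1   # two-way mirror on hypothetical devices
~# zpool scrub tank    # read every block and repair silent corruption from the good copy
~# zpool status tank   # shows the mirror layout, scrub results and any repaired/checksum errors

With only a single disk ZFS can still detect corruption via checksums, but it has no second copy to heal from.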

Or why doesn't it at least allow me to migrate/start them on another node?
Actually, I do not know the official answer. What I would do is: make sure the dysfunctional VM is really stopped; kill it if necessary. Then move the config file from the problem node to a good one, something like ~# mv /etc/pve/nodes/<oldnode>/qemu-server/<vmid>.conf /etc/pve/nodes/<newnode>/qemu-server/. Note that you cannot copy that file inside /etc/pve/nodes, only move it.
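For example (node names and IDs here are made up): containers work the same way, their configs just live under lxc/ instead of qemu-server/:

~# mv /etc/pve/nodes/pve1/qemu-server/100.conf /etc/pve/nodes/pve2/qemu-server/   # move VM 100 from pve1 to pve2
~# mv /etc/pve/nodes/pve1/lxc/101.conf /etc/pve/nodes/pve2/lxc/                   # same idea for CT 101

This works because /etc/pve is the clustered pmxcfs, so the move is immediately visible on all nodes; as far as I know it also refuses a second config for the same VMID, which is why copying is not allowed. The target node must of course have the replicated disks available.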

Next question after that: DRAM-less SSDs for ZFS or not?
All solid-state drives used for ZFS should be "enterprise class" with PLP (power-loss protection). There is more than one reason for this; it has been discussed here multiple times.
 

Thanks, I know about PLP, and I did try to buy a Micron 7400 Pro, which turned out to be fake.
My issue is that I only have 2280 slots, and just 2 of them, in my MiniPC.

I will get 2x 2TB drives for each node and run a ZFS mirror.