Replacing all drives on clustered nodes

kriev98

Feb 14, 2025
Hey,

I have 2 nodes (identical config) in a cluster with HA (QDevice for the 3rd vote), each on a ZFS pool. I need to replace all drives (we are doubling the capacity). How should I approach this? All VMs are also backed up to a PBS server.

Is it possible to move everything to NodeB, disable HA on the VMs (forcing them to stay on NodeB), delete the ZFS pool on NodeA, replace the drives, recreate the pool on NodeA, re-enable HA, move everything over to NodeA, and then repeat on NodeB?
 
The hardware is in a datacenter; I will not be doing the physical work myself, nor will I have contact with the person doing it. So the server will go down, and once it is back up the new drives will be in (8-bay server: 2 NVMe boot drives that won't be touched and 6 SSDs, no hardware RAID, all replaced at once). I just need to know how to prep the system before that shutdown so that I can simply recreate the ZFS pool with the same name afterwards. The only issue I see is that the pool on NodeA will be double the size of the one on NodeB. Will that be an issue when re-enabling HA and bringing all VMs back to NodeA?
 
The hardware is in a datacenter,

What I would do beforehand is set up a virtual cluster with a similar topology but much smaller devices. Then you can simulate and practice such a delicate operation.

The bad news: creating such a setup may be a complex task in itself. But a test cluster, whether virtual or on real hardware, is really useful and lets you sleep better because you can practice complex tasks.

Of course: ymmv ;-)
 
The only issue I see is that the pool on NodeA will be double the size of the one on NodeB. Will that be an issue when re-enabling HA and bringing all VMs back to NodeA?
The size doesn't matter. The important thing is that the new pool has the same name. After you have moved/backed up the VMs, it is best to remove everything that refers to the old (still active) zpool, then delete the zpool. Once the new drives have been installed, you can create the new zpool from the web UI.

"Node → Disks → ZFS: Create ZFS" [0]

Choose your RAID level and drives, uncheck "Add Storage" if you would like to use a different dataset later, and click "Create".

[0] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysadmin_zfs_create_new_zpool
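
For anyone who prefers the shell over the web UI, here is a minimal sketch of how the equivalent `zpool create` command could be assembled. It only builds and prints the command, it does not run it. The pool name `tank`, the `raidz2` layout, the `EXAMPLE-SSD-n` device paths and `ashift=12` are all assumptions for illustration; none of them come from this thread, so substitute your real pool name and the by-id paths of the new drives.

```python
import shlex
from typing import List

# Conservative minimum device counts per vdev type (an assumption for this
# sketch; ZFS itself may accept fewer devices for some layouts).
MIN_DEVICES = {"mirror": 2, "raidz": 3, "raidz2": 4, "raidz3": 5}


def zpool_create_cmd(pool: str, vdev_type: str, devices: List[str],
                     ashift: int = 12) -> List[str]:
    """Build (but do not execute) a `zpool create` command for one vdev."""
    if vdev_type not in MIN_DEVICES:
        raise ValueError(f"unsupported vdev type: {vdev_type}")
    if len(devices) < MIN_DEVICES[vdev_type]:
        raise ValueError(f"{vdev_type} needs at least {MIN_DEVICES[vdev_type]} devices")
    if len(set(devices)) != len(devices):
        raise ValueError("duplicate device in list")
    # The pool name must match the old pool so the existing storage
    # definition keeps pointing at the right place.
    return ["zpool", "create", "-o", f"ashift={ashift}", pool, vdev_type, *devices]


if __name__ == "__main__":
    # Hypothetical device paths: replace with the real /dev/disk/by-id entries.
    new_ssds = [f"/dev/disk/by-id/EXAMPLE-SSD-{i}" for i in range(1, 7)]
    print(shlex.join(zpool_create_cmd("tank", "raidz2", new_ssds)))
```

Printing the command first gives you a chance to check the device list before anything is written to the new drives.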
 

OK, so am I doing this the right way?

I migrate everything to NodeB
Under the Datacenter > Storage menu, I click on the ZFS pool and remove NodeB from the configured nodes
I delete the ZFS pool on NodeB
The datacenter replaces all disks
I recreate the pool (with the same name)
I add the node again to the configured nodes under Datacenter > Storage

Is that right?
 
Basically yes, but be careful. You probably mean that you will move everything to NodeA. NodeA remains active while you remove NodeB from Datacenter > Storage. That way you delete the zpool on the node that no longer holds any data/VMs.
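
To make the ordering concrete, here is a rough sketch that prints the sequence for one node as a dry-run checklist. It executes nothing. The storage ID and pool name `tank`, the VM IDs and the helper name `plan_node_rebuild` are made up for illustration. It also assumes a plain `qm migrate ... --online` is enough; with local ZFS disks you may additionally need replication or `--with-local-disks`, and HA handling is deliberately left out, since neither is settled in this thread.

```python
from typing import Dict, List


def plan_node_rebuild(service_node: str, keep_node: str,
                      vm_nodes: Dict[int, str],
                      storage_id: str, pool: str) -> List[str]:
    """Return the ordered steps for rebuilding the pool on service_node.

    vm_nodes maps VM ID -> node the VM currently runs on (hypothetical input).
    """
    if service_node == keep_node:
        raise ValueError("service_node and keep_node must differ")
    steps = []
    # 1. Empty the node that is about to lose its pool.
    for vmid, node in sorted(vm_nodes.items()):
        if node == service_node:
            steps.append(f"[{service_node}] qm migrate {vmid} {keep_node} --online")
    # 2. Restrict the storage definition to the node that stays active.
    steps.append(f"[{keep_node}] pvesm set {storage_id} --nodes {keep_node}")
    # 3. Only now destroy the pool on the emptied node.
    steps.append(f"[{service_node}] zpool destroy {pool}")
    steps.append(f"[datacenter] replace the drives in {service_node}")
    steps.append(f"[{service_node}] recreate pool '{pool}' (same name)")
    # 4. Make the storage available on both nodes again.
    both = ",".join(sorted([keep_node, service_node]))
    steps.append(f"[{keep_node}] pvesm set {storage_id} --nodes {both}")
    return steps


if __name__ == "__main__":
    for step in plan_node_rebuild("NodeB", "NodeA",
                                  {100: "NodeB", 101: "NodeA"}, "tank", "tank"):
        print(step)
```

Swapping the two node names gives the plan for the second round, once the first node is back with its new pool.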