Migrate cluster node from ext4 to zfs

meichthys

I have a 3-node cluster with two nodes running on ZFS and one on ext4. I'd like to convert the third node to ZFS, but that node is currently running my VMs and containers, so I'm not sure of the best path to take. I've tried migrating the VMs/containers to the other nodes, but I only have local storage on each node and the migration doesn't seem to be working (maybe it requires shared storage or ZFS?). I have backups of all the VMs, so my last resort would be to back up all the VMs/containers onto an external USB drive, restore those backups onto one of the other nodes, start them up, and then remove the VMs/containers from the node I'd like to convert to ZFS.

Could someone provide some guidance as to the cleanest way to do this?
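For reference, the backup/restore fallback I have in mind would look roughly like this (the VM/CT IDs, the /mnt/usb mount point, and the local-zfs target storage are just placeholders from my setup, not exact syntax):
  • On the source node: vzdump 101 --mode stop --dumpdir /mnt/usb
  • Move the USB drive (or copy the dump) to the target node, then restore a VM: qmrestore /mnt/usb/vzdump-qemu-101-<timestamp>.vma 101 --storage local-zfs
  • ...or restore a container: pct restore 102 /mnt/usb/vzdump-lxc-102-<timestamp>.tar --storage local-zfs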
 
@Alwin Thanks for the link. I had upgraded to 6.2 and was having issues with both live and offline migrations. The migration would start and then hang right after it reported finding the local disk to be migrated. I let it sit for hours/days with no progress even on the smallest VMs, and I couldn't track down any logs indicating why it was hung; `dmesg` didn't show anything relevant. I chalked it up to the two nodes having different file systems (ext4 and ZFS) and to using local storage for the migration instead of shared storage. In the end I manually took a backup of the VMs and imported it on the other node with no issues. After the upgrade I still had issues with migrations on local storage, but with the help of some people on the IRC channel I put together the following procedure for manually bringing up a VM on a different node using replication and a manual fail-over:

To perform a manual fail-over (requires functioning replication, which in turn requires ZFS; example replication commands below):
  1. Make sure the source VM is off (if the source node has failed, you can skip this step).
  2. Perform a manual replication to make sure no data is lost (if the source node has failed, you can skip this step).
  3. On the backup Proxmox node, do:
    • mv /etc/pve/nodes/<failed_node>/qemu-server/<vm_id>.conf /etc/pve/nodes/<backup_node>/qemu-server/
    • The VM should now show up under the backup Proxmox node in the GUI.
  4. In the GUI, on the backup Proxmox node, start the VM.
Note: After a manual fail-over, replication will be set up to replicate back to the 'failed' node. This seems to cause issues with the replication jobs on the original Proxmox host (they appear to stop or hang). This isn't a big issue for me since I primarily use my 'backup' node as a backup and nothing else.
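For completeness, setting up and manually triggering the replication looks roughly like this (the job ID 101-0, the target node name, and the schedule are just examples; check the pvesr man page for the exact options on your version, or use Datacenter > Replication in the GUI):
  • pvesr create-local-job 101-0 <target_node> --schedule "*/15"
  • pvesr status (to check the state of the jobs)
  • pvesr schedule-now 101-0 (to trigger the manual replication in step 2)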

EDIT: I just realized I was not actually on 6.2; I was still on 6.1! I had updated just a day or two ago and assumed I was on the latest version (facepalm). I haven't tested it, but it sounds like migration with local storage was added in 6.2!
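If that's the case, a migration with local disks on 6.2 should be possible with something along these lines (the VM/CT IDs and target node are placeholders; I haven't verified this myself):
  • qm migrate 101 <target_node> --online --with-local-disks
  • For containers: pct migrate 102 <target_node> --restart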
 
It sounds like an I/O bottleneck during the VM migration. With storage replication in place, the migration should produce less load and finish in a timely manner.
 
