Migrate cluster node from ext4 to zfs

meichthys

I have a 3-node cluster with two nodes running on ZFS and one on ext4. I'd like to convert the third node to ZFS, but that node is currently running my VMs and containers, so I'm not sure of the best path to take. I've tried migrating the VMs/containers to the other nodes, but I only have local storage on each node and the migration doesn't seem to be working (maybe it requires shared storage or ZFS?). I have backups of all the VMs, so my last resort would be to back up all the VMs/containers onto an external USB drive, restore those backups onto one of the other nodes, start them up, and then remove the VMs/containers from the node I'd like to convert to ZFS.

Could someone provide some guidance as to the cleanest way to do this?
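For reference, the backup/restore fallback I have in mind would look roughly like this (the VM/CT IDs, the /mnt/usb mount point, and the local-zfs target storage are just placeholders from my setup, not exact syntax):
  • On the source node: vzdump 101 --mode stop --dumpdir /mnt/usb
  • Move the USB drive (or copy the dump) to the target node, then restore a VM: qmrestore /mnt/usb/vzdump-qemu-101-<timestamp>.vma 101 --storage local-zfs
  • ...or restore a container: pct restore 102 /mnt/usb/vzdump-lxc-102-<timestamp>.tar --storage local-zfs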
 
@Alwin Thanks for the link. I had upgraded to 6.2 and was having issues with both live and offline migrations. The migration would start and then hang right after it reported finding the local disk to be migrated. I let it sit for hours/days with no progress even on the smallest VMs, and I couldn't track down any logs indicating why it was hung; `dmesg` didn't show anything relevant. I chalked it up to the two nodes having different file systems (ext4 and ZFS) and to using local storage for the migration instead of shared storage. In the end I manually took a backup of the VMs and imported it on the other node with no issues. After the upgrade I still had issues with migrations on local storage, but with the help of some people on the IRC channel I put together the following procedure for manually bringing up a VM on a different node using replication and a manual fail-over:

To perform a manual fail-over (requires functioning replication, which in turn requires ZFS; example replication commands below):
  1. Make sure the source VM is off (if the source node has failed, you can skip this step).
  2. Perform a manual replication to make sure no data is lost (if the source node has failed, you can skip this step).
  3. On the backup Proxmox node, do:
    • mv /etc/pve/nodes/<failed_node>/qemu-server/<vm_id>.conf /etc/pve/nodes/<backup_node>/qemu-server/
    • The VM should now show up under the backup Proxmox node in the GUI.
  4. In the GUI, on the backup Proxmox node, start the VM.
Note: After a manual fail-over, replication will be set up to replicate back to the 'failed' node. This seems to cause issues with the replication jobs on the original Proxmox host (they appear to stop or hang). This isn't a big issue for me since I primarily use my 'backup' node as a backup and nothing else.
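For completeness, setting up and manually triggering the replication looks roughly like this (the job ID 101-0, the target node name, and the schedule are just examples; check the pvesr man page for the exact options on your version, or use Datacenter > Replication in the GUI):
  • pvesr create-local-job 101-0 <target_node> --schedule "*/15"
  • pvesr status (to check the state of the jobs)
  • pvesr schedule-now 101-0 (to trigger the manual replication in step 2)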

EDIT: I just realized I was not actually on 6.2; I was still on 6.1! I had updated just a day or two ago and assumed I was on the latest version (facepalm). I haven't tested it, but it sounds like migration with local storage was added in 6.2!
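If that's the case, a migration with local disks on 6.2 should be possible with something along these lines (the VM/CT IDs and target node are placeholders; I haven't verified this myself):
  • qm migrate 101 <target_node> --online --with-local-disks
  • For containers: pct migrate 102 <target_node> --restart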
 
It sounds like an I/O bottleneck during the VM migration. With storage replication in place, the migration should produce less load and finish in a timely manner.
 
