[SOLVED] Cannot migrate VM to another node (ZFS replicated)

Hi!

I cannot migrate a VM to another node; the migration fails with "unable to parse value of 'efidisk0' - format error" and "online migrate failure - number of replicated disks on source and target node do not match - target node too old?".

I tried removing and re-adding the replication, without success. The replication itself is working (green icon), and I am running all the latest pve-enterprise updates.
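For reference, this is how the replication jobs can be checked and re-triggered from the CLI (a minimal sketch using the standard pvesr tool; the job ID 105-0 is taken from the log below):

Code:
# show the status of all replication jobs on this node
pvesr status
# list the configured replication jobs
pvesr list
# run the job for VM 105 immediately (job ID assumed to be 105-0)
pvesr schedule-now 105-0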

FULL migration LOG:

Code:
task started by HA resource agent
2021-11-03 13:19:55 starting migration of VM 105 to node 'node2'
2021-11-03 13:19:55 found local, replicated disk 'zfs:vm-105-disk-0' (in current VM config)
2021-11-03 13:19:55 found local, replicated disk 'zfs:vm-105-disk-1' (in current VM config)
2021-11-03 13:19:55 scsi0: start tracking writes using block-dirty-bitmap 'repl_scsi0'
2021-11-03 13:19:55 efidisk0: start tracking writes using block-dirty-bitmap 'repl_efidisk0'
2021-11-03 13:19:55 replicating disk images
2021-11-03 13:19:55 start replication job
2021-11-03 13:19:55 guest => VM 105, running => 1015376
2021-11-03 13:19:55 volumes => zfs:vm-105-disk-0,zfs:vm-105-disk-1
2021-11-03 13:19:55 freeze guest filesystem
2021-11-03 13:20:02 create snapshot '__replicate_105-0_1635941995__' on zfs:vm-105-disk-0
2021-11-03 13:20:02 create snapshot '__replicate_105-0_1635941995__' on zfs:vm-105-disk-1
2021-11-03 13:20:02 thaw guest filesystem
2021-11-03 13:20:02 using secure transmission, rate limit: none
2021-11-03 13:20:02 incremental sync 'zfs:vm-105-disk-0' (__replicate_105-0_1635941940__ => __replicate_105-0_1635941995__)
2021-11-03 13:20:03 send from @__replicate_105-0_1635941940__ to rpool/data/vm-105-disk-0@__replicate_105-0_1635941995__ estimated size is 9.42M
2021-11-03 13:20:03 total estimated size is 9.42M
2021-11-03 13:20:03 successfully imported 'zfs:vm-105-disk-0'
2021-11-03 13:20:03 incremental sync 'zfs:vm-105-disk-1' (__replicate_105-0_1635941940__ => __replicate_105-0_1635941995__)
2021-11-03 13:20:04 send from @__replicate_105-0_1635941940__ to rpool/data/vm-105-disk-1@__replicate_105-0_1635941995__ estimated size is 624B
2021-11-03 13:20:04 total estimated size is 624B
2021-11-03 13:20:04 successfully imported 'zfs:vm-105-disk-1'
2021-11-03 13:20:04 delete previous replication snapshot '__replicate_105-0_1635941940__' on zfs:vm-105-disk-0
2021-11-03 13:20:04 delete previous replication snapshot '__replicate_105-0_1635941940__' on zfs:vm-105-disk-1
2021-11-03 13:20:04 (remote_finalize_local_job) delete stale replication snapshot '__replicate_105-0_1635941940__' on zfs:vm-105-disk-0
2021-11-03 13:20:04 (remote_finalize_local_job) delete stale replication snapshot '__replicate_105-0_1635941940__' on zfs:vm-105-disk-1
2021-11-03 13:20:04 end replication job
2021-11-03 13:20:04 starting VM 105 on remote node 'node2'
2021-11-03 13:20:05 [node2] vm 105 - unable to parse value of 'efidisk0' - format error
2021-11-03 13:20:05 [node2] efitype: property is not defined in schema and the schema does not allow additional properties
2021-11-03 13:20:06 [node2] vm 105 - unable to parse value of 'efidisk0' - format error
2021-11-03 13:20:06 [node2] efitype: property is not defined in schema and the schema does not allow additional properties
2021-11-03 13:20:06 volume 'zfs:vm-105-disk-0' is 'zfs:vm-105-disk-0' on the target
2021-11-03 13:20:06 ERROR: online migrate failure - number of replicated disks on source and target node do not match - target node too old?
2021-11-03 13:20:06 aborting phase 2 - cleanup resources
2021-11-03 13:20:06 migrate_cancel
2021-11-03 13:20:06 efidisk0: removing block-dirty-bitmap 'repl_efidisk0'
2021-11-03 13:20:06 scsi0: removing block-dirty-bitmap 'repl_scsi0'
2021-11-03 13:20:06 ERROR: migration finished with problems (duration 00:00:12)
TASK ERROR: migration problems



Code:
root@node1:~# cat /etc/pve/qemu-server/105.conf
agent: 1
balloon: 0
bios: ovmf
boot: order=scsi0
cores: 4
efidisk0: zfs:vm-105-disk-1,efitype=4m,size=528K
hotplug: disk,network,usb
machine: pc-q35-6.0
memory: 6144
name: cust1
net0: virtio=9E:5A:60:3B:FB:D0,bridge=vmbr0,firewall=1
net1: virtio=EA:F0:EA:8B:9C:59,bridge=vnet1,firewall=1,tag=10
numa: 0
onboot: 1
ostype: win10
scsi0: zfs:vm-105-disk-0,discard=on,size=128G
scsihw: virtio-scsi-pci
smbios1: uuid=f46cc7d8-9539-400a-ae78-0e71f23bbe7e
sockets: 1
vmgenid: 4304982c-34d1-4e51-ae61-b7353db96cc9
 
The package versions on the target node are older than on the source node - this is not guaranteed to work reliably in all cases. Upgrade the target node, then it should work.
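A minimal sketch of upgrading the target node and comparing the packages involved in the error (standard PVE/apt commands; the grep pattern is only an illustration):

Code:
# on node2: refresh package lists and apply all pending upgrades
apt update && apt dist-upgrade
# then compare the relevant packages on both nodes
pveversion -v | grep -E 'qemu-server|pve-manager|pve-edk2-firmware'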
 
Ok, this is strange:

Both nodes are up to date with the enterprise repos (apt-get update && apt-get dist-upgrade -> nothing to do).

However, the versions differ slightly between the two nodes (only the packages that differ are shown):

node1
Code:
proxmox-ve: 7.0-2 (running kernel: 5.11.22-5-pve)
pve-manager: 7.0-13 (running version: 7.0-13/7aa7e488)
pve-edk2-firmware: 3.20210831-1
qemu-server: 7.0-16

node2
Code:
proxmox-ve: 7.0-2 (running kernel: 5.11.22-5-pve)
pve-manager: 7.0-11 (running version: 7.0-11/63d82f4e)
pve-edk2-firmware: 3.20200531-1
qemu-server: 7.0-14

I restarted node2, but it is still running the old packages. How can I get the latest packages? Of course I have subscriptions for both nodes.
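For reference, one way to see which repository the installed and candidate versions come from (standard apt tooling; package names taken from the output above):

Code:
# on each node: show installed/candidate version and the repo providing it
apt-cache policy pve-manager qemu-server pve-edk2-firmware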
 
node1 must have either a non-enterprise repo enabled, or have been installed/upgraded recently using a non-enterprise repo. Neither pve-manager 7.0-13 nor qemu-server 7.0-16 are on pve-enterprise yet.
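A quick way to check which repositories are actually enabled on each node (standard Debian/PVE paths; adjust to your setup):

Code:
# list all configured apt repositories
grep -rn '^deb' /etc/apt/sources.list /etc/apt/sources.list.d/
# the enterprise repo normally lives in /etc/apt/sources.list.d/pve-enterprise.list;
# any pve-no-subscription or pvetest entry would explain newer packages on node1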
 
