[SOLVED] live migration between prox v8 and v9 nodes fails when vm has efidisk

K-P4ul

Hi,

I have a 2+1 node cluster (2 nodes + 1 QDevice) with LINSTOR DRBD as shared storage. One node is running version 8.4.14, the other version 9.0.15 (freshly updated).
When I live-migrate all VMs to the newer node (in order to upgrade the node still on version 8), the migration fails for VMs with an EFI disk.

The error messages I get are:
Job: VM 104 - Migrate
Code:
task started by HA resource agent
2025-11-17 16:29:25 use dedicated network address for sending migration traffic (10.255.240.59)
2025-11-17 16:29:25 starting migration of VM 104 to node 'regis' (10.255.240.59)
2025-11-17 16:29:26 starting VM 104 on remote node 'regis'
2025-11-17 16:29:27 [regis] Plugin "PVE::Storage::Custom::LINSTORPlugin" is implementing an older storage API, an upgrade is recommended
2025-11-17 16:29:30 [regis] close (rename) atomic file '/etc/pve/nodes/regis/qemu-server/104.conf' failed: File exists
2025-11-17 16:29:30 ERROR: online migrate failure - remote command failed with exit code 255
2025-11-17 16:29:30 aborting phase 2 - cleanup resources
2025-11-17 16:29:30 migrate_cancel
2025-11-17 16:29:32 ERROR: migration finished with problems (duration 00:00:07)
TASK ERROR: migration problems

Job: VM 104 - Start
Code:
efidisk0: enrolling Microsoft UEFI CA 2023
INFO: reading raw edk2 varstore from /var/run/qemu-server/qsd-104-efidisk0-enroll.fuse
INFO: var store range: 0x64 -> 0x40000
INFO: add db cert /usr/lib/python3/dist-packages/virt/firmware/certs/MicrosoftCorporationUEFICA2011.pem
INFO: certificate already present, skipping
INFO: add db cert /usr/lib/python3/dist-packages/virt/firmware/certs/MicrosoftUEFICA2023.pem
INFO: certificate already present, skipping
INFO: writing raw edk2 varstore to /var/run/qemu-server/qsd-104-efidisk0-enroll.fuse

TASK ERROR: close (rename) atomic file '/etc/pve/nodes/regis/qemu-server/104.conf' failed: File exists

When I remove the efidisk from the VM, live migration works flawlessly. The problem is that I have clusters with more than 50 VMs running, so shutting down all the VMs to remove the efidisk is not an option.
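To get an overview of which VMs are affected before migrating, something like the following could be used. It is only a minimal sketch, assuming the plain `key: value` layout of the VM config shown in this thread and the `/etc/pve/nodes/<node>/qemu-server/<vmid>.conf` path from the log above. The function names `has_efidisk` and `vms_with_efidisk` and the sample text are made up for illustration.

```python
import glob
import os


def has_efidisk(config_text: str) -> bool:
    """Return True if the current VM config defines an efidisk0 entry."""
    for line in config_text.splitlines():
        line = line.strip()
        # Assumption: lines starting with "[" open a snapshot/pending section,
        # so everything after the first one is not the current config.
        if line.startswith("["):
            break
        if line.startswith("efidisk0:"):
            return True
    return False


def vms_with_efidisk(pattern: str = "/etc/pve/nodes/*/qemu-server/*.conf") -> list[str]:
    """Return the VMIDs (file names without .conf) whose config has an efidisk0."""
    vmids = []
    for path in glob.glob(pattern):
        with open(path, encoding="utf-8") as handle:
            if has_efidisk(handle.read()):
                vmids.append(os.path.splitext(os.path.basename(path))[0])
    return sorted(vmids)


# Hypothetical sample in the same format as the VM config posted in this thread.
sample = """bios: ovmf
efidisk0: linstor_storage:pm-1be83a14_103,efitype=4m,size=528K
memory: 4096
"""
print(has_efidisk(sample))  # True
```

Run as root on a cluster node, `vms_with_efidisk()` would list the VMIDs to look at; on a machine without `/etc/pve` it simply returns an empty list.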
 
Hi,

Nearly the same error here, migrating from PVE 9 to PVE 9. Migration is no longer possible:

task started by HA resource agent
2025-11-18 09:32:01 conntrack state migration not supported or disabled, active connections might get dropped
2025-11-18 09:32:02 use dedicated network address for sending migration traffic
2025-11-18 09:32:02 starting migration of VM 103 to node 'pveAMD02'
2025-11-18 09:32:02 starting VM 103 on remote node 'pveAMD02'
2025-11-18 09:32:02 [pveAMD02] Plugin "PVE::Storage::Custom::LINSTORPlugin" is implementing an older storage API, an upgrade is recommended
2025-11-18 09:32:02 [pveAMD02] stat for '/dev/drbd/by-res/pm-1be83a14/0' failed - No such file or directory
2025-11-18 09:32:02 ERROR: online migrate failure - remote command failed with exit code 255
2025-11-18 09:32:02 aborting phase 2 - cleanup resources
2025-11-18 09:32:02 migrate_cancel
2025-11-18 09:32:03 ERROR: migration finished with problems (duration 00:00:02)
TASK ERROR: migration problems

I've just installed the latest updates...
 
I can confirm this with PVE 9.0.15 on both nodes.
TASK ERROR: efidisk0: enrolling Microsoft UEFI CA 2023 failed - command 'virt-fw-vars --inplace /var/run/qemu-server/qsd-7607-efidisk0-enroll.fuse --distro-keys ms-uefi' failed: exit code 1
 
Sorry, but I still have the same issue as posted earlier.

I've checked: qemu-server is 9.0.29.

I still can't live-migrate the VM; offline migration works fine.
 
Did you double-check that the *target* node has the new version?

Could you post the complete VM config?
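One way to double-check this is to compare the installed package line from both nodes. The following is only a minimal sketch, assuming `apt list --installed` style lines of the form `name/suite,now version arch [status]`. The function names and the node labels `source` and `target` are made up for illustration.

```python
def installed_version(apt_line: str) -> str:
    """Extract the version from a line like
    'qemu-server/stable,now 9.0.29 amd64 [installed]'."""
    fields = apt_line.split()
    # fields[0] is 'name/suite,now'; fields[1] is the version string.
    return fields[1]


def versions_match(lines_by_node: dict[str, str]) -> bool:
    """Return True if every node reports the same package version."""
    versions = {installed_version(line) for line in lines_by_node.values()}
    return len(versions) == 1


# Hypothetical input: one line per node, copied from each node's shell.
lines = {
    "source": "qemu-server/stable,now 9.0.29 amd64 [installed]",
    "target": "qemu-server/stable,now 9.0.29 amd64 [installed]",
}
print(installed_version(lines["target"]))  # 9.0.29
print(versions_match(lines))  # True
```

This only compares what the package manager reports as installed on each node.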
 
Installed on both nodes:
qemu-server/stable,now 9.0.29 amd64 [installed]

The restart for the new kernel is still pending because of the migration issue.

VM config:
agent: 1
bios: ovmf
boot: order=scsi0
cores: 2
cpu: host
efidisk0: linstor_storage:pm-1be83a14_103,efitype=4m,size=528K
localtime: 1
memory: 4096
meta: creation-qemu=9.0.2,ctime=1734190017
name: haos14.0
net1: virtio=BC:24:11:85:36:29,bridge=vmbr0,firewall=1,tag=10
onboot: 1
ostype: l26
scsi0: linstor_storage:pm-4667fed5_103,cache=writethrough,discard=on,size=33555416K,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=f95a1329-8ef1-4aee-a8f7-49687eab0fd7
tablet: 0
 