Live migration fails at final stage (PVE 9.1.4 + Ceph)

Nov 11, 2025
I am experiencing a recurring issue with online (live) migration of a VM in a Proxmox VE cluster.
The migration starts normally and progresses as expected for most of the process, but it consistently fails during the final completion phase, even after Proxmox automatically increases the allowed downtime.

Below is a summary of the behavior and the relevant logs.

  • Environment:
Proxmox VE 9.1.4
Ceph shared storage
Online / live migration
VM with 16 GiB RAM
Dedicated migration network (high throughput observed)

  • Observed behavior:

Live migration starts correctly.
Memory state transfer progresses normally up to about 14.9 GiB / 16.0 GiB.
Transfer rates are variable but generally high (peaks close to 900 MiB/s).
Near the end of the migration, the transfer stalls at 14.9 GiB with 0.0 B/s throughput.
Proxmox automatically increases the allowed downtime multiple times:
100 ms → 200 ms → 400 ms → … → up to 204800 ms
Despite this, the migration never completes.

  • Final error:

Code:
migration status error: failed - Error in migration completion: Bad address
ERROR: online migrate failure - aborting
ERROR: migration finished with problems

  • Additional message seen at the beginning:

Code:
conntrack state migration not supported or disabled, active connections might get dropped

(The migration continues despite this warning.)

  • Result:
Migration aborts during phase 2 (finalization).
Cleanup is triggered.
The VM remains on the source node.
The issue is reproducible for this VM.

  • Open questions:
Is this a known issue in PVE 9.x / QEMU related to the final memory synchronization phase?
Could this be related to Ceph, the migration network, or kernel-level networking (conntrack)?
Are there recommended tunings or workarounds (migration cache size, downtime limits, precopy/postcopy, disabling conntrack, etc.)?
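For reference, the per-VM knobs I am aware of for the downtime and bandwidth limits (values below are illustrative, VMID 111 is the affected VM):

```shell
# raise the allowed final-switchover downtime for VM 111 to 2 seconds
qm set 111 --migrate_downtime 2
# per-VM migration bandwidth cap in MB/s (0 = unlimited)
qm set 111 --migrate_speed 0
```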

Thx
 
Hi,

Could you post the VM config and 'pveversion -v' from both sides, please?
The journal from the time of the migration on both sides could also be relevant.
 
Node source :

Code:
root@agorapverssi1:~# pveversion -v
proxmox-ve: 9.1.0 (running kernel: 6.17.4-2-pve)
pve-manager: 9.1.4 (running version: 9.1.4/5ac30304265fbd8e)
proxmox-kernel-helper: 9.0.4
proxmox-kernel-6.17.4-2-pve-signed: 6.17.4-2
proxmox-kernel-6.17: 6.17.4-2
amd64-microcode: 3.20250311.1
ceph: 19.2.3-pve2
ceph-fuse: 19.2.3-pve2
corosync: 3.1.9-pve2
criu: 4.1.1-1
frr-pythontools: 10.4.1-1+pve1
ifupdown2: 3.3.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libproxmox-acme-perl: 1.7.0
libproxmox-backup-qemu0: 2.0.1
libproxmox-rs-perl: 0.4.1
libpve-access-control: 9.0.5
libpve-apiclient-perl: 3.4.2
libpve-cluster-api-perl: 9.0.7
libpve-cluster-perl: 9.0.7
libpve-common-perl: 9.1.4
libpve-guest-common-perl: 6.0.2
libpve-http-server-perl: 6.0.5
libpve-network-perl: 1.2.4
libpve-rs-perl: 0.11.4
libpve-storage-perl: 9.1.0
libspice-server1: 0.15.2-1+b1
lvm2: 2.03.31-2+pmx1
lxc-pve: 6.0.5-3
lxcfs: 6.0.4-pve1
novnc-pve: 1.6.0-3
proxmox-backup-client: 4.1.1-1
proxmox-backup-file-restore: 4.1.1-1
proxmox-backup-restore-image: 1.0.0
proxmox-firewall: 1.2.1
proxmox-kernel-helper: 9.0.4
proxmox-mail-forward: 1.0.2
proxmox-mini-journalreader: 1.6
proxmox-offline-mirror-helper: 0.7.3
proxmox-widget-toolkit: 5.1.5
pve-cluster: 9.0.7
pve-container: 6.0.18
pve-docs: 9.1.2
pve-edk2-firmware: 4.2025.05-2
pve-esxi-import-tools: 1.0.1
pve-firewall: 6.0.4
pve-firmware: 3.17-2
pve-ha-manager: 5.1.0
pve-i18n: 3.6.6
pve-qemu-kvm: 10.1.2-5
pve-xtermjs: 5.5.0-3
qemu-server: 9.1.3
smartmontools: 7.4-pve1
spiceterm: 3.4.1
swtpm: 0.8.0+pve3
vncterm: 1.9.1
zfsutils-linux: 2.3.4-pve1

Node destination :
Code:
root@ccvpverssi2:~# pveversion -v
proxmox-ve: 9.1.0 (running kernel: 6.17.4-2-pve)
pve-manager: 9.1.4 (running version: 9.1.4/5ac30304265fbd8e)
proxmox-kernel-helper: 9.0.4
proxmox-kernel-6.17.4-2-pve-signed: 6.17.4-2
proxmox-kernel-6.17: 6.17.4-2
amd64-microcode: 3.20250311.1
ceph: 19.2.3-pve2
ceph-fuse: 19.2.3-pve2
corosync: 3.1.9-pve2
criu: 4.1.1-1
frr-pythontools: 10.4.1-1+pve1
ifupdown2: 3.3.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libproxmox-acme-perl: 1.7.0
libproxmox-backup-qemu0: 2.0.1
libproxmox-rs-perl: 0.4.1
libpve-access-control: 9.0.5
libpve-apiclient-perl: 3.4.2
libpve-cluster-api-perl: 9.0.7
libpve-cluster-perl: 9.0.7
libpve-common-perl: 9.1.4
libpve-guest-common-perl: 6.0.2
libpve-http-server-perl: 6.0.5
libpve-network-perl: 1.2.4
libpve-rs-perl: 0.11.4
libpve-storage-perl: 9.1.0
libspice-server1: 0.15.2-1+b1
lvm2: 2.03.31-2+pmx1
lxc-pve: 6.0.5-3
lxcfs: 6.0.4-pve1
novnc-pve: 1.6.0-3
proxmox-backup-client: 4.1.1-1
proxmox-backup-file-restore: 4.1.1-1
proxmox-backup-restore-image: 1.0.0
proxmox-firewall: 1.2.1
proxmox-kernel-helper: 9.0.4
proxmox-mail-forward: 1.0.2
proxmox-mini-journalreader: 1.6
proxmox-offline-mirror-helper: 0.7.3
proxmox-widget-toolkit: 5.1.5
pve-cluster: 9.0.7
pve-container: 6.0.18
pve-docs: 9.1.2
pve-edk2-firmware: 4.2025.05-2
pve-esxi-import-tools: 1.0.1
pve-firewall: 6.0.4
pve-firmware: 3.17-2
pve-ha-manager: 5.1.0
pve-i18n: 3.6.6
pve-qemu-kvm: 10.1.2-5
pve-xtermjs: 5.5.0-3
qemu-server: 9.1.3
smartmontools: 7.4-pve1
spiceterm: 3.4.1
swtpm: 0.8.0+pve3
vncterm: 1.9.1
zfsutils-linux: 2.3.4-pve1
root@ccvpverssi2:~#

Journal of task migration is in attached file.
Thx
 


Sorry, I had missed the VM config:
Code:
root@agorapverssi1:/etc/pve/qemu-server# cat 111.conf
#Debian 12 Packer Template - 20250109-1035
agent: 1
bios: seabios
boot: order=scsi0;ide2;net0
cores: 4
cpu: x86-64-v2-AES
hotplug: disk,network,usb
ide0: RSSI_CEPH:vm-111-cloudinit,media=cdrom,size=4M
ide2: none,media=cdrom
ipconfig0: ip=10.89.20.16/24,gw=10.89.20.254
kvm: 1
machine: pc-q35-9.2+pve1
memory: 16384
meta: creation-qemu=9.0.2,ctime=1736418784
name: POISRPELKE06
net0: virtio=BC:24:11:6E:E4:4B,bridge=RSSI
numa: 0
onboot: 1
ostype: l26
scsi0: RSSI_CEPH:vm-111-disk-0,cache=writeback,discard=on,iothread=1,size=32G,ssd=1
scsi1: RSSI_CEPH:vm-111-disk-1,cache=writeback,iothread=1,size=10000G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=363314e0-3750-47db-a3c0-ec4761d211ae
sockets: 2
tags: elk
vmgenid: e85419a2-3ee9-44e1-98ad-184af8b02093
root@agorapverssi1:/etc/pve/qemu-server#
 
Journal of task migration is in attached file.
Actually, I didn't mean the task log, but the whole journal/syslog from both nodes. You can obtain that with

Code:
journalctl

(This will print the *whole* journal; use '--since' and '--until' to limit it to the correct timeframe.)
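For example, to capture a ten-minute window around the failure on each node (timestamps here are hypothetical; adjust to your migration time):

```shell
# dump the journal for the window around the failed migration into a file
journalctl --since "2026-02-06 13:35:00" --until "2026-02-06 13:45:00" --no-pager > migration-journal.txt
```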
 
Yes, thanks, I can see the following messages:

Code:
 2026-02-06T13:41:26+01:00 agorapverssi1 QEMU[13459]: kvm: migration_block_inactivate: bdrv_inactivate_all() failed: -1
 2026-02-06T13:41:26+01:00 agorapverssi1 QEMU[13459]: kvm: Error in migration completion: Bad address

and

Code:
2026-02-06T13:41:26+01:00 agorapverssi2 QEMU[834719]: kvm: load of migration failed: Input/output error
(The latter is probably just a symptom of the failed migration.)

The first one indicates that QEMU cannot close the disk image properly. Do you have any issues regarding the storage? (I see you use Ceph, and the volume ID indicates that the disks are on Ceph.)
Did you enable krbd? Any special configs on your Ceph cluster?
(In the journal there is IMO nothing abnormal about Ceph.)
 
Hello Dominik,


Indeed, we are using Ceph, but we don’t see anything abnormal on the cluster side. We also don’t experience any issues with other VMs, including migrations in general.


From our perspective, it looks more like a VM-specific issue. The affected VM is heavily loaded in terms of memory usage, and during the migration process, the final memory switchover may fail because the operating system keeps continuously loading data into RAM.


Could this continuous memory activity prevent the migration from completing properly and lead to this kind of error?

To confirm, krbd is not loaded; nothing is returned by this command: lsmod | grep rbd

Ceph details :
Code:
root@agorapverssi1:~# ceph status
  cluster:
    id:     d9d05da3-93c7-419d-8437-a3a97f466330
    health: HEALTH_OK
 
  services:
    mon: 5 daemons, quorum agorapverssi1,ccvpverssi2,agorapverssi2,ccvpverssi1,prcpverssi1 (age 22h)
    mgr: agorapverssi1(active, since 22h), standbys: agorapverssi2, ccvpverssi2, ccvpverssi1
    osd: 44 osds: 44 up (since 22h), 44 in (since 22h)
 
  data:
    pools:   3 pools, 1281 pgs
    objects: 16.78M objects, 64 TiB
    usage:   298 TiB used, 182 TiB / 480 TiB avail
    pgs:     1189 active+clean
             55   active+clean+scrubbing
             37   active+clean+scrubbing+deep
 
  io:
    client:   56 MiB/s rd, 40 MiB/s wr, 89 op/s rd, 633 op/s wr

Code:
root@agorapverssi1:~# ceph config dump
WHO     MASK  LEVEL     OPTION                                 VALUE            RO
mon           advanced  auth_allow_insecure_global_id_reclaim  false             
mgr           advanced  mgr/prometheus/server_addr             10.1.1.121       *
mgr           advanced  mgr/prometheus/server_port             9090               
osd           advanced  osd_recovery_sleep                     0.000000           
osd.*         advanced  osd_mclock_profile                     high_client_ops   
osd.0         basic     osd_mclock_max_capacity_iops_hdd       478.223971         
osd.1         basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.10        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.11        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.12        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.13        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.14        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.15        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.16        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.17        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.18        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.19        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.2         basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.20        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.21        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.22        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.23        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.24        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.25        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.26        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.27        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.28        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.3         basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.30        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.31        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.32        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.33        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.34        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.35        basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.36        basic     osd_mclock_max_capacity_iops_ssd       39113.405486       
osd.37        basic     osd_mclock_max_capacity_iops_ssd       34995.008720       
osd.38        basic     osd_mclock_max_capacity_iops_ssd       57385.773856       
osd.39        basic     osd_mclock_max_capacity_iops_ssd       37092.485265       
osd.4         basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.40        basic     osd_mclock_max_capacity_iops_ssd       45456.575159       
osd.41        basic     osd_mclock_max_capacity_iops_ssd       40238.894022       
osd.42        basic     osd_mclock_max_capacity_iops_ssd       40758.208652       
osd.43        basic     osd_mclock_max_capacity_iops_ssd       32269.102284       
osd.5         basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.6         basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.7         basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.8         basic     osd_mclock_max_capacity_iops_hdd       450.000000         
osd.9         basic     osd_mclock_max_capacity_iops_hdd       450.000000
 
Having the same issue with two of our production VMs, also on Ceph storage. I discovered this when I was unable to expand a disk on one of them and got the following error:
TASK ERROR: VM 105 qmp command 'block_resize' failed - Node 'drive-scsi0' is busy: block device is in use by block job: backup

I figured I'd test migrating, and received the error mentioned in this thread.

2026-04-22 13:03:19 migration status error: failed - Error in migration completion: Bad address
2026-04-22 13:03:19 ERROR: online migrate failure - aborting
2026-04-22 13:03:19 aborting phase 2 - cleanup resources
2026-04-22 13:03:19 migrate_cancel


We are using Veeam Agent Backup for this cluster, so the backup process happens within the VMs' OS.

Code:
proxmox-ve: 9.1.0 (running kernel: 6.17.4-2-pve)
pve-manager: 9.1.4 (running version: 9.1.4/5ac30304265fbd8e)
proxmox-kernel-helper: 9.0.4
proxmox-kernel-6.17.4-2-pve-signed: 6.17.4-2
proxmox-kernel-6.17: 6.17.4-2
proxmox-kernel-6.8: 6.8.12-16
proxmox-kernel-6.8.12-16-pve-signed: 6.8.12-16
proxmox-kernel-6.8.12-15-pve-signed: 6.8.12-15
proxmox-kernel-6.8.12-13-pve-signed: 6.8.12-13
proxmox-kernel-6.8.12-9-pve-signed: 6.8.12-9
proxmox-kernel-6.8.12-4-pve-signed: 6.8.12-4
ceph: 19.2.3-pve2
ceph-fuse: 19.2.3-pve2
corosync: 3.1.9-pve2
criu: 4.1.1-1
frr-pythontools: 10.4.1-1+pve1
ifupdown2: 3.3.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libproxmox-acme-perl: 1.7.0
libproxmox-backup-qemu0: 2.0.1
libproxmox-rs-perl: 0.4.1
libpve-access-control: 9.0.5
libpve-apiclient-perl: 3.4.2
libpve-cluster-api-perl: 9.0.7
libpve-cluster-perl: 9.0.7
libpve-common-perl: 9.1.4
libpve-guest-common-perl: 6.0.2
libpve-http-server-perl: 6.0.5
libpve-network-perl: 1.2.4
libpve-rs-perl: 0.11.4
libpve-storage-perl: 9.1.0
libspice-server1: 0.15.2-1+b1
lvm2: 2.03.31-2+pmx1
lxc-pve: 6.0.5-3
lxcfs: 6.0.4-pve1
novnc-pve: 1.6.0-3
proxmox-backup-client: 4.1.1-1
proxmox-backup-file-restore: 4.1.1-1
proxmox-backup-restore-image: 1.0.0
proxmox-firewall: 1.2.1
proxmox-kernel-helper: 9.0.4
proxmox-mail-forward: 1.0.2
proxmox-mini-journalreader: 1.6
proxmox-offline-mirror-helper: 0.7.3
proxmox-widget-toolkit: 5.1.5
pve-cluster: 9.0.7
pve-container: 6.0.18
pve-docs: 9.1.2
pve-edk2-firmware: 4.2025.05-2
pve-esxi-import-tools: 1.0.1
pve-firewall: 6.0.4
pve-firmware: 3.17-2
pve-ha-manager: 5.1.0
pve-i18n: 3.6.6
pve-qemu-kvm: 10.1.2-5
pve-xtermjs: 5.5.0-3
qemu-server: 9.1.3
smartmontools: 7.4-pve1
spiceterm: 3.4.1
swtpm: 0.8.0+pve3
vncterm: 1.9.1
zfsutils-linux: 2.3.4-pve1
 
Hi @davis.k,
Could you share the full migration task log and an excerpt from the system logs/journal for completeness? If there is a left-over backup block job from Veeam, then yes, that might hinder migration.
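One quick way to check for a lingering block job on a running VM is the QEMU human monitor (VMID 105 from your resize error, as an example):

```shell
# open the QEMU human monitor for VM 105
qm monitor 105
# then at the qm> prompt:
#   info block-jobs    (lists any active block jobs, e.g. a stuck backup job)
#   quit
```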
 
I ended up restarting all cluster hosts and by extension the VMs which were stuck on the hosts. The problem has not returned since the restart on Wednesday.

Code:
Apr 22 13:28:15 NF-61D-PVE-3 pve-ha-lrm[445625]: Task 'UPID:NF-61D-PVE-3:0006CCBA:2405A8DD:69E885E6:qmigrate:104:root@pam:' still active, waiting
Apr 22 13:28:16 NF-61D-PVE-3 ceph-mgr[2038]: ::ffff:10.5.224.4 - - [22/Apr/2026:13:28:16] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.53.3"
Apr 22 13:28:17 NF-61D-PVE-3 rsyslogd[1682]: cannot resolve hostname 'NF-S61D5E-009.i04.local' [v8.2504.0 try https://www.rsyslog.com/e/2027 ]
Apr 22 13:28:20 NF-61D-PVE-3 pve-ha-lrm[445625]: Task 'UPID:NF-61D-PVE-3:0006CCBA:2405A8DD:69E885E6:qmigrate:104:root@pam:' still active, waiting
Apr 22 13:28:24 NF-61D-PVE-3 sshd-session[446708]: Accepted password for root from 10.5.32.8 port 63894 ssh2
Apr 22 13:28:24 NF-61D-PVE-3 sshd-session[446708]: pam_unix(sshd:session): session opened for user root(uid=0) by root(uid=0)
Apr 22 13:28:24 NF-61D-PVE-3 systemd-logind[1688]: New session 89841 of user root.
Apr 22 13:28:24 NF-61D-PVE-3 systemd[1]: Started session-89841.scope - Session 89841 of User root.
Apr 22 13:28:25 NF-61D-PVE-3 sshd-session[446716]: Received disconnect from 10.5.32.8 port 63894:11: Connection terminated by the client.
Apr 22 13:28:25 NF-61D-PVE-3 sshd-session[446716]: Disconnected from user root 10.5.32.8 port 63894
Apr 22 13:28:25 NF-61D-PVE-3 sshd-session[446708]: pam_unix(sshd:session): session closed for user root
Apr 22 13:28:25 NF-61D-PVE-3 systemd[1]: session-89841.scope: Deactivated successfully.
Apr 22 13:28:25 NF-61D-PVE-3 systemd-logind[1688]: Session 89841 logged out. Waiting for processes to exit.
Apr 22 13:28:25 NF-61D-PVE-3 systemd-logind[1688]: Removed session 89841.
Apr 22 13:28:25 NF-61D-PVE-3 pve-ha-lrm[445625]: Task 'UPID:NF-61D-PVE-3:0006CCBA:2405A8DD:69E885E6:qmigrate:104:root@pam:' still active, waiting
Apr 22 13:28:27 NF-61D-PVE-3 pveproxy[182444]: worker exit
Apr 22 13:28:27 NF-61D-PVE-3 pveproxy[5283]: worker 182444 finished
Apr 22 13:28:27 NF-61D-PVE-3 pveproxy[5283]: starting 1 worker(s)
Apr 22 13:28:27 NF-61D-PVE-3 pveproxy[5283]: worker 446750 started
Apr 22 13:28:30 NF-61D-PVE-3 pve-ha-lrm[445625]: Task 'UPID:NF-61D-PVE-3:0006CCBA:2405A8DD:69E885E6:qmigrate:104:root@pam:' still active, waiting
Apr 22 13:28:31 NF-61D-PVE-3 ceph-mgr[2038]: ::ffff:10.5.224.4 - - [22/Apr/2026:13:28:31] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.53.3"
Apr 22 13:28:35 NF-61D-PVE-3 pve-ha-lrm[445625]: Task 'UPID:NF-61D-PVE-3:0006CCBA:2405A8DD:69E885E6:qmigrate:104:root@pam:' still active, waiting
Apr 22 13:28:40 NF-61D-PVE-3 pve-ha-lrm[445625]: Task 'UPID:NF-61D-PVE-3:0006CCBA:2405A8DD:69E885E6:qmigrate:104:root@pam:' still active, waiting
Apr 22 13:28:45 NF-61D-PVE-3 pve-ha-lrm[445625]: Task 'UPID:NF-61D-PVE-3:0006CCBA:2405A8DD:69E885E6:qmigrate:104:root@pam:' still active, waiting
Apr 22 13:28:46 NF-61D-PVE-3 ceph-mgr[2038]: ::ffff:10.5.224.4 - - [22/Apr/2026:13:28:46] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.53.3"
Apr 22 13:28:50 NF-61D-PVE-3 pve-ha-lrm[445625]: Task 'UPID:NF-61D-PVE-3:0006CCBA:2405A8DD:69E885E6:qmigrate:104:root@pam:' still active, waiting
Apr 22 13:28:55 NF-61D-PVE-3 pve-ha-lrm[445625]: Task 'UPID:NF-61D-PVE-3:0006CCBA:2405A8DD:69E885E6:qmigrate:104:root@pam:' still active, waiting
Apr 22 13:29:00 NF-61D-PVE-3 pve-ha-lrm[445625]: Task 'UPID:NF-61D-PVE-3:0006CCBA:2405A8DD:69E885E6:qmigrate:104:root@pam:' still active, waiting
Apr 22 13:29:01 NF-61D-PVE-3 ceph-mgr[2038]: ::ffff:10.5.224.4 - - [22/Apr/2026:13:29:01] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.53.3"
Apr 22 13:29:05 NF-61D-PVE-3 pve-ha-lrm[445625]: Task 'UPID:NF-61D-PVE-3:0006CCBA:2405A8DD:69E885E6:qmigrate:104:root@pam:' still active, waiting
Apr 22 13:29:10 NF-61D-PVE-3 pve-ha-lrm[445625]: Task 'UPID:NF-61D-PVE-3:0006CCBA:2405A8DD:69E885E6:qmigrate:104:root@pam:' still active, waiting
Apr 22 13:29:15 NF-61D-PVE-3 pve-ha-lrm[445625]: Task 'UPID:NF-61D-PVE-3:0006CCBA:2405A8DD:69E885E6:qmigrate:104:root@pam:' still active, waiting
Apr 22 13:29:16 NF-61D-PVE-3 ceph-mgr[2038]: ::ffff:10.5.224.4 - - [22/Apr/2026:13:29:16] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.53.3"
Apr 22 13:29:20 NF-61D-PVE-3 pve-ha-lrm[445625]: Task 'UPID:NF-61D-PVE-3:0006CCBA:2405A8DD:69E885E6:qmigrate:104:root@pam:' still active, waiting
Apr 22 13:29:25 NF-61D-PVE-3 pve-ha-lrm[445625]: Task 'UPID:NF-61D-PVE-3:0006CCBA:2405A8DD:69E885E6:qmigrate:104:root@pam:' still active, waiting
Apr 22 13:29:26 NF-61D-PVE-3 QEMU[7140]: kvm: migration_block_inactivate: bdrv_inactivate_all() failed: -1
Apr 22 13:29:26 NF-61D-PVE-3 QEMU[7140]: kvm: Error in migration completion: Bad address



Code:
task started by HA resource agent
2026-04-22 13:25:10 conntrack state migration not supported or disabled, active connections might get dropped
2026-04-22 13:25:11 starting migration of VM 104 to node 'NF-61D-PVE-1' (10.57.204.2)
2026-04-22 13:25:11 starting VM 104 on remote node 'NF-61D-PVE-1'
2026-04-22 13:25:14 start remote tunnel
2026-04-22 13:25:15 ssh tunnel ver 1
2026-04-22 13:25:15 starting online/live migration on unix:/run/qemu-server/104.migrate
2026-04-22 13:25:15 set migration capabilities
2026-04-22 13:25:15 migration downtime limit: 100 ms
2026-04-22 13:25:15 migration cachesize: 4.0 GiB
2026-04-22 13:25:15 set migration parameters
2026-04-22 13:25:15 start migrate command to unix:/run/qemu-server/104.migrate
2026-04-22 13:25:16 migration active, transferred 109.4 MiB of 32.0 GiB VM-state, 117.0 MiB/s
...
2026-04-22 13:29:21 migration active, transferred 25.5 GiB of 32.0 GiB VM-state, 0.0 B/s
2026-04-22 13:29:21 auto-increased downtime to continue migration: 200 ms
2026-04-22 13:29:22 auto-increased downtime to continue migration: 400 ms
2026-04-22 13:29:22 migration active, transferred 25.5 GiB of 32.0 GiB VM-state, 0.0 B/s
2026-04-22 13:29:22 auto-increased downtime to continue migration: 800 ms
2026-04-22 13:29:23 migration active, transferred 25.5 GiB of 32.0 GiB VM-state, 0.0 B/s
2026-04-22 13:29:23 auto-increased downtime to continue migration: 1600 ms
2026-04-22 13:29:24 auto-increased downtime to continue migration: 3200 ms
2026-04-22 13:29:24 migration active, transferred 25.5 GiB of 32.0 GiB VM-state, 0.0 B/s
2026-04-22 13:29:24 auto-increased downtime to continue migration: 6400 ms
2026-04-22 13:29:25 auto-increased downtime to continue migration: 12800 ms
2026-04-22 13:29:25 migration active, transferred 25.5 GiB of 32.0 GiB VM-state, 0.0 B/s
2026-04-22 13:29:26 auto-increased downtime to continue migration: 25600 ms
2026-04-22 13:29:26 migration active, transferred 25.5 GiB of 32.0 GiB VM-state, 0.0 B/s
2026-04-22 13:29:26 auto-increased downtime to continue migration: 51200 ms
2026-04-22 13:29:26 migration status error: failed - Error in migration completion: Bad address
2026-04-22 13:29:26 ERROR: online migrate failure - aborting
2026-04-22 13:29:26 aborting phase 2 - cleanup resources
2026-04-22 13:29:26 migrate_cancel
2026-04-22 13:29:29 ERROR: migration finished with problems (duration 00:04:19)
TASK ERROR: migration problems