Live migration failed

Unfortunately I'm still seeing this same issue in a Proxmox 8.0 -> 8.2 live migration (same VM as last time).
My workaround is still to just stop the IO-intensive daemon inside that VM during the last phase of the migration.
FYI, a fix is still on the way and will hopefully land soonish: https://lists.proxmox.com/pipermail/pve-devel/2024-July/064420.html

My original patch was rejected because it would hurt performance too much. Doing it properly without that performance penalty required upstreaming a new feature in QEMU, which took a while and landed in QEMU 8.2: https://gitlab.com/qemu-project/qemu/-/commit/2d400d15a02dca3b7b90761b2f0bb2322e99e11a

The patch on the pve-devel list then makes use of that feature in Proxmox VE's qemu-server management code.
 
In https://lists.proxmox.com/pipermail/pve-devel/2024-July/064982.html it says the patch is applied...

Unfortunately this is still an issue with an 8.4 -> 9.1.6 migration. The annoying part is that the VM is no longer running after the failure.
The workaround mentioned above still works though.

Code:
all 'mirror' jobs are ready
2026-03-21 21:39:11 starting online/live migration on unix:/run/qemu-server/144.migrate
2026-03-21 21:39:11 set migration capabilities
2026-03-21 21:39:11 migration speed limit: 280.0 MiB/s
2026-03-21 21:39:11 migration downtime limit: 100 ms
2026-03-21 21:39:11 migration cachesize: 512.0 MiB
2026-03-21 21:39:11 set migration parameters
2026-03-21 21:39:11 start migrate command to unix:/run/qemu-server/144.migrate
2026-03-21 21:39:12 migration active, transferred 281.5 MiB of 4.0 GiB VM-state, 320.2 MiB/s
2026-03-21 21:39:13 migration active, transferred 562.5 MiB of 4.0 GiB VM-state, 286.1 MiB/s
2026-03-21 21:39:14 migration active, transferred 839.8 MiB of 4.0 GiB VM-state, 288.7 MiB/s
2026-03-21 21:39:15 migration active, transferred 1.1 GiB of 4.0 GiB VM-state, 139.0 MiB/s
2026-03-21 21:39:16 migration active, transferred 1.3 GiB of 4.0 GiB VM-state, 219.4 MiB/s
2026-03-21 21:39:17 migration active, transferred 1.6 GiB of 4.0 GiB VM-state, 288.7 MiB/s
2026-03-21 21:39:18 migration active, transferred 1.8 GiB of 4.0 GiB VM-state, 285.2 MiB/s
2026-03-21 21:39:19 migration active, transferred 2.1 GiB of 4.0 GiB VM-state, 294.5 MiB/s
2026-03-21 21:39:20 migration active, transferred 2.4 GiB of 4.0 GiB VM-state, 298.4 MiB/s
2026-03-21 21:39:21 migration active, transferred 2.6 GiB of 4.0 GiB VM-state, 173.0 MiB/s
2026-03-21 21:39:22 migration active, transferred 2.8 GiB of 4.0 GiB VM-state, 329.1 MiB/s
2026-03-21 21:39:23 migration active, transferred 3.1 GiB of 4.0 GiB VM-state, 290.4 MiB/s
2026-03-21 21:39:24 migration active, transferred 3.4 GiB of 4.0 GiB VM-state, 286.4 MiB/s
2026-03-21 21:39:25 migration active, transferred 3.7 GiB of 4.0 GiB VM-state, 483.9 MiB/s
2026-03-21 21:39:27 migration active, transferred 4.0 GiB of 4.0 GiB VM-state, 348.8 MiB/s
2026-03-21 21:39:27 xbzrle: send updates to 12868 pages in 10.7 MiB encoded memory, cache-miss 88.75%, overflow 126
query migrate failed: VM 144 not running

2026-03-21 21:39:27 query migrate failed: VM 144 not running
query migrate failed: VM 144 not running

2026-03-21 21:39:28 query migrate failed: VM 144 not running
query migrate failed: VM 144 not running

2026-03-21 21:39:29 query migrate failed: VM 144 not running
query migrate failed: VM 144 not running

2026-03-21 21:39:30 query migrate failed: VM 144 not running
query migrate failed: VM 144 not running

2026-03-21 21:39:31 query migrate failed: VM 144 not running
query migrate failed: VM 144 not running

2026-03-21 21:39:33 query migrate failed: VM 144 not running
2026-03-21 21:39:33 ERROR: online migrate failure - too many query migrate failures - aborting
2026-03-21 21:39:33 aborting phase 2 - cleanup resources
2026-03-21 21:39:33 migrate_cancel
2026-03-21 21:39:33 migrate_cancel error: VM 144 not running
2026-03-21 21:39:33 ERROR: query-status error: VM 144 not running
drive-scsi0: Cancelling block job
drive-scsi1: Cancelling block job
2026-03-21 21:39:33 ERROR: VM 144 not running
2026-03-21 21:39:38 ERROR: migration finished with problems (duration 00:06:58)

TASK ERROR: migration problems
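
For anyone who wants to time the workaround (stopping the IO-intensive daemon near the end of the migration), the progress lines in the task log above are regular enough to parse. The following is only a rough sketch: the function names and the 0.9 threshold are my own assumptions, not anything from qemu-server. Note that RAM migration is iterative, so dirty pages get resent and the fraction is only an approximate indicator of how close the final phase is.

```python
import re

# Size units as they appear in the task log (binary prefixes).
UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}

# Matches lines such as:
#   "... migration active, transferred 281.5 MiB of 4.0 GiB VM-state, 320.2 MiB/s"
PROGRESS_RE = re.compile(
    r"migration active, transferred ([\d.]+) (\w+) of ([\d.]+) (\w+) VM-state"
)


def migration_progress(line):
    """Return transferred/total as a float, or None for non-progress lines."""
    m = PROGRESS_RE.search(line)
    if m is None:
        return None
    done_unit = UNITS.get(m.group(2))
    total_unit = UNITS.get(m.group(4))
    if done_unit is None or total_unit is None:
        return None  # unknown unit, do not guess
    total = float(m.group(3)) * total_unit
    if total == 0:
        return None
    return float(m.group(1)) * done_unit / total


def near_final_phase(line, threshold=0.9):
    """Hypothetical helper: True once the reported fraction reaches `threshold`."""
    progress = migration_progress(line)
    return progress is not None and progress >= threshold
```

A wrapper could feed each new task log line to `near_final_phase` and stop the daemon inside the guest once it returns true; how the daemon is stopped is left out here on purpose.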
 
Hi @Helmo,
what QEMU version was the VM started with (check the time of the VM Start task before the failure and check your APT history to see which version was installed at the time)?
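
In case it helps with the APT history part: below is a small sketch that pulls the upgrade history of one package out of the text of `/var/log/apt/history.log`. It assumes the usual `Start-Date:` / `Upgrade: name:arch (old, new), ...` layout of that file and that the QEMU package is named `pve-qemu-kvm`; the function name is made up for this example.

```python
import re


def package_upgrades(history_text, package="pve-qemu-kvm"):
    """Return a list of (start_date, old_version, new_version) tuples
    for every 'Upgrade:' entry that mentions `package`."""
    # Entries look like "pve-qemu-kvm:amd64 (old, new)"; the lookbehind
    # avoids matching a longer package name that merely ends in `package`.
    pattern = re.compile(
        r"(?<![\w.+-])" + re.escape(package) + r":\w+ \(([^,()]+), ([^,()]+)\)"
    )
    results = []
    start = None
    for line in history_text.splitlines():
        if line.startswith("Start-Date:"):
            # Collapse the double space APT puts between date and time.
            start = " ".join(line.split(":", 1)[1].split())
        elif line.startswith("Upgrade:"):
            m = pattern.search(line)
            if m:
                results.append((start, m.group(1), m.group(2)))
    return results


# Example use on a node (older, rotated logs are gzip-compressed and
# would need to be decompressed first):
#   with open("/var/log/apt/history.log") as fh:
#       for when, old, new in package_upgrades(fh.read()):
#           print(when, old, "->", new)
```

Comparing those dates with the time of the VM Start task should show which version was installed when the VM was started.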

Please check the system logs/journal to see the actual error why the VM was not running anymore. Do you see the bdrv_co_write_req_prepare assertion failure?

Please also share more of the migration task log. If you were running a recent enough version, it should tell you that the disk mirroring switched to active mode.
 
When I started the migration, the originating server was on 8.4.17.
But the VM may have been started a long time ago, so I'm not sure I can find that version. Something 8.x, though.


I did find these in the system log:

QEMU[1545695]: kvm: ../block/io.c:1960: bdrv_co_write_req_prepare: Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.
qm[3300343]: VM 144 qmp command failed - VM 144 not running
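
For others who want to check their own logs for the same crash, here is a minimal sketch that scans log lines for assertion failures in the format shown above (`file:line: function: Assertion ... failed.`). The function name and the idea of filtering by function are my own choices for illustration; the input could be, for example, `journalctl` output saved to a file.

```python
import re

# Matches the assert() message format seen in the journal line above, e.g.
#   kvm: ../block/io.c:1960: bdrv_co_write_req_prepare: Assertion `...' failed.
ASSERT_RE = re.compile(
    r"(?P<file>\S+):(?P<line>\d+): (?P<func>\w+): "
    r"Assertion `(?P<expr>.+)' failed\."
)


def find_assertions(log_lines, func=None):
    """Return a dict per assertion failure found in `log_lines`.

    If `func` is given, only failures raised in that function are kept.
    """
    hits = []
    for entry in log_lines:
        m = ASSERT_RE.search(entry)
        if m and (func is None or m.group("func") == func):
            hits.append(m.groupdict())
    return hits
```

Calling `find_assertions(lines, func="bdrv_co_write_req_prepare")` on the two lines above returns one hit, for `../block/io.c` line `1960`.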

I've now migrated the VM again, from 9.x to 9.x, and that worked without issue :)