Hello,
I've had an issue with one single live migration of a VM. This VM had been live migrated a few times before without issues, both from and to this same host, and many other VMs live migrate fine (we've done 1000+ live migrations in this cluster already). My googling hasn't returned any meaningful results, so I'm posting here looking for some clues as to what may have happened, or whether it is a known bug in this somewhat outdated version.
The host runs PVE 9.0.10 (an upgrade to 9.1.x is pending a maintenance window):
Code:
pveversion -v
proxmox-ve: 9.0.0 (running kernel: 6.14.11-2-pve)
pve-manager: 9.0.10 (running version: 9.0.10/deb1ca707ec72a89)
proxmox-kernel-helper: 9.0.4
proxmox-kernel-6.14.11-2-pve-signed: 6.14.11-2
proxmox-kernel-6.14: 6.14.11-2
proxmox-kernel-6.14.8-2-pve-signed: 6.14.8-2
amd64-microcode: 3.20250311.1
ceph: 19.2.3-pve1
ceph-fuse: 19.2.3-pve1
corosync: 3.1.9-pve2
criu: 4.1.1-1
frr-pythontools: 10.3.1-1+pve4
ifupdown2: 3.3.0-1+pmx10
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libproxmox-acme-perl: 1.7.0
libproxmox-backup-qemu0: 2.0.1
libproxmox-rs-perl: 0.4.1
libpve-access-control: 9.0.3
libpve-apiclient-perl: 3.4.0
libpve-cluster-api-perl: 9.0.6
libpve-cluster-perl: 9.0.6
libpve-common-perl: 9.0.10
libpve-guest-common-perl: 6.0.2
libpve-http-server-perl: 6.0.4
libpve-network-perl: 1.1.8
libpve-rs-perl: 0.10.10
libpve-storage-perl: 9.0.13
libspice-server1: 0.15.2-1+b1
lvm2: 2.03.31-2+pmx1
lxc-pve: 6.0.5-1
lxcfs: 6.0.4-pve1
novnc-pve: 1.6.0-3
proxmox-backup-client: 4.0.14-1
proxmox-backup-file-restore: 4.0.14-1
proxmox-backup-restore-image: 1.0.0
proxmox-firewall: 1.1.2
proxmox-kernel-helper: 9.0.4
proxmox-mail-forward: 1.0.2
proxmox-mini-journalreader: 1.6
proxmox-offline-mirror-helper: 0.7.1
proxmox-widget-toolkit: 5.0.5
pve-cluster: 9.0.6
pve-container: 6.0.12
pve-docs: 9.0.8
pve-edk2-firmware: 4.2025.02-4
pve-esxi-import-tools: 1.0.1
pve-firewall: 6.0.3
pve-firmware: 3.16-4
pve-ha-manager: 5.0.4
pve-i18n: 3.6.0
pve-qemu-kvm: 10.0.2-4
pve-xtermjs: 5.5.0-2
qemu-server: 9.0.22
smartmontools: 7.4-pve1
spiceterm: 3.4.1
swtpm: 0.8.0+pve2
vncterm: 1.9.1
zfsutils-linux: 2.3.4-pve1
The task log shows the usual messages:
Code:
[...]
2026-01-22 10:03:47 migration active, transferred 9.3 GiB of 64.0 GiB VM-state, 2.0 GiB/s
2026-01-22 10:03:48 migration active, transferred 10.4 GiB of 64.0 GiB VM-state, 1.2 GiB/s
2026-01-22 10:03:49 migration active, transferred 11.6 GiB of 64.0 GiB VM-state, 1.2 GiB/s
2026-01-22 10:03:50 migration active, transferred 13.3 GiB of 64.0 GiB VM-state, 1.2 GiB/s
2026-01-22 10:03:50 average migration speed: 4.3 GiB/s - downtime 95 ms
2026-01-22 10:03:50 migration completed, transferred 13.4 GiB VM-state
2026-01-22 10:03:50 migration status: completed
2026-01-22 10:03:50 ERROR: tunnel replied 'ERR: resume failed - VM 116 qmp command 'query-status' failed - client closed connection' to command 'resume 116'
VM quit/powerdown failed - terminating now with SIGTERM
2026-01-22 10:04:04 ERROR: migration finished with problems (duration 00:00:32)
TASK ERROR: migration problems
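The 'client closed connection' from query-status matches the target QEMU dying during the resume; once its scope is gone, a status check on the target only confirms it. A sketch of what I mean (standard qm/systemd commands, VM 116 as above):
Code:
# Confirm on the target that the QEMU process did not survive the failed
# resume; after the SIGTERM above, both of these report the VM/scope as gone.
qm status 116 --verbose
systemctl status 116.scope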
The journal on the target host shows the real cause:
Code:
Jan 22 10:03:50 pve02 QEMU[43236]: kvm: Unknown savevm section or instance 'dbus-vmstate/dbus-vmstate' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
Jan 22 10:03:50 pve02 QEMU[43236]: kvm: load of migration failed: Invalid argument
Jan 22 10:03:50 pve02 kernel: tap116i0: left allmulticast mode
Jan 22 10:03:50 pve02 kernel: vmbr0: port 20(tap116i0) entered disabled state
Jan 22 10:03:50 pve02 systemd[1]: 116.scope: Deactivated successfully.
Jan 22 10:03:50 pve02 systemd[1]: 116.scope: Consumed 4.701s CPU time, 14.2G memory peak.
Jan 22 10:03:50 pve02 qm[43332]: VM 116 qmp command failed - VM 116 qmp command 'query-status' failed - client closed connection
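If I read the error correctly, the source QEMU streamed a vmstate section ('dbus-vmstate') that the target QEMU instance never registered, i.e. the two processes were started with different object sets. A sketch of how to check whether a dbus-vmstate object appears on the generated and on the live command line (assumes the standard qemu-server PID file path):
Code:
# What PVE would start this VM with right now ('qm showcmd' prints the
# generated KVM command line without starting anything):
qm showcmd 116 --pretty | grep -i dbus-vmstate

# What the still-running source process was actually started with
# (PID file is the standard qemu-server location):
tr '\0' '\n' < /proc/"$(cat /var/run/qemu-server/116.pid)"/cmdline | grep -i dbus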
The VM configuration is quite standard; the guest runs Windows Server 2025:
Code:
agent: 1,fstrim_cloned_disks=1
bios: ovmf
boot: order=scsi0;ide0;net0
cores: 4
cpu: x86-64-v4
description: [REDACTED]
efidisk0: ceph--VMs:vm-116-disk-0,efitype=4m,pre-enrolled-keys=1,size=528K
ide0: none,media=cdrom
machine: pc-q35-10.0+pve1
memory: 65536
meta: creation-qemu=10.0.2,ctime=1768388851
name: [REDACTED]
net0: virtio=BC:24:11:63:49:04,bridge=vmbr0,tag=1005
numa: 1
onboot: 1
ostype: win11
scsi0: ceph--VMs:vm-116-disk-1,discard=on,iothread=1,size=200G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=[REDACTED]
sockets: 2
tpmstate0: ceph--VMs:vm-116-disk-2,size=4M,version=v2.0
vmgenid: [REDACTED]
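Since QEMU's hint mentions hotplugged devices, a quick check for drift between the on-disk config and what the running instance reports might look like this (a sketch; --current is the stock qm config option for runtime values):
Code:
# Pending/on-disk values vs. the values the running VM actually uses;
# any diff output would point at a hotplug or pending change.
diff <(qm config 116) <(qm config 116 --current)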
I've also checked the network configuration, journalctl on the source PVE host, etc., but haven't found anything relevant. For reference, the source-side journal was combed with queries along these lines (timestamps taken from the task log above):
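Code:
# Everything around the failed migration window:
journalctl --since "2026-01-22 10:03:00" --until "2026-01-22 10:05:00"

# Narrowed to this VM's QEMU process via its systemd scope:
journalctl -b _SYSTEMD_UNIT=116.scope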
Thanks in advance!