VM Migration Failure

redflag_420

New Member
Sep 17, 2024
Hello, I upgraded from Proxmox 7.4 to 8.2.4. Since the upgrade, when I migrate powered-on VMs between servers, the VM does not resume after the migration, although the migration itself completes. I can't find anything in any log I'm aware of that helps narrow down the issue. The VM powers on perfectly fine after the migration is complete, but I have to start it manually. If I power off the VM first and then migrate, there are no issues at all. This is what the log shows:

2024-09-17 03:17:49 migration status: completed
all 'mirror' jobs are ready
drive-efidisk0: Completing block job...
drive-efidisk0: Completed successfully.
drive-scsi0: Completing block job...
drive-scsi0: Completed successfully.
drive-efidisk0: mirror-job finished
drive-scsi0: mirror-job finished
2024-09-17 03:17:51 stopping NBD storage migration server on target.
2024-09-17 03:17:53 ERROR: tunnel replied 'ERR: resume failed - VM 119 not running' to command 'resume 119'
Logical volume "vm-119-disk-0" successfully removed.
Logical volume "vm-119-disk-1" successfully removed.
2024-09-17 03:18:23 ERROR: migration finished with problems (duration 06:36:13)
TASK ERROR: migration problems

Any idea what it could be? Any help is greatly appreciated.
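Since "resume failed - VM 119 not running" means the QEMU process on the target node died (or never started) right after the memory transfer, the real error usually lands on the *target* node rather than in the migration task log. A hedged sketch of where to look, using the VM ID and timestamps from the log above; exact log contents vary by setup:

```shell
# On the TARGET node: look for QEMU/KVM messages around the time
# the migration completed (VM 119, timestamps from the log above)
journalctl --since "2024-09-17 03:15" --until "2024-09-17 03:20" | grep -iE "119|kvm|qemu"

# Proxmox also keeps per-task logs on disk; the incoming-migration
# task on the target may have captured QEMU's stderr
grep -r "vm-119" /var/log/pve/tasks/ | tail
```

These commands only make sense on the Proxmox nodes themselves; the Task History panel in the GUI (node view) shows the same task logs.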
 
I have the exact same problem, but unlike the OP I did not recently upgrade; I have this problem every time I try to live-migrate a VM to another host.

If this is relevant: this VM has two volumes, one the boot volume stored on local-zfs, and another stored on an NFS share mounted in Proxmox and then mapped as a second drive into the VM.

2024-09-21 22:06:25 starting migration of VM 119 to node 'Luxtra' (root@ADDRESS)
2024-09-21 22:06:25 found local, replicated disk 'local-zfs:vm-119-disk-0' (attached)
2024-09-21 22:06:25 scsi0: start tracking writes using block-dirty-bitmap 'repl_scsi0'
2024-09-21 22:06:25 replicating disk images
2024-09-21 22:06:25 start replication job
2024-09-21 22:06:25 guest => VM 119, running => 679885
2024-09-21 22:06:25 volumes => local-zfs:vm-119-disk-0
2024-09-21 22:06:26 freeze guest filesystem
2024-09-21 22:06:26 create snapshot '__replicate_119-0_1726949185__' on local-zfs:vm-119-disk-0
2024-09-21 22:06:26 thaw guest filesystem
2024-09-21 22:06:26 using secure transmission, rate limit: none
2024-09-21 22:06:26 incremental sync 'local-zfs:vm-119-disk-0' (__replicate_119-0_1726948811__ => __replicate_119-0_1726949185__)
2024-09-21 22:06:27 send from @__replicate_119-0_1726948811__ to rpool/data/vm-119-disk-0@__replicate_119-0_1726949185__ estimated size is 42.0M
2024-09-21 22:06:27 total estimated size is 42.0M
2024-09-21 22:06:27 TIME SENT SNAPSHOT rpool/data/vm-119-disk-0@__replicate_119-0_1726949185__
2024-09-21 22:06:27 successfully imported 'local-zfs:vm-119-disk-0'
2024-09-21 22:06:27 delete previous replication snapshot '__replicate_119-0_1726948811__' on local-zfs:vm-119-disk-0
2024-09-21 22:06:28 (remote_finalize_local_job) delete stale replication snapshot '__replicate_119-0_1726948811__' on local-zfs:vm-119-disk-0
2024-09-21 22:06:28 end replication job
2024-09-21 22:06:28 starting VM 119 on remote node 'Luxtra'
2024-09-21 22:06:29 volume 'local-zfs:vm-119-disk-0' is 'local-zfs:vm-119-disk-0' on the target
2024-09-21 22:06:30 start remote tunnel
2024-09-21 22:06:30 ssh tunnel ver 1
2024-09-21 22:06:30 starting storage migration
2024-09-21 22:06:30 scsi0: start migration to nbd:unix:/run/qemu-server/119_nbd.migrate:exportname=drive-scsi0
drive mirror re-using dirty bitmap 'repl_scsi0'
drive mirror is starting for drive-scsi0
drive-scsi0: transferred 384.0 KiB of 1.8 MiB (20.69%) in 0s
drive-scsi0: transferred 1.8 MiB of 1.8 MiB (100.00%) in 1s, ready
all 'mirror' jobs are ready
2024-09-21 22:06:31 switching mirror jobs to actively synced mode
drive-scsi0: switching to actively synced mode
drive-scsi0: successfully switched to actively synced mode
2024-09-21 22:06:32 starting online/live migration on unix:/run/qemu-server/119.migrate
2024-09-21 22:06:32 set migration capabilities
2024-09-21 22:06:32 migration downtime limit: 100 ms
2024-09-21 22:06:32 migration cachesize: 1.0 GiB
2024-09-21 22:06:32 set migration parameters
2024-09-21 22:06:32 start migrate command to unix:/run/qemu-server/119.migrate
2024-09-21 22:06:33 migration active, transferred 244.8 MiB of 8.0 GiB VM-state, 11.1 GiB/s
2024-09-21 22:06:34 migration active, transferred 524.0 MiB of 8.0 GiB VM-state, 376.3 MiB/s
2024-09-21 22:06:35 migration active, transferred 799.3 MiB of 8.0 GiB VM-state, 288.5 MiB/s
2024-09-21 22:06:36 migration active, transferred 1.1 GiB of 8.0 GiB VM-state, 292.5 MiB/s
2024-09-21 22:06:37 migration active, transferred 1.3 GiB of 8.0 GiB VM-state, 283.4 MiB/s
2024-09-21 22:06:38 migration active, transferred 1.6 GiB of 8.0 GiB VM-state, 300.1 MiB/s
2024-09-21 22:06:39 migration active, transferred 1.9 GiB of 8.0 GiB VM-state, 295.7 MiB/s
2024-09-21 22:06:40 migration active, transferred 2.1 GiB of 8.0 GiB VM-state, 290.5 MiB/s
2024-09-21 22:06:41 migration active, transferred 2.4 GiB of 8.0 GiB VM-state, 297.2 MiB/s
2024-09-21 22:06:42 migration active, transferred 2.7 GiB of 8.0 GiB VM-state, 295.3 MiB/s
2024-09-21 22:06:43 migration active, transferred 3.0 GiB of 8.0 GiB VM-state, 283.4 MiB/s
2024-09-21 22:06:44 migration active, transferred 3.2 GiB of 8.0 GiB VM-state, 288.2 MiB/s
2024-09-21 22:06:45 migration active, transferred 3.5 GiB of 8.0 GiB VM-state, 285.1 MiB/s
2024-09-21 22:06:46 migration active, transferred 3.8 GiB of 8.0 GiB VM-state, 290.5 MiB/s
2024-09-21 22:06:47 migration active, transferred 4.0 GiB of 8.0 GiB VM-state, 297.2 MiB/s
2024-09-21 22:06:48 migration active, transferred 4.3 GiB of 8.0 GiB VM-state, 280.7 MiB/s
2024-09-21 22:06:49 migration active, transferred 4.6 GiB of 8.0 GiB VM-state, 278.6 MiB/s
2024-09-21 22:06:50 migration active, transferred 4.9 GiB of 8.0 GiB VM-state, 276.3 MiB/s
2024-09-21 22:06:51 average migration speed: 432.0 MiB/s - downtime 119 ms
2024-09-21 22:06:51 migration status: completed
all 'mirror' jobs are ready
drive-scsi0: Completing block job...
drive-scsi0: Completed successfully.
drive-scsi0: mirror-job finished
2024-09-21 22:06:53 # /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=Luxtra' -o 'UserKnownHostsFile=/etc/pve/nodes/Luxtra/ssh_known_hosts' -o 'GlobalKnownHostsFile=none' root@ADDRESS pvesr set-state 119 \''{"local/xerneas":{"fail_count":0,"duration":2.806288,"last_node":"xerneas","last_iteration":1726949185,"last_try":1726949185,"last_sync":1726949185,"storeid_list":["local-zfs"]}}'\'
2024-09-21 22:06:54 stopping NBD storage migration server on target.
2024-09-21 22:06:54 ERROR: tunnel replied 'ERR: resume failed - VM 119 not running' to command 'resume 119'
2024-09-21 22:06:57 ERROR: migration finished with problems (duration 00:00:32)
TASK ERROR: migration problems
 
Okay, I think I found the issue: both VMs were set to CPU type "host". Which tbh makes total sense to fail, since hot-swapping CPUs is not really a thing xD
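For anyone hitting the same thing: with `cpu: host` the guest sees the source node's exact physical CPU, so live migration only works reliably between nodes with identical CPUs. A hedged sketch of checking and switching VM 119 (the VM ID from this thread) to a named vCPU model; `x86-64-v2-AES` is one common generic choice on Proxmox 8, but pick whatever model both nodes actually support:

```shell
# Check the current CPU setting for VM 119
qm config 119 | grep '^cpu'
# e.g.  cpu: host

# Switch to a generic model both nodes can present identically
# (takes effect on the next VM start, not on a running guest)
qm set 119 --cpu x86-64-v2-AES
```

The trade-off is that a named model hides some CPU features from the guest; the alternative is keeping `cpu: host` and only live-migrating between nodes with matching hardware.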
 