Migration of VM on NFS fails between 2 servers

christian.hahn

Hi!
I have a 3-node Proxmox cluster, and one of the hosts has some issues I haven't been able to identify so far. To keep my VM safe I wanted to put it on a dedicated (external) NFS share, so that I can easily move the VM between my cluster nodes. If I migrate between the two "good" servers it works as expected, but as soon as I migrate to/from the "bad" host I get the following in my logs and the VM gets killed:

Code:
task started by HA resource agent
2025-09-08 12:37:00 starting migration of VM 160 to node 'site1-snuc' (192.168.178.92)
2025-09-08 12:37:00 starting VM 160 on remote node 'site1-snuc'
2025-09-08 12:37:01 start remote tunnel
2025-09-08 12:37:02 ssh tunnel ver 1
2025-09-08 12:37:02 starting online/live migration on unix:/run/qemu-server/160.migrate
2025-09-08 12:37:02 set migration capabilities
2025-09-08 12:37:02 migration downtime limit: 100 ms
2025-09-08 12:37:02 migration cachesize: 2.0 GiB
2025-09-08 12:37:02 set migration parameters
2025-09-08 12:37:02 start migrate command to unix:/run/qemu-server/160.migrate
2025-09-08 12:37:03 migration active, transferred 662.7 MiB of 16.0 GiB VM-state, 745.3 MiB/s
2025-09-08 12:37:04 migration active, transferred 1.4 GiB of 16.0 GiB VM-state, 873.9 MiB/s
2025-09-08 12:37:05 migration active, transferred 2.1 GiB of 16.0 GiB VM-state, 832.6 MiB/s
2025-09-08 12:37:06 migration active, transferred 2.8 GiB of 16.0 GiB VM-state, 776.8 MiB/s
2025-09-08 12:37:07 migration active, transferred 3.5 GiB of 16.0 GiB VM-state, 5.2 GiB/s
2025-09-08 12:37:08 migration active, transferred 4.2 GiB of 16.0 GiB VM-state, 897.1 MiB/s
2025-09-08 12:37:09 migration active, transferred 4.9 GiB of 16.0 GiB VM-state, 859.3 MiB/s
2025-09-08 12:37:10 migration active, transferred 5.7 GiB of 16.0 GiB VM-state, 742.8 MiB/s
2025-09-08 12:37:10 average migration speed: 2.0 GiB/s - downtime 108 ms
2025-09-08 12:37:10 migration completed, transferred 5.9 GiB VM-state
2025-09-08 12:37:10 migration status: completed
2025-09-08 12:37:10 ERROR: tunnel replied 'ERR: resume failed - VM 160 qmp command 'query-status' failed - client closed connection' to command 'resume 160'
2025-09-08 12:37:12 ERROR: migration finished with problems (duration 00:00:12)
TASK ERROR: migration problems


Does anyone have an idea what could be wrong? My servers are all on the latest 8.4.12.
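For context, the NFS share is defined as shared storage in /etc/pve/storage.cfg, roughly like this (the storage name, export path and server IP here are just placeholders, not my actual values):

Code:
nfs: vm-nfs
        export /export/vms
        path /mnt/pve/vm-nfs
        server 192.168.178.10
        content images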
 
Never mind, I found the issue.
I had to change the VM's CPU type to a vendor-neutral profile using the following:
Code:
# shut the VM down cleanly, hard stop as fallback
# (--skiplock because the VM is managed by the HA resource agent)
qm shutdown 160 --skiplock 1
qm stop 160 --skiplock 1

# switch from the host-specific CPU to a vendor-neutral model,
# use the q35 machine type and disable NUMA
qm set 160 -cpu x86-64-v2-AES
qm set 160 -machine q35
qm set 160 -numa 0
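Before starting the VM again, the new settings can be double-checked with something like this (160 is my VMID):

Code:
qm config 160 | grep -E '^(cpu|machine|numa):'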
 
Probably a ton of impact, because you deactivate CPU features, but you don't need to keep it that way; you can set it back to the host CPU after migration.
You can't, of course, live-migrate between different CPUs, that's in its inherent nature.
But you can migrate offline.

Or, if you're sure your application doesn't need any of those enhancements, then by all means run it without them.
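Roughly along these lines, if you want the host CPU back afterwards (VMID and node name taken from this thread; the VM has to be powered off for the offline migration):

Code:
# switch back to the full host CPU once the VM is on the target node
qm set 160 -cpu host

# offline migration (VM stopped) works even between different CPU models
qm migrate 160 site1-snuc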