can't migrate vm unless i'm logged into that exact node

jwsl224

Member
Apr 6, 2024
i've got this weird issue with live migration where pve can't do it unless i log into exactly the source node when doing the migration; connecting via any other node in this 5-node cluster makes the migration fail. using 'qm migrate' also fails; it can only be done via the gui. this is so weird i don't even know where to start looking.

firstly, the issue seems to center around pve thinking the vm disks are "attached" storage. they're clearly not; they're regular zfs volumes like everything else.

when connected to the gui via any other node, here is what the migration window says:
[screenshot: migration window]
vm is of course powered on. this message remains, and the 'migrate' button stays greyed out, even after i select an appropriate target storage. the target storage for the target nodes does populate correctly.
here is the error in detail when i try to start the migration via shell using qm migrate:
[screenshot: qm migrate error output]

when i connect to the gui via the source node, i can use the gui to migrate the vm, but not qm migrate. any ideas?
 
Hi,
qm migrate needs to be issued on the source node, that is by design. And if your VM has local disks, you need to specify the --with-local-disks flag like the log tells you. To issue the migration call to a VM on another node you can do e.g.
pvesh create /nodes/pve8a2/qemu/106/migrate --target pve8a1 --online 1 --with-local-disks 1
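If you are on the source node's shell instead, the direct qm form would look roughly like this (a sketch using the same example IDs, VM 106 migrating to pve8a1; substitute your own VM ID and target node):
qm migrate 106 pve8a1 --online --with-local-disks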

As for why the UI is behaving differently, that sounds like the status of the VM is not detected correctly. Do you see any errors when you check the Network tab in your browser's developer tools (often Ctrl+Shift+C)? Is your pvestatd service functioning properly on all nodes? Do you see any interesting messages in the system logs/journal of the relevant nodes?
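For example, something along these lines on each relevant node (standard systemd commands, nothing PVE-specific beyond the unit names):
systemctl status pvestatd
journalctl -u pvestatd -u pvedaemon -u pveproxy --since "1 hour ago"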
 
qm migrate needs to be issued on the source node, that is by design.
certainly. this is what i am speaking of. what i meant is that connecting to the cluster has to be done via the source node. it should technically work to connect to the cluster from any node and then issue the command from the source node's shell, right?

And if your VM has local disks, you need to specify the --with-local-disks flag like the log tells you.
the vm has no local disks. that's what is confusing me. it just has regular zfs virtual disks like all the rest of the vms. and again, the migration works fine when connecting to the cluster using the IP address of the source node. and only that way.


Do you see any errors when you check the Network tab in your browser's developer tools (often Ctrl+Shift+C)?
hmm. wow. never seen that tab before :p but no i don't see any errors. the "status" column shows a list of green "200". lots of activity though. evidently clustering is hard work :)

Is your pvestatd service functioning properly on all nodes?
yes. it is "running" on relevant nodes.

Do you see any interesting messages in the system logs/journal of the relevant nodes?
i do not see anything interesting in journalctl. is there something specific you'd like me to filter for?
 
certainly. this is what i am speaking of. what i meant is that connecting to the cluster has to be done via the source node. it should technically work to connect to the cluster from any node and then issue the command from the source node's shell, right?
Yes, if you issue the qm migrate command on the node where the VM currently is, it should work.
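To double-check from any node which node the VM currently lives on before issuing the command, you could query the cluster resources, e.g. (sketch; the listing includes a node column, filter for your VM ID):
pvesh get /cluster/resources --type vm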
the vm has no local disks. that's what is confusing me. it just has regular zfs virtual disks like all the rest of the vms. and again, the migration works fine when connecting to the cluster using the IP address of the source node. and only that way.
ZFS virtual disks are local disks. ZFS storage is not shared (except in the ZFS over iSCSI case): https://pve.proxmox.com/pve-docs/chapter-pvesm.html#_storage_types
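You can also see this in /etc/pve/storage.cfg: a typical ZFS pool entry looks something like the following (illustrative only, your pool name and options will differ) and carries no shared flag, so disks on it count as local to each node:
zfspool: local-zfs
        pool rpool/data
        content images,rootdir
        sparse 1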

Please share the VM configuration (qm config 100) and storage configuration (/etc/pve/storage.cfg) as well as the output of pveversion -v on both nodes.
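For example (assuming VM ID 100 here, substitute your actual VM ID):
qm config 100
cat /etc/pve/storage.cfg
pveversion -v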
 
this is the only node i am having this issue with. and only when not connecting to its ip address for the web gui, as mentioned.
Can you ping and ssh between the node and the other node in both directions?
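For example, from each node towards the other (sketch, assuming the other node is reachable as pve8a1; use your actual hostnames or IPs):
ping -c 3 pve8a1
ssh root@pve8a1 hostname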

What do you get when you run the following command on both the node itself and on the other node: pvesh get /nodes/pve8a2/qemu/106/migrate --output-format json-pretty (replacing the node name and VM ID with the ones for which the error in the UI occurs)?