[SOLVED] Detect live migration from inside the guest VM

irg

New Member
Jul 9, 2024
Hello.
Does anyone know of a solution that would allow a guest VM to detect that it is being migrated WITHOUT involving the host node?

Context: I have a VM that runs on Proxmox VE. I want to be able to detect, from inside the guest VM, that it is being migrated, in order to perform some actions if this happens.
I don't have access to install or configure anything on the host besides the default configuration offered by Proxmox's GUI. I was thinking that maybe at least the (physical) host node name would be accessible from the guest, but I couldn't find anything about this.

Thank you!
 
As bbgeek points out, that would be a real breach. However, I'm thinking that maybe the guest could detect the freeze/thaw commands issued on live migration and take some action based on that. Note that, apart from live migration, those commands may also be issued on a backup/snapshot.
 
Depending on how far "do something with the default configuration offered by Proxmox's GUI" stretches, there is a way I know of that should work (not tested, though, since I've never needed it):

Create a new network bridge on each Proxmox host, without any physical network adapters attached, and give each host its own IP on it (outside of the networks already in use).
Add a second network card to the VM and assign it an IP within that range (and no default gateway, of course).
Now just have a script ping the IP of each of the hosts; whichever IP you get a response from tells you which host you're on at the moment (see the sketch below).
For example:
node 1, vmbr99, 10.255.255.1/24
node 2, vmbr99, 10.255.255.2/24
node 3, vmbr99, 10.255.255.3/24
VM, vmbr99, 10.255.255.100/24

When a VM is moved, it keeps the same bridge number, but since that bridge only has "local" network access, the change can be detected. The same bridge can also be re-used by other VMs, or even for VMs to know whether they are "together" on a host.

This of course isn't a "direct" method and more of a workaround, but it would technically fulfil the "detect which host you are on" requirement you set.
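To make that concrete, here is a minimal sketch of such a check, assuming the example addressing above (10.255.255.1, .2 and .3 for the nodes) and that iputils ping is available in the guest; run it periodically (cron or a systemd timer) and alert when the returned name changes:

```python
#!/usr/bin/env python3
# Minimal sketch: detect the current host by pinging the "hidden" bridge IPs.
# Assumes the example addressing from above (10.255.255.0/24, no gateway) and
# that iputils ping is available inside the guest.
import subprocess

NODES = {
    "node1": "10.255.255.1",
    "node2": "10.255.255.2",
    "node3": "10.255.255.3",
}

def current_node():
    """Return the name of the only node reachable over the isolated bridge."""
    for name, ip in NODES.items():
        # One ping, one second timeout; only the local node should answer.
        result = subprocess.run(
            ["ping", "-c", "1", "-W", "1", ip],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode == 0:
            return name
    return None

if __name__ == "__main__":
    print(current_node())
```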
 
The easiest method I know of, and one that works for me, is to ping all hypervisor IPs and detect where you're running from the ping times. The "local" node, i.e. the one you're running on, has the lowest ping time. The difference is marginal, yet statistically measurable: 0.12 ms vs. 0.15 ms.
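For illustration, a rough sketch of this latency-based approach, using placeholder management IPs and iputils ping; averaging over several packets reduces, but does not eliminate, the noise discussed below:

```python
#!/usr/bin/env python3
# Sketch of the latency-based variant: ping every hypervisor's management IP
# and assume the one with the lowest average RTT is the local node.
# The node IPs below are placeholders; averaging over several packets reduces
# (but does not eliminate) false positives under network load.
import re
import subprocess

HYPERVISORS = {
    "node1": "192.0.2.11",   # placeholder management IPs
    "node2": "192.0.2.12",
    "node3": "192.0.2.13",
}

def avg_rtt_ms(ip, count=10):
    """Average RTT in ms from ping's summary line, or None on failure."""
    proc = subprocess.run(
        ["ping", "-q", "-c", str(count), ip],
        capture_output=True, text=True,
    )
    # Summary looks like: rtt min/avg/max/mdev = 0.101/0.123/0.160/0.012 ms
    match = re.search(r"= [\d.]+/([\d.]+)/", proc.stdout)
    return float(match.group(1)) if match else None

def probable_local_node():
    rtts = {name: avg_rtt_ms(ip) for name, ip in HYPERVISORS.items()}
    rtts = {name: rtt for name, rtt in rtts.items() if rtt is not None}
    return min(rtts, key=rtts.get) if rtts else None

if __name__ == "__main__":
    print(probable_local_node())
```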
 
The easiest method I know of, and one that works for me, is to ping all hypervisor IPs and detect where you're running from the ping times. The "local" node, i.e. the one you're running on, has the lowest ping time. The difference is marginal, yet statistically measurable: 0.12 ms vs. 0.15 ms.
Since this relies on timing, though, in a 3-node cluster, if there is a big data transfer happening between node 1 (where the VM is) and node 2 while node 3 is idle, node 3 could (at some points during the transfer) have a lower ping response time, since nodes 1 and 2 are busy handling other network packets.

That's of course why you'd measure it over a longer period of time, but removing the possibility of false positives (even if it takes a couple more steps) is usually better, depending on what you do based on the information.

But yes, it would require even less configuration; on the other hand, in networks where the "hardware"/management network and the VMs are separated, this of course wouldn't work at all.

One alternative method (which would need extra software installed/configured, so not usable in this case) would be SNMP monitoring: looking for the server on which the tap<VM-ID>i0 interface exists and/or which server lists your VM's MAC address among its devices.
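For what it's worth, a sketch of how that SNMP lookup could look, run from a monitoring host rather than from inside the guest; it assumes net-snmp's snmpwalk on the machine running the check, SNMP enabled on the nodes, and placeholder node names and community string:

```python
#!/usr/bin/env python3
# Sketch of the SNMP idea: walk IF-MIB::ifDescr on every node and report which
# one has the VM's tap<VMID>i0 interface. Needs net-snmp's snmpwalk locally and
# SNMP enabled on the nodes; node names and community string are placeholders.
import subprocess

NODES = ["pve1.example.com", "pve2.example.com", "pve3.example.com"]
COMMUNITY = "public"

def node_running_vm(vmid):
    """Return the node whose interface list contains tap<vmid>i0, if any."""
    wanted = f"tap{vmid}i0"
    for node in NODES:
        proc = subprocess.run(
            ["snmpwalk", "-v2c", "-c", COMMUNITY, node, "IF-MIB::ifDescr"],
            capture_output=True, text=True,
        )
        if wanted in proc.stdout:
            return node
    return None

if __name__ == "__main__":
    print(node_running_vm(100))
```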
 
Thank you very much guys for all your answers and help!
Now I realize that maybe I was too "harsh" with my description.

I have no intention of creating a security breach. I was thinking that Proxmox VE (or KVM in general) might offer a way to pass the host node name (for example) to the guest, via some option in the VM's configuration or by default through the qemu-guest-agent.

Just as an example: Hyper-V offers, through hv_kvp_daemon, the pool files. From there I can see the physical node name and detect a node name change.
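For comparison only, a sketch of reading that value on a Hyper-V guest; the pool file path and the 512/2048-byte record layout are assumptions based on the documented KVP format, so treat it as illustrative:

```python
#!/usr/bin/env python3
# Sketch of the Hyper-V equivalent mentioned above: read PhysicalHostName from
# the KVP pool file maintained by hv_kvp_daemon. The path and the 512/2048-byte
# record layout are assumptions based on the documented KVP record format.
POOL_FILE = "/var/lib/hyperv/.kvp_pool_3"   # host-only intrinsic data
KEY_SIZE, VALUE_SIZE = 512, 2048

def read_kvp_pool(path=POOL_FILE):
    """Parse fixed-size key/value records into a dict."""
    entries = {}
    with open(path, "rb") as f:
        while True:
            record = f.read(KEY_SIZE + VALUE_SIZE)
            if len(record) < KEY_SIZE + VALUE_SIZE:
                break
            key = record[:KEY_SIZE].split(b"\0", 1)[0].decode(errors="ignore")
            value = record[KEY_SIZE:].split(b"\0", 1)[0].decode(errors="ignore")
            entries[key] = value
    return entries

if __name__ == "__main__":
    print(read_kvp_pool().get("PhysicalHostName"))
```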
 
Since this relies on timing, though, in a 3-node cluster, if there is a big data transfer happening between node 1 (where the VM is) and node 2 while node 3 is idle, node 3 could (at some points during the transfer) have a lower ping response time, since nodes 1 and 2 are busy handling other network packets.

That's of course why you'd measure it over a longer period of time, but removing the possibility of false positives (even if it takes a couple more steps) is usually better, depending on what you do based on the information.

But yes, it would require even less configuration; on the other hand, in networks where the "hardware"/management network and the VMs are separated, this of course wouldn't work at all.
Yes, you're fully right.

One alternative method (which would need extra software installed/configured, so not usable in this case) would be SNMP monitoring: looking for the server on which the tap<VM-ID>i0 interface exists and/or which server lists your VM's MAC address among its devices.
A monitor user in PVE itself can give you the information via the PVE API.
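As a sketch of that approach, with a placeholder host and API token for a read-only user; the /cluster/resources endpoint lists each VM together with the node it is currently running on:

```python
#!/usr/bin/env python3
# Sketch of the API-based variant: ask any cluster node which node a VM is on,
# using an API token for a read-only ("monitor"/auditor) user. Host, token and
# certificate handling below are placeholders for your own setup.
import requests

PVE_HOST = "pve1.example.com"
TOKEN = "monitor@pve!readonly=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

def node_of_vm(vmid):
    """Return the node name the given VMID is currently on, or None."""
    resp = requests.get(
        f"https://{PVE_HOST}:8006/api2/json/cluster/resources",
        params={"type": "vm"},
        headers={"Authorization": f"PVEAPIToken={TOKEN}"},
        verify=True,  # point this at the cluster CA, or handle self-signed certs
        timeout=10,
    )
    resp.raise_for_status()
    for entry in resp.json()["data"]:
        if entry.get("vmid") == vmid:
            return entry.get("node")
    return None

if __name__ == "__main__":
    print(node_of_vm(100))
```

Polling this periodically and alerting when the node field changes would cover the "where am I running" part, at the cost of needing that monitor user and API access.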
 
Context: I have a VM that runs on Proxmox VE. I want to be able to detect, from inside the guest VM, that it is being migrated, in order to perform some actions if this happens.
What would those actions be? I cannot think of any problem where I need to know where a VM runs. All nodes are normally equivalent and the whole point of virtualization is that you actually don't care where it runs.
 
Totally forgot about the API (still fairly new to Proxmox), but yeah, that would work too.
And indeed I fail to see a use case on equivalent hosts; if they have different setups/CPUs, though, then I can partly understand it ("Hey, you're on an older CPU, please stop this CPU-intensive process for now until you're migrated off it again so other VMs aren't bottlenecked").
 
What would those actions be? I cannot think of any problem where I need to know where a VM runs. All nodes are normally equivalent and the whole point of virtualization is that you actually don't care where it runs.
Basically I want to issue some alert messages if this happens.
I have a "sensitive" process that runs inside the VM. If this process is not scheduled for more than 500 ms, it restarts itself. From tests, I saw that the VM might be "frozen" for more than 500 ms.
So detecting whether there was a live migration (or even a snapshot consolidation) would help figure out why the process was restarted.
 
Basically I want to issue some alert messages if this happens.
I have a "sensitive" process that runs inside the VM. If this process is not scheduled for more than 500 ms, it restarts itself. From tests, I saw that the VM might be "frozen" for more than 500 ms.
So detecting whether there was a live migration (or even a snapshot consolidation) would help figure out why the process was restarted.
You could create a pre-start hookscript [1] for that VM that checks the PVE_MIGRATED_FROM envvar [2] on start and sends an alert if it is set.

[1] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_hookscripts
[2] https://git.proxmox.com/?p=qemu-ser...8e3472dc7b07083babe41911ad0f45f;hb=HEAD#l5718
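A minimal sketch of such a hookscript (Python instead of the Perl example from the docs; the logger call is just a placeholder for whatever alerting you actually use):

```python
#!/usr/bin/env python3
# Sketch of the suggested pre-start hookscript: Proxmox calls it with the
# VMID and the phase; on the migration target, PVE_MIGRATED_FROM is set in the
# environment, so we can raise an alert. Sending the alert via logger/syslog
# is only a placeholder.
import os
import subprocess
import sys

def main():
    if len(sys.argv) < 3:
        return
    vmid, phase = sys.argv[1], sys.argv[2]
    if phase != "pre-start":
        return
    source_node = os.environ.get("PVE_MIGRATED_FROM")
    if source_node:
        subprocess.run(
            ["logger", "-t", "vm-migration-alert",
             f"VM {vmid} is being live-migrated from {source_node}"],
            check=False,
        )

if __name__ == "__main__":
    main()
```

The script would live on a snippets-enabled storage and be attached to the VM roughly as described in [1], e.g. qm set <vmid> --hookscript local:snippets/migration-alert.py.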
 
Hi,
You could create a pre-start hookscript [1] for that VM that checks the PVE_MIGRATED_FROM envvar [2] on start and sends an alert if it is set.

[1] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_hookscripts
[2] https://git.proxmox.com/?p=qemu-ser...8e3472dc7b07083babe41911ad0f45f;hb=HEAD#l5718
Besides what @gfngfn256 pointed out, this is not exactly what was requested, because the hook is executed at the start of live migration, when the VM instance is started in a paused state on the target, i.e. the guest is not running on the target node yet but still on the source node. The VM instance is only resumed on the target node after the migration has finished, and that's when the guest is running on the target node.

It would require a post-migration hook, I think.
 
I have a "sensitive" process that runs inside the VM. If this process is not scheduled for more than 500 ms, it restarts itself. From tests, I saw that the VM might be "frozen" for more than 500 ms.
Shouldn't that raise the steal time counter in top and similar commands in the VM? I mean, if the host doesn't let it run its processes because the host is either overloaded or artificially limiting the amount of CPU available to the VM (e.g. using a CPU limit), the steal time in the VM should rise above 0.
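A small in-guest sketch of that check, reading the steal field from /proc/stat over a short window (the 5-second interval and any alert threshold you'd put on top of it are arbitrary choices):

```python
#!/usr/bin/env python3
# Sketch of an in-guest steal-time check: compare two snapshots of the
# aggregate "cpu" line in /proc/stat and report the steal percentage.
import time

def cpu_times():
    """Return (steal, total) jiffies from the aggregate cpu line."""
    with open("/proc/stat") as f:
        fields = [int(v) for v in f.readline().split()[1:]]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice
    steal = fields[7]
    # guest/guest_nice are already accounted in user/nice, so sum only the first 8.
    return steal, sum(fields[:8])

def steal_percent(interval=5.0):
    steal1, total1 = cpu_times()
    time.sleep(interval)
    steal2, total2 = cpu_times()
    delta_total = total2 - total1
    return 100.0 * (steal2 - steal1) / delta_total if delta_total else 0.0

if __name__ == "__main__":
    print(f"steal: {steal_percent():.2f}%")
```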
 
As bbgeek points out, that would be a real breach. However, I'm thinking that maybe the guest could detect the freeze/thaw commands issued on live migration and take some action based on that. Note that, apart from live migration, those commands may also be issued on a backup/snapshot.
I managed to test this idea today.
But it seems that the fsfreeze/fsthaw commands are not issued on live migration (or at least in my case they weren't), only when a snapshot is taken (backup not tested).
I also tried to enable the "Run guest-trim after a disk move or VM migration" option in the QEMU Guest Agent menu, but this also wasn't executed.
 
I managed to test this idea today.
But it seems that the fsfreeze/fsthaw commands are not issued on live migration (or at least in my case they weren't), only when a snapshot is taken (backup not tested).
It's not needed for live migration, because the full VM state, including pending IO requests to disks, filesystem state in RAM, etc., is migrated to the target.
I also tried to enable the "Run guest-trim after a disk move or VM migration" option in the QEMU Guest Agent menu, but this also wasn't executed.
Was there a disk on local storage being migrated? For disks on shared storage, trimming after migration would be as useful as trimming before migration: it's the same disk, so the allocation state doesn't change and the command is not executed.
 
It's not needed for live migration, because the full VM state, including pending IO requests to disks, filesystem state in RAM, etc., is migrated to the target.

Was there a disk on local storage being migrated? For disks on shared storage, trimming after migration would be as useful as trimming before migration: it's the same disk, so the allocation state doesn't change and the command is not executed.
Now I understand. Thank you for the explanation.
And also thanks to all of you for the help!

So, the conclusion is that there is no direct way to detect a live migration from within the guest VM alone.
I will mark this as SOLVED and go ahead with the idea of monitoring the CPU steal time.
Thank you once again!
 
As I said, try lldpctl in the VM. If LLDP is enabled on the underlying device, you can see some info about it. But it will only detect the change after the migration is done.
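A rough sketch of polling that from inside the guest, assuming lldpd is installed there, LLDP frames from the host/bridge actually reach the guest NIC, and lldpctl's keyvalue output format:

```python
#!/usr/bin/env python3
# Sketch of the lldpctl idea: poll the neighbour's chassis name(s) and report
# when they change. Only useful if LLDP frames actually reach the guest NIC
# and lldpd is installed in the guest; parsing assumes the "-f keyvalue" format.
import subprocess
import time

def chassis_names():
    proc = subprocess.run(
        ["lldpctl", "-f", "keyvalue"], capture_output=True, text=True,
    )
    # keyvalue lines look like: lldp.eth0.chassis.name=pve1
    return sorted(
        line.split("=", 1)[1]
        for line in proc.stdout.splitlines()
        if ".chassis.name=" in line
    )

if __name__ == "__main__":
    last = chassis_names()
    while True:
        time.sleep(30)
        current = chassis_names()
        if current != last:
            print(f"LLDP neighbour changed: {last} -> {current}")
            last = current
```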
 
