[SOLVED] Can't migrate running vm with mapped resources

MisterDeeds

Active Member
Nov 11, 2021
Dear all

I have two Proxmox servers, each with an identical GPU. For this I have created an entry under the Resource Mappings.

I then passed this mapping to a VM as PCI passthrough. However, the VM cannot be migrated in online mode and must be switched off.

Is this generally not possible or am I doing something wrong?

Thank you and best regards
 
I'm almost sure that migration of a running VM with PCIe passthrough is not possible: KVM/QEMU would either need to save the GPU state and recreate it on the other node (highly device-specific) or forward PCIe commands between nodes (highly unlikely).
 
Ohh, all right. Too bad but thank you for the explanation! BR
 
Hello, for cases like this where PCIe passthrough is being used, is there a possibility that PVE could shut down the VM first and then migrate it to the other, identical host? For example, in a scenario where the hosting node goes down unexpectedly. Thanks
 
How certain are you that the other host is 100% identical? Exact same devices in every PCIe slot? Same firmware versions? (I can kinda guarantee the S/N's are different but that is hopefully never an issue.)

You can certainly come close even if PVE won't directly do it:
  • Shut down VM
  • Edit the VM *.conf file to comment out the mapped device
  • Migrate to new host
  • Edit to uncomment (i.e. reactivate)
  • Off you go ;)
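The steps above can be sketched as follows. The VMID 100, the mapping name gpu0, and the target node name are hypothetical; on a real host the config lives at /etc/pve/qemu-server/100.conf, but here a local copy is used so the edit itself can be shown safely (the qm commands are left as comments):

```shell
# Local stand-in for /etc/pve/qemu-server/100.conf (VMID 100 is hypothetical)
CONF=100.conf
printf 'hostpci0: mapping=gpu0,pcie=1\nmemory: 8192\n' > "$CONF"

# qm shutdown 100                       # step 1: stop the VM
sed -i 's/^hostpci/#hostpci/' "$CONF"   # step 2: comment out the mapped device
grep '^#hostpci' "$CONF"                # prints: #hostpci0: mapping=gpu0,pcie=1
# qm migrate 100 other-node             # step 3: offline migration now succeeds
sed -i 's/^#hostpci/hostpci/' "$CONF"   # step 4: reactivate the device
# qm start 100
```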
 
Exact same devices in every PCIe slot?
That problem can be solved with mapped devices, if the same devices are used. No need to edit the VM configuration.
Same firmware versions? (I can kinda guarantee the S/N's are different but that is hopefully never an issue.)
If the VM is shut down and restarted on another node (with mapped devices), and the devices are identical or similar enough, then it should work in principle, as no active state needs to be transferred.
 
That problem can be solved with mapped devices, if the same devices are used. No need to edit the VM configuration.
We're talking PCIe passthrough here. The VM configuration includes the exact PCIe slot. Here's what the *.conf line looks like for an NVIDIA GPU... with the final .* function suffix removed so both the GPU and audio devices are passed through (required for the NVIDIA driver/installer to recognize the card):
hostpci0: 0000:01:00,pcie=1
 
We're talking PCIe Pass Through here.
I think this thread is talking about mapped PCIe devices. With those, the passed-through device is configured in the .conf file as hostpci0: mapping=map-name,pcie=1

I'm almost sure that migration of a running VM with PCIe passthrough is not possible
So basically the workaround is to migrate only when the VM is offline. I can do this manually of course, but for VMs managed by HA, it will always do a live migration. Is there a way to configure HA to use offline migration for this VM?

EDIT: submitted a request related to this https://bugzilla.proxmox.com/show_bug.cgi?id=6253
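One thing that may be worth testing for the HA case (service ID and node name below are hypothetical): ha-manager has a relocate command which, unlike migrate, stops the service and restarts it on the target node, i.e. an offline move rather than a live migration:

```shell
# Hypothetical service ID (vm:100) and target node (pve2); run on a cluster node.
# "relocate" stops the VM, moves it, and starts it again on the target,
# avoiding the live-migration path that fails with passthrough devices.
ha-manager relocate vm:100 pve2
```

Whether the HA stack will choose this automatically on node failure is a separate question; the linked bug report covers that.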
 
Just found out about mapped resources today and switched a bunch of things over to functions on my network cards (I used to just use them for VMs that were anchored to a box). I too would love to see this working better with HA... it still seems to be a trade-off between HA and hardware passthrough, the same as when I use devices directly instead of mapped ones :-(
 