Nvidia vGPU mdev and live migration

lclements0

Oct 12, 2021
Long-time Proxmox user, but first-time graphics card virtualizer.

I've got a couple of Nvidia V100S 32G cards split between two servers. Nvidia drivers installed as per instructions on the host, and the vGPU itself works fine on the VM with GRID drivers. Licensing also works, as expected.

As there are a number of components involved here, I imagine the answer is probably rather complex, but is Nvidia vGPU live migration possible between two matching Proxmox nodes with a mediated device? Am I missing some documentation that demonstrates how to live migrate a VM with a mediated device attached? When I attempt a migration, it errors out because there is a hostpci device attached, which I assume is expected.

Any help would be appreciated!
 
No, currently that is not possible. Aside from QEMU upstream work that is not done yet, I think the hardware must also be capable of this.
The reason is that on live migration you need to carry over the internal state of the hardware, which is trivial for virtualized hardware, but for 'real' hardware saving/restoring that state must be implemented somehow.

What should be possible, though, is HA recovery, since that simply starts a new VM on the target node instead of live-migrating it (you did not really ask for this, but I wanted to clarify, since live migration and HA recovery sometimes get mixed up).
 
Thanks for the reply, @dcsapak. Appreciate knowing where things stand. My understanding is that the Nvidia cards, at least the enterprise ones, do support some form of mediated device live migration. Both VMware and XenServer carry some form of live migration support, though certainly that entails some additional work on the KVM side that I'm completely unfamiliar with.

When you say HA-Recovery, I assume you mean assigning the VM to a target HA group that contains servers that have these cards in them?

It appears as though virtualizing cards is becoming more and more popular, based on post volume in these forums alone. Is live migration for Nvidia cards something that's even on the roadmap at this point, to look at what would be required, or has this been ignored for the time being?

Thanks again.
 
Both VMware and XenServer carry some form of live migration support,
OK, AFAICS Nvidia has some sort of support for that on VMware/Xen, so the hardware must support those things (though I could not determine which cards exactly, all of them?).
On the QEMU side there is some implementation, but it's still marked as experimental.

Sadly, currently only those Nvidia cards support this, and we don't have any here (also, the licensing scheme is very expensive, IMHO).
Hopefully Intel's new discrete GPUs this year bring some competition to this market, but who knows...

When you say HA-Recovery, I assume you mean assigning the VM to a target HA group that contains servers that have these cards in them?
Yes, I mean assigning such a VM to an HA group; when a host is fenced, it gets restarted on another node.
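For reference, on the CLI that would look roughly like this (group name, node names and VMID here are just placeholders, adjust to your setup):

Code:
# create an HA group restricted to the nodes that actually have the cards
ha-manager groupadd vgpu-nodes --nodes "node1,node2" --restricted 1

# add the VM as an HA resource and pin it to that group
ha-manager add vm:100 --group vgpu-nodes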

It appears as though virtualizing cards is becoming more and more popular, based on post volume in these forums alone. Is live migration for Nvidia cards something that's even on the roadmap at this point, to look at what would be required, or has this been ignored for the time being?
Until recently (QEMU's first rough implementation landed end of 2020/beginning of 2021) this was not possible, and having only Nvidia and their licensing scheme does not make it easier to test this... but yes, this is something we're generally interested in, and it seems there is some movement on the QEMU side, so we'll check it out sooner or later.
 
Hi,

I have a contact doing live migration with Nvidia passthrough and oVirt.

I'm currently looking into it with him; I think it's using mdev too.

Looking at qemu-devel, it seems there is a flag to enable migration on VFIO (not sure if it's already enabled by default):
https://lists.gnu.org/archive/html/qemu-devel/2021-03/msg02364.html

Also, here is a nice tutorial to get mdev working on consumer Nvidia cards too :)
https://wvthoog.nl/proxmox-7-vgpu-v2/

I'll try to have more info this week about the libvirt config and the setup, to be sure.
 
I have done tests on a cluster; I can't get it working, even with x-enable-migration=on.

if you want to test:

edit
/usr/share/perl5/PVE/QemuServer/PCI.pm

line 475:

Code:
            if ($sysfspath) {
                $devicestr .= ",x-enable-migration=on,sysfsdev=$sysfspath";

comment out the following block in
/usr/share/perl5/PVE/QemuMigrate.pm

Code:
#    my $loc_res = PVE::QemuServer::check_local_resources($conf, 1);
#    if (scalar @$loc_res) {
#       if ($self->{running} || !$self->{opts}->{force}) {
#           die "can't migrate VM which uses local devices: " . join(", ", @$loc_res) . "\n";
#       } else {
#           $self->log('info', "migrating VM which uses local devices");
#       }
#    }

on both Proxmox servers, and
systemctl restart pvedaemon


then do the migration:

qm migrate <vmid> <targetnode> --online

....

2022-06-17 16:37:26 migrate uri => unix:/run/qemu-server/102.migrate failed: VM 102 qmp command 'migrate' failed - VFIO device doesn't support migration




I have looked at the QEMU code; the error message can have multiple causes, not only the missing flag on the QEMU side.
Kernel VFIO migration with dirty-page tracking needs to be implemented (it seems to be OK since kernel 5.9).

So maybe it's an Nvidia driver bug, I really don't know. (I have tested with a Quadro with the vgpu unlock hack.)
 
Hi, I finally got it working! The main problem was a missing build option in the Nvidia VFIO kernel driver to enable live migration.
Great to see... which option was missing? (I don't seem to find an obvious one when I check the installer '--help' and '-A' pages...)
 
I have already replied on the pve-devel mailing list, but for forum users: if you want to enable VFIO migration in the Nvidia driver,

"
I have used the 460.73.01 driver (the last 510 driver doesn't have the flag and code, I don't know why):
https://github.com/mbilker/vgpu_unlock-rs/issues/15


The flag is NV_KVM_MIGRATION_UAP=1.
As I didn't know how to pass the flag,

I simply extracted the driver with
"NVIDIA-Linux-x86_64-460.73.01-grid-vgpu-kvm-v5.run -x",
edited "kernel/nvidia-vgpu-vfio/nvidia-vgpu-vfio.Kbuild" to add
NV_KVM_MIGRATION_UAP=1,

then ran ./nvidia-installer
"
 
I have used the 460.73.01 driver (the last 510 driver doesn't have the flag and code, I don't know why):
https://github.com/mbilker/vgpu_unlock-rs/issues/15


The flag is NV_KVM_MIGRATION_UAP=1.
Just FYI, the flag still seems to exist in the second-newest drivers (what Nvidia calls 'GRID 14.1', which corresponds to 510.73).

Also, one can edit it after installation in the DKMS source folder under /usr/src/nvidia-<version>/.
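Roughly like this, if you want to go the DKMS route (untested sketch; the module name and version are whatever 'dkms status' reports on your host):

Code:
# check which nvidia module/version dkms manages on this host
dkms status

# locate the Kbuild inside the dkms source tree
# (the exact layout can differ between driver versions)
find /usr/src/nvidia-<version>/ -name 'nvidia-vgpu-vfio.Kbuild'

# after adding NV_KVM_MIGRATION_UAP=1 there, rebuild and reinstall the module
dkms build nvidia/<version> -k $(uname -r)
dkms install nvidia/<version> -k $(uname -r) --force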
 
Just FYI, the flag still seems to exist in the second-newest drivers (what Nvidia calls 'GRID 14.1', which corresponds to 510.73).

Also, one can edit it after installation in the DKMS source folder under /usr/src/nvidia-<version>/.

I'm not sure, but maybe simply adding
"vgpu_dev->migration_enabled = NV_TRUE;" in the code should be enough?

The "if defined(NV_KVM_MIGRATION_UAPI)" only adds this.

Also, looking at x86_64-510.47.03-vgpu-kvm, the flag option is still present.
 
"vgpu_dev->migration_enabled = NV_TRUE;" in the code should be enough ?
i don't think its enough, there are more migration related parts gated with the _START_PFN macro (which was previously gated only by a MIGRATION_INFO_PRESENT macro)
 
I don't think it's enough; there are more migration-related parts gated by the _START_PFN macro (which was previously gated only by a MIGRATION_INFO_PRESENT macro).
Yes, that doesn't work, and START_PFN is really for old kernels (it checks the kernel source for the presence of a specific VFIO structure).

The last driver I have found with UAPI is 510.73.06:

https://github.com/VGPU-Community-D....1/NVIDIA-Linux-x86_64-510.73.06-vgpu-kvm.run

Driver 510.85.03 seems to miss it (maybe it's just a bug and they forgot to add it?):

https://github.com/VGPU-Community-D....2/NVIDIA-Linux-x86_64-510.85.03-vgpu-kvm.run
 
Yes, that doesn't work, and START_PFN is really for old kernels (it checks the kernel source for the presence of a specific VFIO structure).

The last driver I have found with UAPI is 510.73.06:

https://github.com/VGPU-Community-D....1/NVIDIA-Linux-x86_64-510.73.06-vgpu-kvm.run

Driver 510.85.03 seems to miss it (maybe it's just a bug and they forgot to add it?):

https://github.com/VGPU-Community-D....2/NVIDIA-Linux-x86_64-510.85.03-vgpu-kvm.run

Hi, did you finally achieve GPU migration? Does the GPU work well on the destination VM? I saw a discussion on the Nvidia developer forums that migrating a GPU with QEMU seems to require a kernel-specific KVM:
https://forums.developer.nvidia.com...gration-support-for-vfio-devices-patch/197820
 
Hi, did you finally achieve GPU migration? Does the GPU work well on the destination VM? I saw a discussion on the Nvidia developer forums that migrating a GPU with QEMU seems to require a kernel-specific KVM:
https://forums.developer.nvidia.com...gration-support-for-vfio-devices-patch/197820
Yes, I was able to do it last year with the 5.15 kernel and a specific Nvidia driver version (patched for mdev support, because Nvidia locks the drivers without a license).

I'll try to rework it for Proxmox 8, but I'm not sure that Nvidia supports kernel 6.2 yet.

I'll integrate it into the new device mapping.
 
Yes, I was able to do it last year with the 5.15 kernel and a specific Nvidia driver version (patched for mdev support, because Nvidia locks the drivers without a license).

I'll try to rework it for Proxmox 8, but I'm not sure that Nvidia supports kernel 6.2 yet.

I'll integrate it into the new device mapping.
Thanks for your reply. If I want to support GPU migration with qemu-6.2.0 and KVM, what do I need to do?
1. Find the right Linux kernel (such as 5.15)? My kernel is currently 5.10.0-60.18.0.50, but how do I know whether the kernel version supports it or not?
2. Find the specific Nvidia driver version, as said above.
3. Add the patch for mdev support? Is the patch you mentioned "qemu-5-0-unsupport-vgpu-migration-after-applying-add-migration-support-for-vfio-devices-patch"?
4. And then Nvidia is locking the driver; can a license unlock it? What else do I need to do, such as setting "NV_KVM_MIGRATION_UAP" and rebuilding?
Do you have anything to add? Thanks a lot!
 
Thanks for your reply. If I want to support GPU migration with qemu-6.2.0 and KVM, what do I need to do?
1. Find the right Linux kernel (such as 5.15)? My kernel is currently 5.10.0-60.18.0.50, but how do I know whether the kernel version supports it or not?
2. Find the specific Nvidia driver version, as said above.
3. Add the patch for mdev support? Is the patch you mentioned "qemu-5-0-unsupport-vgpu-migration-after-applying-add-migration-support-for-vfio-devices-patch"?
4. And then Nvidia is locking the driver; can a license unlock it? What else do I need to do, such as setting "NV_KVM_MIGRATION_UAP" and rebuilding?
Do you have anything to add? Thanks a lot!

1-3: To unlock it and add mdev support, try to follow this guide:
https://gitlab.com/polloloco/vgpu-proxmox

For 4: yes, it needs to be rebuilt with NV_KVM_MIGRATION_UAP=1 in the makefile. (Note that this flag was changing last year; Nvidia was making changes between versions, adding/removing/renaming it...)

Then you need to add a specific migration flag to the QEMU command line; see my earlier comment in this thread on how to hack the current Proxmox code:
https://forum.proxmox.com/threads/nvidia-vgpu-mdev-and-live-migration.106480/
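For reference, with that hack the mdev device ends up on the QEMU command line roughly like this (other properties such as id/bus are omitted here, and the mdev UUID is a placeholder):

Code:
-device vfio-pci,x-enable-migration=on,sysfsdev=/sys/bus/mdev/devices/<mdev-uuid>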
 
1-3: To unlock it and add mdev support, try to follow this guide:
https://gitlab.com/polloloco/vgpu-proxmox

For 4: yes, it needs to be rebuilt with NV_KVM_MIGRATION_UAP=1 in the makefile. (Note that this flag was changing last year; Nvidia was making changes between versions, adding/removing/renaming it...)

Then you need to add a specific migration flag to the QEMU command line; see my earlier comment in this thread on how to hack the current Proxmox code:
https://forum.proxmox.com/threads/nvidia-vgpu-mdev-and-live-migration.106480/
OK, thanks very much, I'll go try it.
 
@spirit AFAICS, with the v16 drivers from Nvidia and kernel 6.2, there should not be any changes to the drivers necessary (I did not get around to testing this yet, though); at least the driver does not print the 'disable migration' notice anymore...
 
@spirit AFAICS, with the v16 drivers from Nvidia and kernel 6.2, there should not be any changes to the drivers necessary (I did not get around to testing this yet, though); at least the driver does not print the 'disable migration' notice anymore...
Oh, great :)

I know somebody with a cluster with Tesla cards for testing. I'll try to do tests again next month.

@badji ping. Time to test GPU migration again ^_^
 
