Nvidia vGPU mdev and live migration

Hello to all

Has anybody managed to enable VFIO live migration with the 535.161.05 driver?

I have tried placing both the old flag (NV_KVM_MIGRATION_UAPI=1) and the new flag (NV_VFIO_DEVICE_MIG_STATE_PRESENT=1) in the following files before installing and running the DKMS build (a rough sketch of what I did follows below):

:: kernel/nvidia-vgpu-vfio/nvidia-vgpu-vfio.Kbuild
:: kernel/conftest.sh

but with no luck.
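For the Kbuild side, this is roughly what I did; whether forcing the two defines as ccflags like this is the right way to override the conftest result is part of my question:

Code:
# assumed way of forcing the two migration defines into the module build
echo 'ccflags-y += -DNV_KVM_MIGRATION_UAPI=1' >> kernel/nvidia-vgpu-vfio/nvidia-vgpu-vfio.Kbuild
echo 'ccflags-y += -DNV_VFIO_DEVICE_MIG_STATE_PRESENT=1' >> kernel/nvidia-vgpu-vfio/nvidia-vgpu-vfio.Kbuild
# rebuild and reinstall the module through DKMS
dkms remove nvidia/535.161.05 --all
dkms install nvidia/535.161.05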

I have also read that /etc/vgpu_unlock/config.toml should contain the value unlock_migration = true.
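In case it matters, I simply appended that line to the config (assuming the file accepts a plain key = value entry at the top level, which I haven't verified):

Code:
# assumed top-level toggle in the vgpu_unlock config
echo 'unlock_migration = true' >> /etc/vgpu_unlock/config.toml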

I'm using the 5.15.149-1-pve kernel and Proxmox VE 7.x together with Tesla P4 cards.


Thanks,
Alex
 
i don't think that will work, since AFAIR the right kernel api was introduced with a later kernel

you could try with the 6.2 opt-in kernel for pve 7.x
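something along these lines should do it (package name from memory, double-check it before relying on it):

Code:
# install the 6.2 opt-in kernel on pve 7.x and reboot into it
apt update
apt install pve-kernel-6.2
reboot
# afterwards confirm which kernel is running
uname -r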
 
Thanks for the tip, I'll give it a try and post back later this week with the outcome.

Any other clues on which versions prior to 535.161.05 might do the migration trick, before I swap off the 5.15 kernel branch?
 
Any other clues on which versions prior to 535.161.05 might do the migration trick, before I swap off the 5.15 kernel branch?
sorry no clue, i just think i remember that the required kernel changes are not in 5.15 yet, only in later kernel versions
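if you want a quick (unverified) way to check whether a given kernel already ships the v2 migration uAPI, grepping the vfio uapi header for the feature constant should tell you:

Code:
# the v2 protocol added VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE to the vfio uapi
# (iirc with kernel 5.18) - no match means the kernel is too old
grep -R "VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE" /usr/src/linux-headers-$(uname -r)/include/uapi/linux/vfio.h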
 
Hello,

@dcsapak - Thank you for the tip on switching the kernel. I moved to the 6.2.11-2-pve kernel, rebuilt the 535.161.05 driver with DKMS, applied the unlock patch and got back to testing.

After the reboot I can see that dmesg shows:

Code:
[nvidia-vgpu-vfio] 00000000-0000-0000-0000-000000008888: vGPU migration enabled with upstream V2 migration protocol

Still, I am under the impression that I am missing something. Going back to @spirit's initial testing and posts in this thread, I applied the suggested changes to both /usr/share/perl5/PVE/QemuServer/PCI.pm and /usr/share/perl5/PVE/QemuMigrate.pm, but I am still bumping into:

Code:
migrate uri => unix:/run/qemu-server/8888.migrate failed: VM 8888 qmp command 'migrate' failed - VFIO device doesn't support migration

Code:
2024-05-03 09:29:41 use dedicated network address for sending migration traffic (172.21.12.5)
2024-05-03 09:29:41 starting migration of VM 8888 to node 'testpve02' (172.21.12.5)
2024-05-03 09:29:41 starting VM 8888 on remote node 'testpve02'
2024-05-03 09:29:44 [testpve02] kvm: -device vfio-pci,x-enable-migration=on,sysfsdev=/sys/bus/mdev/devices/00000000-0000-0000-0000-000000008888,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0: warning: vfio 00000000-0000-0000-0000-000000008888: Could not enable error recovery for the device
2024-05-03 09:29:45 start remote tunnel
2024-05-03 09:29:46 ssh tunnel ver 1
2024-05-03 09:29:46 starting online/live migration on unix:/run/qemu-server/8888.migrate
2024-05-03 09:29:46 set migration capabilities
2024-05-03 09:29:46 migration speed limit: 600.0 MiB/s
2024-05-03 09:29:46 migration downtime limit: 100 ms
2024-05-03 09:29:46 migration cachesize: 2.0 GiB
2024-05-03 09:29:46 set migration parameters
2024-05-03 09:29:46 start migrate command to unix:/run/qemu-server/8888.migrate
2024-05-03 09:29:46 migrate uri => unix:/run/qemu-server/8888.migrate failed: VM 8888 qmp command 'migrate' failed - VFIO device doesn't support migration
2024-05-03 09:29:47 ERROR: online migrate failure - VM 8888 qmp command 'migrate' failed - VFIO device doesn't support migration
2024-05-03 09:29:47 aborting phase 2 - cleanup resources
2024-05-03 09:29:47 migrate_cancel
2024-05-03 09:30:01 ERROR: migration finished with problems (duration 00:00:21)
migration problems

Now, judging from spirit's post, his initial issue was related to the VFIO migration build option of the Nvidia driver, but that can't be the cause in my case, since I rebuilt the driver with the NV_VFIO_DEVICE_MIG_STATE_PRESENT flag and dmesg states that both driver and kernel support it, so I'm going around in circles right now.
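To rule out the obvious I have been running these sanity checks on both nodes (the QEMU property listing is my own assumption - I'm not sure which property name this QEMU build actually exposes for VFIO migration):

Code:
# confirm the vGPU manager / vfio module report migration support
dmesg | grep -i "vGPU migration"
# list the vfio-pci properties this QEMU build knows about; depending on the
# version the migration knob may show up as x-enable-migration or enable-migration
/usr/bin/qemu-system-x86_64 -device vfio-pci,help | grep -i migration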

Searching for the "online migrate failure" notice, I found in /usr/share/perl5/PVE/AbstractMigrate.pm, around line 190, the comment "vm is now owned by other side", which leads me to understand that the remote node is not picking up the migration.

Both nodes are built identically, mirrored in both hardware and software, with P4s in each. I have attached the logs from the destination node.

I've been re-reading this thread and testing all over again, trying to spot whether I'm missing any important detail of this process, but right now any suggestion would be appreciated.

Thanks,
Alex
Code:
==> /var/log/syslog <==
May  3 09:39:55 testpve02 pmxcfs[1335]: [status] notice: received log
May  3 09:39:55 testpve02 systemd[1]: Started Session 210 of user root.


May  3 09:39:56 testpve02 systemd[1]: session-210.scope: Succeeded.
May  3 09:39:56 testpve02 systemd[1]: Started Session 211 of user root.
May  3 09:39:56 testpve02 systemd[1]: session-211.scope: Succeeded.
May  3 09:39:56 testpve02 systemd[1]: Started Session 212 of user root.
May  3 09:39:56 testpve02 systemd[1]: session-212.scope: Succeeded.
May  3 09:39:57 testpve02 systemd[1]: Started Session 213 of user root.
May  3 09:39:58 testpve02 qm[35855]: <root@pam> starting task UPID:testpve02:00008C18:000B6B86:663486BE:qmstart:8888:root@pam:
May  3 09:39:58 testpve02 qm[35864]: start VM 8888: UPID:testpve02:00008C18:000B6B86:663486BE:qmstart:8888:root@pam:
May  3 09:39:58 testpve02 kernel: [ 7484.235534] nvidia-vgpu-vfio 00000000-0000-0000-0000-000000008888: Adding to iommu group 52

==> /var/log/kern.log <==
May  3 09:39:58 testpve02 kernel: [ 7484.235534] nvidia-vgpu-vfio 00000000-0000-0000-0000-000000008888: Adding to iommu group 52

==> /var/log/syslog <==
May  3 09:39:58 testpve02 systemd[1]: Started 8888.scope.
May  3 09:39:58 testpve02 systemd-udevd[35868]: Using default interface naming scheme 'v247'.
May  3 09:39:58 testpve02 systemd-udevd[35868]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
May  3 09:39:59 testpve02 kernel: [ 7485.692641] device tap8888i0 entered promiscuous mode

==> /var/log/kern.log <==
May  3 09:39:59 testpve02 kernel: [ 7485.692641] device tap8888i0 entered promiscuous mode

==> /var/log/syslog <==
May  3 09:39:59 testpve02 ovs-vsctl: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl del-port tap8888i0
May  3 09:39:59 testpve02 ovs-vsctl: ovs|00002|db_ctl_base|ERR|no port named tap8888i0
May  3 09:39:59 testpve02 ovs-vsctl: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl del-port fwln8888i0
May  3 09:39:59 testpve02 ovs-vsctl: ovs|00002|db_ctl_base|ERR|no port named fwln8888i0
May  3 09:39:59 testpve02 ovs-vsctl: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl -- add-port vmbr0 tap8888i0 tag=3204 -- set Interface tap8888i0 mtu_request=9000
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[816]: Nv0000CtrlVgpuGetStartDataParams {#012    mdev_uuid: {00000000-0000-0000-0000-000000008888},#012    config_params: "vgpu_type_id=62",#012    qemu_pid: 35877,#012    gpu_pci_id: 0x800,#012    vgpu_id: 2,#012    gpu_pci_bdf: 2048,#012}
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: notice: vmiop_env_log: vmiop-env: guest_max_gpfn:0x0
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: notice: vmiop_env_log: (0x0): Received start call from nvidia-vgpu-vfio module: mdev uuid 00000000-0000-0000-0000-000000008888 GPU PCI id 00:08:00.0 config params vgpu_type_id=62
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: notice: vmiop_env_log: (0x0): pluginconfig: vgpu_type_id=62
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: notice: vmiop_env_log: Successfully updated env symbols!
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: NvA081CtrlVgpuConfigGetVgpuTypeInfoParams {#012    vgpu_type: 62,#012    vgpu_type_info: NvA081CtrlVgpuInfo {#012        vgpu_type: 62,#012        vgpu_name: "GRID P40-1B",#012        vgpu_class: "NVS",#012        vgpu_signature: [],#012        license: "GRID-Virtual-PC,2.0;Quadro-Virtual-DWS,5.0;GRID-Virtual-WS,2.0;GRID-Virtual-WS-Ext,2.0",#012        max_instance: 24,#012        num_heads: 4,#012        max_resolution_x: 5120,#012        max_resolution_y: 2880,#012        max_pixels: 16384000,#012        frl_config: 45,#012        cuda_enabled: 0,#012        ecc_supported: 0,#012        gpu_instance_size: 0,#012        multi_vgpu_supported: 0,#012        vdev_id: 0x1b3811e7,#012        pdev_id: 0x1b38,#012        profile_size: 0x40000000,#012        fb_length: 0x38000000,#012        gsp_heap_size: 0x0,#012        fb_reservation: 0x8000000,#012        mappable_video_size: 0x400000,#012        encoder_capacity: 0x64,#012        bar1_length: 0x100,#012        frl_enable: 1,#012        adapter_name: "GRID P40-1B",#012        adapter_name_unicode: "GRID P40-1B",#012        short_gpu_name_string: "GP104GL-A",#012        licensed_product_name: "NVIDIA Virtual PC",#012        vgpu_extra_params: "",#012        ftrace_enable: 0,#012        gpu_direct_supported: 0,#012        nvlink_p2p_supported: 0,#012        multi_vgpu_exclusive: 0,#012        exclusive_type: 0,#012        exclusive_size: 1,#012        gpu_instance_profile_id: 4294967295,#012    },#012}
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: Applying profile nvidia-62 overrides
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: Patching nvidia-62/num_heads: 4 -> 1
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: Patching nvidia-62/max_resolution_x: 5120 -> 1600
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: Patching nvidia-62/max_resolution_y: 2880 -> 1200
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: Patching nvidia-62/max_pixels: 16384000 -> 1920000
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: Patching nvidia-62/cuda_enabled: 0 -> 1
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: Patching nvidia-62/fb_length: 939524096 -> 436207616
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: Patching nvidia-62/fb_reservation: 134217728 -> 100663296
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: Patching nvidia-62/frl_enable: 1 -> 1
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: cmd: 0xa0810115 failed.
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: notice: vmiop_log: (0x0): Setting mappable_cpu_host_aperture to 10M
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: notice: vmiop_log: (0x0): gpu-pci-id : 0x800
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: notice: vmiop_log: (0x0): vgpu_type : NVS
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: notice: vmiop_log: (0x0): Framebuffer: 0x1a000000
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: notice: vmiop_log: (0x0): Virtual Device Id: 0x1b38:0x11e7
May  3 09:40:00 testpve02 kernel: [ 7485.861443] NVRM: Software scheduler timeslice set to 1041uS.

==> /var/log/kern.log <==
May  3 09:40:00 testpve02 kernel: [ 7485.861443] NVRM: Software scheduler timeslice set to 1041uS.

==> /var/log/syslog <==
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: notice: vmiop_log: (0x0): FRL Value: 45 FPS
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: notice: vmiop_log: ######## vGPU Manager Information: ########
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: notice: vmiop_log: Driver Version: 535.161.05
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: notice: vmiop_log: (0x0): Detected ECC enabled on physical GPU.
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: notice: vmiop_log: (0x0): Guest usable FB size is reduced due to ECC.
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: notice: vmiop_log: (0x0): This vGPU type does not support ECC.
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: notice: vmiop_log: (0x0): vGPU supported range: (0x70001, 0x120001)
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: notice: vmiop_log: (0x0): Init frame copy engine: syncing...
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: notice: vmiop_log: (0x0): vGPU migration enabled
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: notice: vmiop_log: (0x0): vGPU manager is running in non-SRIOV mode.
May  3 09:40:00 testpve02 nvidia-vgpu-mgr[35910]: notice: vmiop_log: display_init inst: 0 successful
May  3 09:40:00 testpve02 kernel: [ 7485.929581] [nvidia-vgpu-vfio] 00000000-0000-0000-0000-000000008888: vGPU migration enabled with upstream V2 migration protocol

==> /var/log/kern.log <==
May  3 09:40:00 testpve02 kernel: [ 7485.929581] [nvidia-vgpu-vfio] 00000000-0000-0000-0000-000000008888: vGPU migration enabled with upstream V2 migration protocol

==> /var/log/syslog <==
May  3 09:40:00 testpve02 qm[35855]: <root@pam> end task UPID:testpve02:00008C18:000B6B86:663486BE:qmstart:8888:root@pam: OK
May  3 09:40:00 testpve02 systemd[1]: session-213.scope: Succeeded.
May  3 09:40:00 testpve02 systemd[1]: session-213.scope: Consumed 1.485s CPU time.
May  3 09:40:00 testpve02 systemd[1]: Started Session 214 of user root.
May  3 09:40:01 testpve02 CRON[35926]: (root) CMD (csync2 -x &>/dev/null)
May  3 09:40:03 testpve02 systemd[1]: Started Session 216 of user root.
May  3 09:40:05 testpve02 qm[35954]: <root@pam> starting task UPID:testpve02:00008C74:000B6E39:663486C5:qmstop:8888:root@pam:
May  3 09:40:05 testpve02 qm[35956]: stop VM 8888: UPID:testpve02:00008C74:000B6E39:663486C5:qmstop:8888:root@pam:
May  3 09:40:05 testpve02 QEMU[35877]: kvm: terminating on signal 15 from pid 35956 (task UPID:testpve02:00008C74:000B6E39:663486C5:qmstop:8888:root@pam:)
May  3 09:40:05 testpve02 ovs-vsctl: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl del-port fwln8888i0
May  3 09:40:05 testpve02 ovs-vsctl: ovs|00002|db_ctl_base|ERR|no port named fwln8888i0
May  3 09:40:05 testpve02 ovs-vsctl: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl del-port tap8888i0
May  3 09:40:05 testpve02 qmeventd[804]: read: Connection reset by peer
May  3 09:40:05 testpve02 systemd[1]: 8888.scope: Succeeded.
May  3 09:40:05 testpve02 systemd[1]: 8888.scope: Consumed 1.813s CPU time.
May  3 09:40:15 testpve02 kernel: [ 7501.095405] nvidia-vgpu-vfio 00000000-0000-0000-0000-000000008888: Removing from iommu group 52

==> /var/log/kern.log <==
May  3 09:40:15 testpve02 kernel: [ 7501.095405] nvidia-vgpu-vfio 00000000-0000-0000-0000-000000008888: Removing from iommu group 52

==> /var/log/syslog <==
May  3 09:40:15 testpve02 qm[35954]: <root@pam> end task UPID:testpve02:00008C74:000B6E39:663486C5:qmstop:8888:root@pam: OK
May  3 09:40:15 testpve02 systemd[1]: session-216.scope: Succeeded.
May  3 09:40:15 testpve02 systemd[1]: session-216.scope: Consumed 1.587s CPU time.
May  3 09:40:15 testpve02 systemd[1]: session-214.scope: Succeeded.
May  3 09:40:15 testpve02 systemd[1]: session-214.scope: Consumed 1.680s CPU time.
May  3 09:40:16 testpve02 systemd[1]: Started Session 217 of user root.
May  3 09:40:16 testpve02 systemd[1]: session-217.scope: Succeeded.
May  3 09:40:16 testpve02 pmxcfs[1335]: [status] notice: received log
 
this error
2024-05-03 09:29:47 ERROR: online migrate failure - VM 8888 qmp command 'migrate' failed - VFIO device doesn't support migration
especially this part `VFIO device doesn't support migration` comes from QEMU itself. I don't know why it says the device does not support migration, but maybe it has something to do with:
applied the unlock patch
?
what exactly do you mean by this? (since the tesla p4 is officially supported, there shouldn't be any need to patch the driver, if that's what you're doing here)
 
Indeed, the P4 is supported, as I had already found in the official docs.

I have also tried both the non-patched and the patched (polloloco) builds of the 535.161.05 base GPU driver, uninstalling the drivers cleanly before each swap, but the outcome is the same.
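For completeness, this is roughly how I swap between the builds (the uninstall helper and installer flags are the standard ones the .run package ships on my side; if your install is DKMS-only, the dkms step alone may be enough):

Code:
# remove the currently installed vGPU host driver and its DKMS module
nvidia-uninstall
dkms remove nvidia/535.161.05 --all
# then reinstall the other build and let it register with DKMS again
./NVIDIA-Linux-x86_64-535.161.05-vgpu-kvm.run --dkms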

I can see some details in the logs (both on the source when the VM starts and on the destination when the migration starts, before it fails) showing the following:

Code:
May  3 15:42:08 testpve02 nvidia-vgpu-mgr[3251]: notice: vmiop_log: (0x0): Setting mappable_cpu_host_aperture to 10M
May  3 15:42:08 testpve02 nvidia-vgpu-mgr[3251]: notice: vmiop_log: (0x0): gpu-pci-id : 0x800
May  3 15:42:08 testpve02 nvidia-vgpu-mgr[3251]: notice: vmiop_log: (0x0): vgpu_type : NVS
May  3 15:42:08 testpve02 nvidia-vgpu-mgr[3251]: notice: vmiop_log: (0x0): Framebuffer: 0x1a000000
May  3 15:42:08 testpve02 nvidia-vgpu-mgr[3251]: notice: vmiop_log: (0x0): Virtual Device Id: 0x1b38:0x11e7
May  3 15:42:08 testpve02 nvidia-vgpu-mgr[3251]: notice: vmiop_log: (0x0): FRL Value: 45 FPS
May  3 15:42:08 testpve02 nvidia-vgpu-mgr[3251]: notice: vmiop_log: ######## vGPU Manager Information: ########
May  3 15:42:08 testpve02 nvidia-vgpu-mgr[3251]: notice: vmiop_log: Driver Version: 535.161.05
May  3 15:42:08 testpve02 nvidia-vgpu-mgr[3251]: notice: vmiop_log: (0x0): Detected ECC enabled on physical GPU.
May  3 15:42:08 testpve02 nvidia-vgpu-mgr[3251]: notice: vmiop_log: (0x0): Guest usable FB size is reduced due to ECC.
May  3 15:42:08 testpve02 nvidia-vgpu-mgr[3251]: notice: vmiop_log: (0x0): This vGPU type does not support ECC.
May  3 15:42:08 testpve02 nvidia-vgpu-mgr[3251]: notice: vmiop_log: (0x0): vGPU supported range: (0x70001, 0x120001)
May  3 15:42:08 testpve02 nvidia-vgpu-mgr[3251]: notice: vmiop_log: (0x0): Init frame copy engine: syncing...
May  3 15:42:08 testpve02 nvidia-vgpu-mgr[3251]: notice: vmiop_log: (0x0): vGPU migration enabled
May  3 15:42:08 testpve02 nvidia-vgpu-mgr[3251]: notice: vmiop_log: (0x0): vGPU manager is running in non-SRIOV mode.
May  3 15:42:08 testpve02 nvidia-vgpu-mgr[3251]: notice: vmiop_log: display_init inst: 0 successful

I will dig into it further; maybe a new idea pops up.


BR,
Alex
 
you could use git and build the packages yourself, but there's no other way currently
i'll see that i ping the patches on the list, so that someone will review them and they (hopefully) can land in our codebase soon
 
Maybe something changed in the meantime, because I didn't have to make any modifications to "/usr/share/perl5/PVE/QemuServer/PCI.pm" or "/usr/share/perl5/PVE/QemuMigrate.pm" to online-migrate a VM with a vGPU. I just tested live migration with an Nvidia A40 and it works fine for "Raw Device".
You can't do it through the GUI, but you can use the command qm migrate <vmid> --force. Funny that "--force" doesn't help with "Mapped Device", which still fails with the "can't migrate running VM which uses mapped devices" error. Wouldn't it make sense for "--force" to also allow migration of VMs that use mapped devices?
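For reference, the full invocation looks roughly like this (node name and VMID are just the placeholders from this thread):

Code:
# live-migrate VM 8888 to node testpve02, forcing it despite the raw hostpci device
qm migrate 8888 testpve02 --online --force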
 