Failed to destroy vGPU device.

OK, apparently the mdevs are also created during a backup. For example, I have a VM which never actually runs, but it has a vGPU assigned to it.
yes, that's how backup works for stopped VMs: it has to start a VM process in 'stop' mode (nothing is actually executed), which is then stopped again

All in all, maybe one of you could report this as a bug to NVIDIA?
Specifically this error:
i have error Failed to destroy vGPU,

when the mdev is already cleaned up.. maybe it's just an oversight on their part?
 
/sys/bus/pci/devices/0000:c2:00

https://www.youtube.com/watch?v=_36yNWw_07g&t=6s

I'm wondering if we could simply add an "if": check whether the device is an NVIDIA card and whether the linked VM is already in stop mode, and if so execute the code we commented out, otherwise skip that bit.

Would that be a good alternative to dealing with NVidia?
I think the issue is, as previously mentioned, that it depends on the driver version. But I am out of my depth here.

Interestingly I am getting the same issue on 14.4 now too.
 
Hi all.

I came back to this project after a while for another project.

I switched back to the NVIDIA 510.108.03 driver (vGPU software 14.4), and all the above errors are gone. Maybe it only crashes with vGPU software 15.

But I am getting one very strange error with a Windows VM. During the first boot, the VM's vGPU driver does not load, leading to a blue screen error; the driver only loads after the VM restarts again.



I'm wondering if it's due to the vGPU being started together with the VM, because the blue screen only occurs when I shut down the VM from within the OS or with the shutdown button in the Proxmox GUI.
 
I switched back to the NVIDIA 510.108.03 driver (vGPU software 14.4), and all the above errors are gone. But I am getting one very strange error with a Windows VM: during the first boot the vGPU driver does not load, leading to a blue screen error, and it only loads after the VM restarts again.
Did you roll back the driver in the VM to a 510 branch driver or the equivalent guest driver?
 
Did you remove the newer driver with DDU? I was having lots of issues until I ensured the newer driver was completely removed.
 
All in all, maybe one of you could report this as a bug to NVIDIA?

I wrote a mail to NVIDIA's enterprise support. I will report back if they answer me.


EDIT: Their answer is posted below.

Unfortunately KVM/Qemu (Proxmox VE) is not on the list of supported Linux KVM hypervisors for vGPU 15.x software: https://docs.nvidia.com/grid/15.0/g...x-kvm/index.html#hypervisor-software-versions

Besides generic drivers for Linux KVM, NVIDIA also has specific drivers for RHEL KVM and Ubuntu:

https://docs.nvidia.com/grid/15.0/g...l-kvm/index.html#hypervisor-software-versions

and

https://docs.nvidia.com/grid/15.0/g...buntu/index.html#hypervisor-software-versions


Please use one of the supported hypervisors.

Also, please note that if a specific release, even an update release, is not listed, it’s not supported.
 
But I am getting one very strange error with a Windows VM. During the first boot, the VM's vGPU driver does not load, leading to a blue screen error; the driver only loads after the VM restarts again.
I noticed the same on my end. Rebooting the VM does indeed clear the error. I was puzzled for a bit when cloning VMs, trying to figure out why they would all BSOD, until I realized that stopping and restarting them solved the issue.

Back on the original topic though, I'm still not sure what the best path is to have my Tesla card running in VMs without having to reboot the whole server on VM shutdown. I'm still wondering if what I asked here would make sense.
 
OK, after commenting out lines 6127 and 6128 (which are 6099 and 6100 in my local /usr/share/perl5/PVE/QemuServer.pm), changing the function to

Perl:
sub cleanup_pci_devices {
    my ($vmid, $conf) = @_;

    foreach my $key (keys %$conf) {
        next if $key !~ m/^hostpci(\d+)$/;
        my $hostpciindex = $1;
        my $uuid = PVE::SysFSTools::generate_mdev_uuid($vmid, $hostpciindex);
        my $d = parse_hostpci($conf->{$key});
        if ($d->{mdev}) {
            sleep(3); # also added by me!
           # NOTE: avoid PVE::SysFSTools::pci_cleanup_mdev_device as it requires PCI ID and we
           # don't want to break ABI just for this two liner
#           my $dev_sysfs_dir = "/sys/bus/mdev/devices/$uuid";
#           PVE::SysFSTools::file_write("$dev_sysfs_dir/remove", "1") if -e $dev_sysfs_dir;
        }
    }
    PVE::QemuServer::PCI::remove_pci_reservation($vmid);
}

the shutdown works as expected.

1. Start PVE-Host
2. Auto-Start Guests were started
3. cat /sys/bus/pci/devices/0000\:c2\:00.0/mdev_supported_types/nvidia-269/available_instances shows 2 (which is correct)
4. Shutdown guest via Proxmox-GUI
5. cat /sys/bus/pci/devices/0000\:c2\:00.0/mdev_supported_types/nvidia-269/available_instances shows 3 (which is correct, before we had 2)
6. Starting guest via Proxmox -GUI again
7. cat /sys/bus/pci/devices/0000\:c2\:00.0/mdev_supported_types/nvidia-269/available_instances shows 2 (which is correct, before we had 1)
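
Not part of the fix, just a convenience: a minimal Perl sketch for re-running the available_instances check from steps 3, 5 and 7. It assumes the sysfs layout shown above; the default PCI address is only the example card from this post.

Perl:
#!/usr/bin/perl
# Print available_instances for every mdev type of a given PCI device.
# Sketch only - adjust the PCI address to your own card.
use strict;
use warnings;

my $pcidev = shift // '0000:c2:00.0';
my $base   = "/sys/bus/pci/devices/$pcidev/mdev_supported_types";

for my $typedir (glob "$base/*") {
    my ($type) = $typedir =~ m|([^/]+)$|;
    open my $fh, '<', "$typedir/available_instances" or next;
    chomp(my $avail = <$fh>);
    close $fh;
    print "$type: $avail available instance(s)\n";
}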

Hi,

We have the same problem. I made some tests with the code modification, only adding the sleep(15), and it works well. The created mdev devices are destroyed when the machine shuts down (via GUI and SSH); when the VM is stopped from the GUI, the mdev devices are still there, and I then need to remove them with echo 1 > /sys/bus/mdev/devices/$UUID/remove. We haven't seen any more stack-trace errors in the dmesg logs.

/usr/share/perl5/PVE/QemuServer.pm
Perl:
sub cleanup_pci_devices {
    my ($vmid, $conf) = @_;

    sleep(15); # Added by me, credits to xenon96

    foreach my $key (keys %$conf) {
        next if $key !~ m/^hostpci(\d+)$/;
        my $hostpciindex = $1;
        my $uuid = PVE::SysFSTools::generate_mdev_uuid($vmid, $hostpciindex);
        my $d = parse_hostpci($conf->{$key});
        if ($d->{mdev}) {
           # sleep(5); # Modification by Xenon96
           # NOTE: avoid PVE::SysFSTools::pci_cleanup_mdev_device as it requires PCI ID and we
           # don't want to break ABI just for this two liner
           my $dev_sysfs_dir = "/sys/bus/mdev/devices/$uuid";
           PVE::SysFSTools::file_write("$dev_sysfs_dir/remove", "1") if -e $dev_sysfs_dir;
        }
    }
    PVE::QemuServer::PCI::remove_pci_reservation($vmid);
}

Thanks to all for the work!!!
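
In case it helps anyone doing the manual cleanup described above, here is a small Perl sketch that lists leftover mdevs whose UUID ends in a given VMID and prints the matching remove command. The assumption that PVE-generated mdev UUIDs end in the zero-padded VMID is based on the dmesg lines posted later in this thread, so please verify the UUIDs before removing anything.

Perl:
#!/usr/bin/perl
# List leftover mdevs for a VMID and print the manual cleanup command.
# Sketch only - it prints the commands instead of running them.
use strict;
use warnings;

my $vmid = shift or die "usage: $0 <vmid>\n";

for my $dir (glob '/sys/bus/mdev/devices/*') {
    my ($uuid) = $dir =~ m|([^/]+)$|;
    next unless $uuid =~ /\Q$vmid\E$/;   # crude match on the trailing VMID
    print "echo 1 > $dir/remove\n";      # review before running by hand
}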
 
@aslopez_irontec You mention "sleep(5)" in your comment but show "sleep(15)" in the code; did you go extra-cautious with 15 or is it a typo in your comment?

Thanks for sharing your results!
 
I am having the exact same issue here. Every time I shut down a VM with a vGPU attached to it, the vGPU process is still running according to nvidia-smi, and I see the same errors in dmesg as were posted in the OP. I'm running a currently fully up-to-date Proxmox 7.3-6 with NVIDIA vGPU driver version 15.0.

One thing that hasn't been mentioned so far: if I shut down or reboot a Proxmox node that failed to destroy the vGPU device without manually killing that vGPU process first, the node hits another kernel panic while shutting down, which prevents the shutdown from completing. I left it up for a bit over an hour to see if it would resolve itself and finish the reboot on its own, but I had to manually power cycle it. This can be avoided by running nvidia-smi on that node, getting the PID of the vGPU process, and killing it manually.
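
For what it's worth, a small sketch of that last manual step. It assumes the leftover vGPU workers show up as "C+G ... vgpu" rows in the plain nvidia-smi process table (as in the output posted further down the thread) and only prints the PIDs, so you can review them before killing anything.

Perl:
#!/usr/bin/perl
# Find leftover vGPU worker processes in the nvidia-smi process table.
# Sketch only - prints the PIDs instead of killing them.
use strict;
use warnings;

for my $line (qx(nvidia-smi)) {
    next unless $line =~ /\bvgpu\b/;        # only vGPU worker rows
    if ($line =~ /\b(\d{2,})\b\s+C\+G/) {   # PID is the number before "C+G"
        print "leftover vgpu process: PID $1 (kill it manually if needed)\n";
    }
}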

We have the same problem. I made some tests with the code modification, only adding the sleep(15), and it works well. The created mdev devices are destroyed when the machine shuts down (via GUI and SSH); when the VM is stopped from the GUI, the mdev devices are still there, and I then need to remove them with echo 1 > /sys/bus/mdev/devices/$UUID/remove. We haven't seen any more stack-trace errors in the dmesg logs.
(same patched cleanup_pci_devices in /usr/share/perl5/PVE/QemuServer.pm as posted above)

I can confirm that the fix proposed here does work on my system as well. After adding this line to /usr/share/perl5/PVE/QemuServer.pm, shutting down a VM from the PVE GUI properly terminates the vGPU process. I will update if I encounter any issues. Thanks to @xenon96 and @aslopez_irontec for the temporary fix!

Proxmox devs, if you need a tester for a patch I am happy to help.
 
Thanks @dcsapak. That's the "if" condition I was looking for. I merged your code into my instance and I'll report back if anything breaks, although I don't see why it would.

Thanks again!

---

For the record, here is what the function now looks like on my end:
Code:
sub cleanup_pci_devices {
    my ($vmid, $conf) = @_;

    foreach my $key (keys %$conf) {
        next if $key !~ m/^hostpci(\d+)$/;
        my $hostpciindex = $1;
        my $uuid = PVE::SysFSTools::generate_mdev_uuid($vmid, $hostpciindex);
        my $d = parse_hostpci($conf->{$key});
        if ($d->{mdev}) {
            # NOTE: avoid PVE::SysFSTools::pci_cleanup_mdev_device as it requires PCI ID and we
            # don't want to break ABI just for this two liner
            my $dev_sysfs_dir = "/sys/bus/mdev/devices/$uuid";
            # some nvidia vgpu driver versions want to clean the mdevs up themselves, and error
            # out when we do it first. so wait for 10 seconds and then try it
            my $pciid = $d->{pciid}->[0]->{id};
            my $info = PVE::SysFSTools::pci_device_info("$pciid");
            if ($info->{vendor} eq '10de') {
                sleep 10;
            }
            PVE::SysFSTools::file_write("$dev_sysfs_dir/remove", "1") if -e $dev_sysfs_dir;
        }
    }
    PVE::QemuServer::PCI::remove_pci_reservation($vmid);
}
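
If you want to double-check the vendor ID that this new branch keys on, sysfs exposes it directly. A quick sketch (the PCI address is just the example card from earlier in the thread; note that sysfs reports the ID with a 0x prefix, while the code above compares the bare '10de' value returned by PVE::SysFSTools::pci_device_info):

Perl:
#!/usr/bin/perl
# Read the PCI vendor ID of a device straight from sysfs.
# Sketch only - NVIDIA devices report 0x10de.
use strict;
use warnings;

my $pcidev = shift // '0000:c2:00.0';
open my $fh, '<', "/sys/bus/pci/devices/$pcidev/vendor"
    or die "cannot read vendor for $pcidev: $!\n";
chomp(my $vendor = <$fh>);
print "$pcidev vendor = $vendor", $vendor eq '0x10de' ? " (NVIDIA)\n" : "\n";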
 
Did you roll back the driver in the VM to a 510 branch driver or the equivalent guest driver?
Hi.

I have checked all the configurations. The blue screen error only appears on Windows 10 and Windows 11. All Windows Server versions work normally.

When we check the log, we notice the line below whenever the blue screen occurs:

Mar 28 17:22:41 gpu1 nvidia-vgpu-mgr[11379]: error: vmiop_log: (0x0): RPC RINGs are not valid

I didn't find anything else related to this error other than the vGPU profiles. But with one profile we didn't get an error with Windows Server, so I don't think it's really the cause of the above error.
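
As a side note, if it helps with correlating the blue screens, here is a tiny sketch for pulling those nvidia-vgpu-mgr vmiop_log lines out of the current boot's journal (assuming the messages land in the systemd journal, as the syslog line above suggests):

Perl:
#!/usr/bin/perl
# Filter nvidia-vgpu-mgr vmiop_log lines from the current boot's journal.
# Sketch only.
use strict;
use warnings;

for my $line (qx(journalctl -b --no-pager)) {
    print $line if $line =~ /nvidia-vgpu-mgr/ && $line =~ /vmiop_log/;
}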
 
Hi Guys,
I believe that after applying the workaround, whenever a VM is destroyed, Proxmox does not release the vGPU properly, and therefore it cannot be used again. I do believe that rebooting the system would make the vGPUs available again, but that is something I would like to avoid.

[6045009.681314] nvidia-vgpu-vfio 00000000-0000-0000-0000-000000000175: Removing from iommu group 349
[6045009.681673] nvidia-vgpu-vfio 00000000-0000-0000-0000-000000000175: MDEV: detaching iommu
[6045009.682153] nvidia 0000:d0:00.5: MDEV: Unregistering
[6045009.682591] nvidia 0000:d0:00.6: MDEV: Unregistering
[6045009.683099] nvidia 0000:d0:00.7: MDEV: Unregistering
[6045009.683469] nvidia 0000:d0:01.0: MDEV: Unregistering
[6045009.683893] nvidia 0000:d0:01.1: MDEV: Unregistering
[6045009.684236] nvidia 0000:d0:01.2: MDEV: Unregistering
[6045009.684583] nvidia 0000:d0:01.3: MDEV: Unregistering
[6045009.684926] nvidia 0000:d0:01.4: MDEV: Unregistering
[6045009.685359] nvidia 0000:d0:01.5: MDEV: Unregistering
[6045009.685799] nvidia 0000:d0:01.6: MDEV: Unregistering
[6045009.686284] nvidia 0000:d0:01.7: MDEV: Unregistering
[6045009.686720] nvidia 0000:d0:02.0: MDEV: Unregistering
[6045009.687070] nvidia 0000:d0:02.1: MDEV: Unregistering
[6045009.687404] nvidia 0000:d0:02.2: MDEV: Unregistering
[6045009.687773] nvidia 0000:d0:02.3: MDEV: Unregistering
[6045009.782248] nvidia 0000:d0:00.0: driver left SR-IOV enabled after remove
[6045009.784155] vfio-pci 0000:d0:00.0: Cannot bind to PF with SR-IOV enabled
[6045009.784433] vfio-pci: probe of 0000:d0:00.0 failed with error -16

Right now, I can only see "half" my GPU on the host:

Wed Apr 26 14:10:02 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.07    Driver Version: 525.85.07    CUDA Version: N/A      |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A16           On  | 00000000:CE:00.0 Off |                    0 |
|  0%   45C    P8    17W /  62W |  10944MiB / 15356MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A16           On  | 00000000:CF:00.0 Off |                    0 |
|  0%   54C    P0    32W /  62W |   7296MiB / 15356MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A    3463262    C+G   vgpu                            3648MiB |
|    0   N/A  N/A    4055010    C+G   vgpu                            7296MiB |
|    1   N/A  N/A    2820335    C+G   vgpu                            3648MiB |
|    1   N/A  N/A    3153685    C+G   vgpu                            3648MiB |
+-----------------------------------------------------------------------------+

Any ideas?

Best Regards
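
Not a fix, just a diagnostic note: the "driver left SR-IOV enabled after remove" / "Cannot bind to PF with SR-IOV enabled" messages above suggest the physical function still has virtual functions enabled. A small sketch to confirm that via the standard sriov_numvfs sysfs attribute (the PF address is taken from the log above):

Perl:
#!/usr/bin/perl
# Show how many SR-IOV virtual functions are still enabled on a PF.
# Sketch only - standard sysfs attribute of SR-IOV capable devices.
use strict;
use warnings;

my $pf   = shift // '0000:d0:00.0';
my $path = "/sys/bus/pci/devices/$pf/sriov_numvfs";
open my $fh, '<', $path or die "cannot read $path: $!\n";
chomp(my $numvfs = <$fh>);
print "$pf currently has $numvfs SR-IOV VF(s) enabled\n";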
 
I believe that after applying the workaround, whenever a VM is destroyed, Proxmox does not release the vGPU properly, and therefore it cannot be used again. I do believe that rebooting the system would make the vGPUs available again, but that is something I would like to avoid.

The workaround is no longer necessary since PVE 7.4, as it has been integrated (a bit better :p) into the Proxmox qemu-server package.
 
