Nvidia Supported GPU with vGPU and Licensing

laxarus

I am so confused about this.
I have read these
https://pve.proxmox.com/wiki/NVIDIA_vGPU_on_Proxmox_VE
https://gitlab.com/polloloco/vgpu-proxmox
https://git.collinwebdesigns.de/oscar.krause/fastapi-dls

I have found a good deal on an RTX 6000 Ada 48GB and want to use it in my Proxmox system, but I am not sure how to go about licensing and enabling vGPU. Before I spend some $$$ (it is still expensive even with a deal), I want to make sure that I can enable vGPU.

The Proxmox docs refer to the NVIDIA pages for licensing and so on, but did they write these pages to confuse people more?

From what I understand,
- This card supports vGPU 17. https://docs.nvidia.com/vgpu/gpus-supported-by-vgpu.html
- My host system (SYS-221H-TNR) is also supported. https://www.nvidia.com/en-us/data-c...count=50&pageNumber=1&searchTerm=SYS-221H-TNR
- I have kernel 6.8.12-4-pve and Proxmox 8.3.0
- I need to use the NVIDIA Display Mode Selector Tool to change the mode of this card
- I need to use NVIDIA Linux Driver Version 550.127.05 and NVIDIA Virtual GPU Manager Version 550.127.06 https://docs.nvidia.com/vgpu/17.0/grid-vgpu-release-notes-generic-linux-kvm/index.html. Here starts the confusion


After this everything becomes confusing,
Polloloco has a note about following nvidia docs if the card is Ada generation.

If you have GPUs from the Ampere and Ada Lovelace generation, you are out of luck, unless you have a vGPU qualified card from this list like the A5000 or RTX 6000 Ada. If you have one of those cards, please consult the NVIDIA documentation for help with setting it up.

and when referring to nvidia docs, it becomes more confusing.

From all the reading I did,
  1. I need to create an evaluation account on nvidia to access the drivers.
    • But what about the GPU Manager?
    • Which hypervisor? Linux with KVM or Ubuntu? The Proxmox docs refer to Ubuntu and Debian. Are they the same?
  2. Install these drivers on proxmox host
  3. Create a licensing server with fastapi-dls (But where? on another VM? Proxmox does not have docker?)
  4. Install guest drivers on the VM
    • Are they the same as the host drivers?
    • Do I still need to use vgpu_unlock-rs even though I have a supported card?
Anyway, there are multiple questions and the more I read the more confused I am. Any help at this point is appreciated.
 
In order to use vGPU with nVidia you need to pay a licence fee. Using fastapi-dls circumvents the licence server. So understand that if you use it you will be breaching their licencing conditions and be comfortable with what that means.

which hypervisor? Linux with KVM or Ubuntu?
Linux with KVM

Create a licensing server with fastapi-dls (But where? on another VM? Proxmox does not have docker?)
You can create an LXC running Ubuntu or Debian and install fastapi-dls on it

Install guest drivers on the VM
  • Are they the same as the host drivers?
No there are separate guest drivers depending on the guest OS. These are included in the vGPU driver package which you download from nVidia.

Do I still need to use vgpu_unlock-rs even though I have a supported card.
No, the polloloco instructions are for pre Ampere cards.


Once the guest drivers are installed, you will need to register the guest with your fastapi-dls server.
 
I've set up a license server on NVIDIA's cloud platform. How do I connect the host/guest to the license server, and how do I split the vGPUs?

Sorry, I'm totally confused, and NVIDIA is refusing support for Proxmox in particular, since it is not "officially supported".
 
Follow the steps in https://pve.proxmox.com/wiki/NVIDIA_vGPU_on_Proxmox_VE to get the drivers installed on your Proxmox host. Read them carefully.

You have an Ampere based card so will need to enable SR-IOV both in the BIOS and on Proxmox.

Have you done this, and do you get a list of virtual functions with the command lspci -d 10de:?
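As a cross-check alongside lspci, the kernel also reports the number of enabled virtual functions through sysfs. Below is a minimal Python sketch, assuming the standard Linux sriov_numvfs attribute is present on the GPU's physical function. The function name nvidia_vf_counts is made up for illustration and is not part of any Proxmox or NVIDIA tooling.

```python
from pathlib import Path

NVIDIA_VENDOR_ID = "0x10de"  # same vendor ID as in `lspci -d 10de:`


def nvidia_vf_counts(pci_root="/sys/bus/pci/devices"):
    """Return {pci_address: enabled_vf_count} for NVIDIA SR-IOV physical functions.

    Hypothetical helper: it only reads the kernel's `vendor` and
    `sriov_numvfs` sysfs attributes and changes nothing on the host.
    """
    counts = {}
    root = Path(pci_root)
    if not root.is_dir():
        return counts
    for dev in sorted(root.iterdir()):
        vendor = dev / "vendor"
        numvfs = dev / "sriov_numvfs"
        # sriov_numvfs only exists on SR-IOV capable physical functions.
        if not vendor.is_file() or not numvfs.is_file():
            continue
        if vendor.read_text().strip().lower() != NVIDIA_VENDOR_ID:
            continue
        counts[dev.name] = int(numvfs.read_text().strip())
    return counts


if __name__ == "__main__":
    found = nvidia_vf_counts()
    if not found:
        print("No NVIDIA SR-IOV physical function found")
    for address, count in found.items():
        print(f"{address}: {count} virtual functions enabled")
```

A count of 0 would suggest that SR-IOV is not yet enabled for that card, which is worth checking before assigning anything to a guest.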

Assign a VF to your guest and install the GRID drivers as per https://pve.proxmox.com/wiki/NVIDIA_vGPU_on_Proxmox_VE#Guest_Configuration. You only need to install lightdm and x11vnc on Ubuntu if you plan on using a GUI. If you will only use the CLI, you don't need them.

If you've set up a licence on nVidia's cloud server then follow the steps at https://git.collinwebdesigns.de/oscar.krause/fastapi-dls#setup-client, but substitute the nVidia cloud licensing server's URL for <dls-hostname-or-ip> in the step to download the client token.
 
Thank you!

Now that I have sorted out a few things, I am sharing what I learned for anyone who has the same issues as me:

1. The profile you choose determines the license you need:
An A profile needs a vApps license
A B profile needs a vPC license
A Q profile needs a vWS license (which costs 10 times as much as vPC)

See the documentation for the differences:
Watch out for the max resolution in A profiles (vApps).
Watch out for the max memory in B profiles (vPC), which only support 2GB of VRAM.
Yes. Nvidia wants you to buy the vWS licenses ;)

[Image: 1732457358853.png]
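The profile-to-license rule in point 1 can be written down as a small lookup. This is only a sketch of the mapping as described in this post (A, B and Q series). The names LICENSE_BY_SERIES and required_license are illustrative and not part of any NVIDIA tool.

```python
# Mapping as described in this post: the trailing letter of the profile
# name selects the license edition. Other series are not covered here.
LICENSE_BY_SERIES = {"A": "vApps", "B": "vPC", "Q": "vWS"}


def required_license(profile_name):
    """Return the license edition for a profile name such as 'A2-2B'."""
    name = profile_name.strip()
    if not name:
        raise ValueError("empty profile name")
    series = name[-1].upper()
    if series not in LICENSE_BY_SERIES:
        raise ValueError(f"unknown vGPU series in {profile_name!r}")
    return LICENSE_BY_SERIES[series]


if __name__ == "__main__":
    for profile in ("A2-1A", "A2-2B", "A2-4Q"):
        print(profile, "->", required_license(profile))
```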

2. Get the list of supported Profiles:
Code:
root@pve-21:/sys/bus/pci/devices/0000:41:00.4/nvidia# cat creatable_vgpu_types
ID    : vGPU Name
742   : NVIDIA A2-1B
743   : NVIDIA A2-2B
744   : NVIDIA A2-1Q
745   : NVIDIA A2-2Q
746   : NVIDIA A2-4Q
747   : NVIDIA A2-8Q
748   : NVIDIA A2-16Q
749   : NVIDIA A2-1A
750   : NVIDIA A2-2A
751   : NVIDIA A2-4A
752   : NVIDIA A2-8A
753   : NVIDIA A2-16A

3. Set your profile like this (in my case A2-2B):
Code:
root@pve-21:/sys/bus/pci/devices/0000:41:00.4/nvidia# echo 743 > current_vgpu_type

You must shut down all VMs using a vGPU before you can change current_vgpu_type.
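Steps 2 and 3 can be combined so you do not have to look up the numeric ID by hand. The following is a rough Python sketch, assuming the creatable_vgpu_types file has the "ID : vGPU Name" layout shown above. The helper names parse_vgpu_types and set_vgpu_type are hypothetical, and the write still has to be done as root with the affected VMs shut down.

```python
from pathlib import Path

# Excerpt of the listing shown above, used only to demonstrate the parser.
SAMPLE = """ID    : vGPU Name
742   : NVIDIA A2-1B
743   : NVIDIA A2-2B
744   : NVIDIA A2-1Q
"""


def parse_vgpu_types(text):
    """Map profile names (e.g. 'A2-2B') to their numeric type IDs."""
    types = {}
    for line in text.splitlines():
        left, sep, right = line.partition(":")
        left = left.strip()
        if not sep or not left.isdigit():
            continue  # skips the 'ID : vGPU Name' header and blank lines
        name = right.strip()
        if name.startswith("NVIDIA "):
            name = name[len("NVIDIA "):]
        types[name] = int(left)
    return types


def set_vgpu_type(vf_nvidia_dir, profile):
    """Write the ID of `profile` to current_vgpu_type in the given directory.

    vf_nvidia_dir is the VF's nvidia directory, for example
    /sys/bus/pci/devices/0000:41:00.4/nvidia as in the post above.
    """
    base = Path(vf_nvidia_dir)
    types = parse_vgpu_types((base / "creatable_vgpu_types").read_text())
    if profile not in types:
        raise ValueError(f"profile {profile!r} is not creatable on this VF")
    (base / "current_vgpu_type").write_text(f"{types[profile]}\n")
    return types[profile]


if __name__ == "__main__":
    # Only parses the sample text; nothing is written to sysfs here.
    print(parse_vgpu_types(SAMPLE))
```

With the paths from this post, the call would look like set_vgpu_type("/sys/bus/pci/devices/0000:41:00.4/nvidia", "A2-2B").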
3.2 Assign VFIO to your VM:

Shoutout to guruevi: https://forum.proxmox.com/threads/vgpu-with-nvidia-on-kernel-6-8.150840/


4. Licensing happens within the VM itself. On Windows, for example, you need to get the token file from your license server (DLS on-premises, CLS on the NVIDIA cloud) and copy it to:
Code:
%SystemDrive%\Program Files\NVIDIA Corporation\vGPU Licensing\ClientConfigToken
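To confirm that the token actually landed in that folder, a quick check like the following can help. It is a minimal Python sketch, and it assumes that the token file uses a .tok extension, so adjust the pattern if yours is named differently. The function names are illustrative only.

```python
import os
from pathlib import Path, PureWindowsPath


def default_token_dir():
    """Build the Windows token folder from this post using %SystemDrive%."""
    drive = os.environ.get("SystemDrive", "C:")
    return PureWindowsPath(
        drive + "\\",
        "Program Files",
        "NVIDIA Corporation",
        "vGPU Licensing",
        "ClientConfigToken",
    )


def find_client_tokens(token_dir):
    """Return the sorted names of *.tok files in token_dir (assumed extension)."""
    directory = Path(token_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob("*.tok"))


if __name__ == "__main__":
    folder = default_token_dir()
    tokens = find_client_tokens(str(folder))
    print(f"Token folder: {folder}")
    print("Tokens found:", tokens if tokens else "none")
```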

To set up your license server, you need to visit nvid.nvidia.com.

5. Check the licensing status in the NVIDIA Control Panel under Licensing.
If there is an error, check this log:
Code:
C:\Users\Public\Documents\NvidiaLogging\Log.NVDisplay.Container.exe.log


This took me several hours to understand. I hope this info lets you skip the hassle :)

If anyone sees room for improvement or spots an error here, please tell me :) This has worked for me so far.
 
I have an Nvidia A40 GPU, which is supported.
I have Proxmox 8.3.0 with kernel 6.5.13-6-pve pinned, and installed the NVIDIA vGPU Linux drivers, version 535.161.05.
I get the expected responses from nvidia-smi, nvidia-smi vgpu, mdevctl types... all OK.
I can even assign the PCIe vGPU to the VM, but I can't start it.

I get the errors below:

qm start 100
swtpm_setup: Not overwriting existing state file.
kvm: -device vfio-pci,sysfsdev=/sys/bus/mdev/devices/00000000-0000-0000-0000-000000000100,id=hostpci0,bus=pci.0,addr=0x10: vfio 00000000-0000-0000-0000-000000000100: failed to setup container for group 63: Failed to set iommu for container: Invalid argument
stopping swtpm instance (pid 27866) due to QEMU startup error
waited 10 seconds for mediated device driver finishing clean up
actively clean up mediated device with UUID 00000000-0000-0000-0000-000000000100
start failed: QEMU exited with code 1

Attached is the screenshot from the IPMI console when we try to start the VM.

I have tried everything. Any bright ideas?

Update: I tried downgrading to 7.4 with kernel 5.11 and kernel 6.2, but I get the same error.

failed to setup container for group 63: Failed to set iommu for container: Invalid argument

- I have tried some settings in the BIOS (disabling CSM, resetting to defaults, etc.), but nothing seems to help.

It looks like the ASUS KRPA-U16 might be the culprit. Any ideas or suggestions?
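For anyone debugging the same message, it may help to see which devices share the IOMMU group named in the error. The sketch below is only a diagnostic aid and not a fix. It assumes the standard Linux /sys/kernel/iommu_groups layout, and the helper names are made up for illustration.

```python
import re
from pathlib import Path

# Error line copied from the qm start output above.
ERROR = (
    "failed to setup container for group 63: "
    "Failed to set iommu for container: Invalid argument"
)

GROUP_RE = re.compile(r"failed to setup container for group (\d+)")


def iommu_group_from_error(message):
    """Extract the IOMMU group number from the QEMU error, or None."""
    match = GROUP_RE.search(message)
    return int(match.group(1)) if match else None


def devices_in_group(group, root="/sys/kernel/iommu_groups"):
    """List device names that the kernel reports for an IOMMU group."""
    devices_dir = Path(root) / str(group) / "devices"
    if not devices_dir.is_dir():
        return []
    return sorted(p.name for p in devices_dir.iterdir())


if __name__ == "__main__":
    group = iommu_group_from_error(ERROR)
    print("IOMMU group:", group)
    if group is not None:
        print("Devices in group:", devices_in_group(group))
```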
 

Attachments

  • Screenshot 2024-11-30 144453.jpg
