Using NVidia Tesla P40/P41

Hi,

Just to make sure you know the difference between passing a whole GPU through to a single VM and splitting a GPU up into vGPUs and then passing those through to multiple VMs: the big drawback of vGPU is that the drivers are expensive [1]. They do, however, have a 90-day free trial (I am not sure, but those trial drivers might throttle after an hour of use until you reboot, so don't quote me on that).

Depending on your setup, one of the two solutions might be better. Keep in mind that a lot of situations do not need a GPU at all: hosting a website, a NAS, smart-home stuff; even running a media server might be fine without a GPU for encoding. For other situations like gaming, modelling or machine learning, you will want a GPU.

If, for example, your setup is just for you and you want one VM for gaming and one for CAD work, it's probably easier and cheaper to set up those two VMs, pass the same card through to both of them [2][3], and just turn off one VM as needed.
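As a minimal sketch of what that could look like on the Proxmox CLI (the PCI address 0000:01:00 and the VM IDs 101/102 are placeholders, check yours with lspci and qm list; pcie=1 assumes a q35 machine type):

# gaming VM: pass the whole card through and use it as the primary GPU
qm set 101 --hostpci0 0000:01:00,pcie=1,x-vga=1
# CAD/work VM: exactly the same entry for the same card
qm set 102 --hostpci0 0000:01:00,pcie=1,x-vga=1
# only ever run one of the two at a time:
qm stop 101 && qm start 102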

If, instead, you want to set up a system where 4 different people are doing CAD work, vGPUs might be the way to go.
There is also a small section (10.9.4) on vGPUs in our docs [4]. Otherwise, at first glance, the link you posted looks like a good tutorial.
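For the vGPU case, the card is not passed through as a whole; instead each VM gets a mediated device (mdev). A rough sketch, assuming the host vGPU driver is already installed, the card sits at 0000:01:00.0 and the profile name (nvidia-47 is just a placeholder) is one of those listed on your system:

# list the vGPU profiles the host driver exposes for this card
ls /sys/bus/pci/devices/0000:01:00.0/mdev_supported_types/
# give each VM its own slice of the card (VM IDs 201/202 are placeholders)
qm set 201 --hostpci0 0000:01:00.0,mdev=nvidia-47
qm set 202 --hostpci0 0000:01:00.0,mdev=nvidia-47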

Good luck :)

[1]: https://www.nvidia.com/content/dam/...Virtual-GPU-Packaging-and-Licensing-Guide.pdf
[2]: https://pve.proxmox.com/wiki/PCI(e)_Passthrough
[3]: https://pve.proxmox.com/wiki/PCI_Passthrough
[4]: https://pve.proxmox.com/pve-docs/pve-admin-guide.html
 
Thanks a lot for all the info.
I am planning a media server (2-3 streams) and an NVR, which seem like good candidates for an always-on vGPU. I am also planning to explore running a Mycroft backend in the future - maybe that counts as machine learning (nothing definite, just something I'm still thinking about).

But I wasn't aware that the evaluation drivers have limitations; let me look into this.
[EDIT]: I just read through the licensing guide, and it seems to apply to the software editions (vApps, vCS, etc.), not so much to the drivers. If anyone has a better idea, please let me know.

Also, let me check if I can get away with running the media server encoding and the NVR decoding on the CPU instead of the GPU. That way I could keep the dedicated GPU for gaming when needed.

Can I switch cleanly between the vGPU and dedicated-GPU setups while I test the other services (media server and NVR)? Is there a similar tutorial showing how to switch from one to the other?
 
A correction/clarification on my part from yesterday: while that tutorial looks good, all the tedious things listed there are only needed if you are using a consumer card, which you are not. There is no need for you to patch the drivers, install Rust and all that. All you should need to do is set up IOMMU and download the correct drivers; see the docs and the PCI(e) passthrough wiki pages from my post above.
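For the IOMMU part, a minimal sketch of what that usually involves, assuming an Intel CPU and GRUB (AMD uses amd_iommu=on, and with systemd-boot the options go into /etc/kernel/cmdline instead); the wiki page has the full details:

# /etc/default/grub: enable the IOMMU on the kernel command line
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
# /etc/modules: load the VFIO modules at boot
vfio
vfio_iommu_type1
vfio_pci
# apply the changes and reboot
update-grub && update-initramfs -u -k all
# after the reboot, check that the IOMMU groups show up
find /sys/kernel/iommu_groups/ -type l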

Regarding switching: you should be able to just run the Nvidia SR-IOV script (see our docs) to disable all the vGPU functions, and then you should be able to pass through the whole card. If that does not work, you can blacklist the Nvidia drivers (see the PCI(e) wiki pages) and reboot with the nouveau drivers. Then just pass the card through to the desired VM. (You would also need to do this instead of the SR-IOV step if you want to use the GPU on the host itself, i.e. if you have Gnome/KDE/whatever installed on your host and use it as a workstation or something.)
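A rough sketch of both routes; the script path, the vendor/device IDs and the VM ID are placeholders, so check them against your own system (lspci -nn) and the driver documentation before copying anything:

# if your host vGPU driver ships the SR-IOV helper, something along these lines
# disables the virtual functions again (exact path/flags depend on the driver version)
/usr/lib/nvidia/sriov-manage -d ALL
# otherwise, keep the host drivers off the card entirely:
echo "blacklist nvidia" >> /etc/modprobe.d/blacklist.conf
# optionally let vfio-pci claim the card early (IDs from "lspci -nn", placeholders here)
echo "options vfio-pci ids=10de:1b38" > /etc/modprobe.d/vfio.conf
update-initramfs -u -k all && reboot
# then pass the whole card to the gaming VM (ID 101 as a placeholder)
qm set 101 --hostpci0 0000:01:00,pcie=1,x-vga=1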

All in all, as you said, just trying different configurations out and seeing what works for you sounds like a good idea. If I were you, I would also read through the PCIe section in the docs and the PCI(e) sections in our wiki carefully before installing anything, just to get an idea of what can be done and how it should be done.

Also, just a thought: it might be cheaper for you to get an additional cheap, secondhand GPU (have a look at some benchmarks [1] for reference) for streaming instead of paying for Nvidia drivers.


[1]: https://gpu.userbenchmark.com/
 
OK, so I've been reading about the use cases (and setting things up), and it looks like I can run all the services in LXC containers (instead of VMs): NAS, Jellyfin, Frigate.
So it looks like I won't really need the vGPU functionality. I'll revisit it for 3D modelling/gaming, but those are not always-on machines, so I can definitely go with PCIe passthrough.
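For reference, if Jellyfin/Frigate in LXC should still use the card for encoding/decoding, containers don't need vGPU or PCI passthrough at all: the host keeps the Nvidia driver and the device nodes are bound into the container. A rough sketch of what /etc/pve/lxc/<ctid>.conf could contain for a privileged container (the device major numbers differ per system, check with ls -l /dev/nvidia*, and the container needs the matching user-space driver installed too):

# allow the Nvidia character devices (195 is the usual major, nvidia-uvm gets its own)
lxc.cgroup2.devices.allow: c 195:* rwm
# bind the device nodes from the host into the container
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file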