I am trying to set up an NVIDIA vGPU for AI workloads, following this guide: https://pve.proxmox.com/wiki/NVIDIA_vGPU_on_Proxmox_VE_7.x#
My card is an RTX A5000. I obtained the NVIDIA GRID host driver (535.129) and installed it on the PVE host. I enabled SR-IOV and now have several virtual devices that I can pass through to a VM.
I created a Debian 12 VM and installed the NVIDIA GRID guest driver (535.129) in the VM. nvidia-smi shows the virtual card (with CUDA 12.2).
My troubles start when I try to set up CUDA.
I downloaded the official CUDA package from NVIDIA (12.2, because I read somewhere that 12.3 is not compatible). It wants to install a "driver" in addition to the CUDA Toolkit. If I let it install the driver (535.54), it uninstalls my vGPU guest driver and then reports that it is not compatible with the card it found. So I thought, well, this seems to be a normal display driver that I can do without (because I already have the GRID guest driver). But if I don't let it install the driver, it complains that the "CUDA driver" was not installed. So apparently it is a CUDA driver after all. But then why does it uninstall my GRID guest driver?
If I install the official Debian non-free NVIDIA drivers instead, nvidia-smi can no longer communicate with the driver. That is not surprising, but trying this driver was my last resort.
Can someone PLEASE tell me how to set up CUDA in a VM with a vGPU? I'm about to lose my mind over this...
Thanks!