Hi all - wondering if anyone has got this particular combination working? On the face of it my setup looks like it should work, but I get kernel panics after a few minutes of vGPU usage and don't know how to troubleshoot it.
I have the Nvidia vGPU host drivers installed on the PVE host - no issues with the installation (v535.261.03), and nvidia-smi shows the GPU listed.
I didn't do any GPU-unlock modifications - from what I've read, the Tesla P4 supports vGPU natively so it shouldn't need unlocking.
I have the Nvidia GRID guest drivers installed in the VM, and have passed through one of the default mdev profiles that the host drivers expose (nvidia-64).
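For reference, the relevant line in the VM config (/etc/pve/qemu-server/<vmid>.conf) looks roughly like this - the PCI address shown is just a placeholder, not the actual one:

hostpci0: 0000:01:00.0,mdev=nvidia-64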
nvidia-smi inside the VM shows the GPU as well - and when I run CUDA apps (built against CUDA 12.2.2, which is supposed to match this driver branch) and Docker containers using the NVIDIA Container Toolkit, it seems to work - temporarily at least.
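For context, the CUDA workloads are nothing exotic - they boil down to ordinary allocate-and-launch loops, roughly like this sketch (purely illustrative, not one of the actual apps):

// minimal_vgpu_test.cu - illustrative sketch only, not one of the real apps
// compile with: nvcc -o minimal_vgpu_test minimal_vgpu_test.cu
#include <cstdio>
#include <cuda_runtime.h>

// trivial kernel: scale every element of a buffer
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 24;                  // ~16M floats, about 64 MB
    float *d = nullptr;
    if (cudaMalloc(&d, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d, 0, n * sizeof(float));
    // keep the vGPU busy with repeated kernel launches
    for (int iter = 0; iter < 10000; ++iter) {
        scale<<<(n + 255) / 256, 256>>>(d, n, 1.0001f);
    }
    cudaError_t err = cudaDeviceSynchronize();
    printf("finished: %s\n", cudaGetErrorString(err));
    cudaFree(d);
    return 0;
}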
But at some point it crashes the whole PVE host with a kernel panic - it needs a hard reset to come back online.
Any suggestions on where to start troubleshooting?