vGPU Tesla P4 wrong mdevctl gpu

Thank you all for figuring this out. I've spent too much time wondering why.

To summarize the steps that worked for me:

Install Option 4: 16.1 (535.104.06)
verify nvidia-smi works
verify mdevctl types outputs P40 profiles
reboot

Upgrade Option 1: 17.0 (550.54.10)
verify nvidia-smi works
verify mdevctl types outputs nothing
reboot

Download the original 16.4 driver (not the patched driver) and extract it (the download link is in the script)
./NVIDIA-Linux-x86_64-535.161.05-vgpu-kvm.run -x
cp NVIDIA-Linux-x86_64-535.161.05-vgpu-kvm/vgpuConfig.xml /usr/share/nvidia/vgpu/vgpuConfig.xml
reboot
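
For anyone who wants to script the copy step, here is a rough sketch that keeps a backup of the existing vgpuConfig.xml before overwriting it. The function name install_vgpu_config and the .bak suffix are my own choices for illustration, not part of the driver or the script; the paths in the usage comment are the ones from the steps above.

```shell
#!/bin/sh
# Hypothetical helper: back up the current config (if any), then copy the new one in.
install_vgpu_config() {
    src="$1"   # vgpuConfig.xml from the extracted driver directory
    dst="$2"   # destination config file on the host

    # Refuse to continue if the extracted file is not there.
    [ -f "$src" ] || { echo "missing source: $src" >&2; return 1; }

    # Keep a copy of whatever is installed now so it can be restored.
    if [ -f "$dst" ]; then
        cp -p "$dst" "$dst.bak" || return 1
    fi

    cp "$src" "$dst"
}

# Usage on the host (as root), then reboot:
#   install_vgpu_config NVIDIA-Linux-x86_64-535.161.05-vgpu-kvm/vgpuConfig.xml \
#       /usr/share/nvidia/vgpu/vgpuConfig.xml
```

If the new profiles turn out wrong, copying the .bak file back over the original should restore the previous state.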

Proxmox end result
NVIDIA-SMI 550.54.10
mdevctl types has GRID P4 profiles
P4 profiles show up in proxmox
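
A quick way to check which profile family the host reports, as a sketch only: it assumes the profile names in the mdevctl types output contain strings like "GRID P4-8Q" (correct) or "GRID P40-..." (the wrong GPU), which matches what I saw but may differ on other setups. check_profiles is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads `mdevctl types` output on stdin and reports
# which GRID profile family appears in it.
check_profiles() {
    out=$(cat)
    # "GRID P4-" does not match P40 names, because those continue with "0-".
    if printf '%s\n' "$out" | grep -q 'GRID P4-'; then
        echo "P4 profiles found"
    elif printf '%s\n' "$out" | grep -q 'GRID P40-'; then
        echo "P40 profiles found (wrong GPU)"
    else
        echo "no GRID profiles listed"
    fi
}

# Usage on the host:
#   mdevctl types | check_profiles
```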

VM end result
NVIDIA-SMI 550.54.14
GRID P4-8Q


I'm guessing that copying the 16.1 profiles would work too, but I didn't test that.

Simply installing 17.0 and copying the 16.4 vgpuConfig.xml did not work, so something left over from the 16.1 install must be needed.
 
God damn it, I tried all the different drivers, patched and non-patched, and nothing worked until I found this and followed your steps. Finally!!!! Thank you so much.

BTW, I've installed 3-4 Tesla P4s and this is the first time I've had this issue. Nvidia just #%$@*&(&!@@!