I am having trouble getting vGPU to work with a Tesla P40 on Proxmox 5.3.
root@vgpu:/# lspci -d 10de: -k
86:00.0 3D controller: NVIDIA Corporation GP102GL [Tesla P40] (rev a1)
Subsystem: NVIDIA Corporation GP102GL [Tesla P40]
Kernel driver in use: nvidia
Kernel modules: nvidiafb, nouveau, nvidia_vgpu_vfio, nvidia
I have installed the vGPU manager driver: NVIDIA-Linux-x86_64-410.91-vgpu-kvm
I have disabled ECC memory with nvidia-smi, as the NVIDIA documentation instructs.
I have enabled intel_iommu=on.
When I try to start a VM, I get this error:
kvm: -device vfio-pci,sysfsdev=/sys/bus/pci/devices/0000:86:00.0/00000000-0000-0000-0000-000000000100,id=hostpci0,bus=pci.0,addr=0x10: vfio error: 00000000-0000-0000-0000-000000000100: error getting device from group 79: Connection timed out
Verify all devices in group 79 are bound to vfio-<bus> or pci-stub and not already in use
dmesg shows the vGPU start failing about 10 seconds after the mdev device is added to the group:
[ 150.834555] iommu: Adding device 00000000-0000-0000-0000-000000000100 to group 79
[ 150.834557] vfio_mdev 00000000-0000-0000-0000-000000000100: MDEV: group_id = 79
[ 161.498679] [nvidia-vgpu-vfio] 00000000-0000-0000-0000-000000000100: start failed. status: 0x65 Timeout Occured
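For anyone trying to spot the same failure in a longer log, here is a minimal Python sketch that pulls the mdev UUID and status code out of dmesg text. The pattern is based only on the single `nvidia-vgpu-vfio` line quoted above, so treat it as an assumption that other driver versions use the same wording. The names `VGPU_FAIL` and `find_vgpu_start_failures` are made up for this example.

```python
import re

# Pattern derived only from the one dmesg line quoted in this post;
# other driver versions may word the message differently (assumption).
VGPU_FAIL = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[nvidia-vgpu-vfio\]\s+"
    r"(?P<uuid>[0-9a-fA-F-]{36}):\s+start failed\.\s+"
    r"status:\s+(?P<status>0x[0-9a-fA-F]+)\s*(?P<msg>.*)"
)


def find_vgpu_start_failures(log_text):
    """Return one dict (ts, uuid, status, msg) per failed vGPU start line."""
    failures = []
    for line in log_text.splitlines():
        match = VGPU_FAIL.search(line)
        if match:
            failures.append({
                "ts": float(match.group("ts")),           # seconds since boot
                "uuid": match.group("uuid"),              # mdev device UUID
                "status": int(match.group("status"), 16), # numeric status code
                "msg": match.group("msg").strip(),        # driver's message text
            })
    return failures


# The three dmesg lines from this post, copied verbatim.
sample = """\
[ 150.834555] iommu: Adding device 00000000-0000-0000-0000-000000000100 to group 79
[ 150.834557] vfio_mdev 00000000-0000-0000-0000-000000000100: MDEV: group_id = 79
[ 161.498679] [nvidia-vgpu-vfio] 00000000-0000-0000-0000-000000000100: start failed. status: 0x65 Timeout Occured
"""

failures = find_vgpu_start_failures(sample)
for failure in failures:
    print(failure["uuid"], hex(failure["status"]), failure["msg"])
```

On a live host the input could be the saved output of `dmesg` instead of the `sample` string.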
Any help would be appreciated.