Hello everyone,
I'm trying to install NVIDIA vGPU drivers for a Tesla M60 on Proxmox 8.4.1.
I followed the Polloloco guide, obtained the driver from the NVIDIA hub (NVIDIA-Linux-x86_64-535.161.05-vgpu-kvm.run), and applied the patch found in this post.
No matter what I try, the installer fails with "Unable to load the kernel module 'nvidia.ko'".
lsmod | grep -E "nouveau|rivafb|nvidiafb|rivatv"
returns nothing, confirming these modules are not loaded.
I've hit a wall and could really use some assistance. Any help or insights would be greatly appreciated!
Code:
# dmesg | grep -i iommu
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.8.12-11-pve root=/dev/mapper/pve-root ro quiet video=efifb:off intel_iommu=on iommu=pt pci-stub.ids=10de:13f2
[ 0.457326] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.8.12-11-pve root=/dev/mapper/pve-root ro quiet video=efifb:off intel_iommu=on iommu=pt pci-stub.ids=10de:13f2
[ 0.457399] DMAR: IOMMU enabled
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
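The running command line reported by dmesg above carries parameters (video=efifb:off, pci-stub.ids=10de:13f2) that this GRUB default does not list. Below is a minimal shell sketch for listing such differences. `cmdline_extra` is a hypothetical helper name, and where the extra parameters come from (for example GRUB_CMDLINE_LINUX, or a config edited after the last update-grub) is not established here. BOOT_IMAGE, root= and ro are added by the bootloader and are expected to show up.

```shell
#!/bin/sh
# Hypothetical helper: print every parameter that is present in the
# running kernel command line ($1) but absent from the configured one ($2).
cmdline_extra() {
    running=$1
    configured=$2
    # Intentional word splitting: one token per kernel parameter.
    for tok in $running; do
        case " $configured " in
            *" $tok "*) ;;                 # parameter is configured, skip it
            *) printf '%s\n' "$tok" ;;     # parameter only exists at runtime
        esac
    done
}

# Compare the live command line with the GRUB default quoted in this post.
if [ -r /proc/cmdline ]; then
    cmdline_extra "$(cat /proc/cmdline)" "quiet intel_iommu=on iommu=pt"
fi
```

Any pci-stub.ids= or vfio-pci.ids= entry in that list is worth noting, since those parameters bind the listed devices to a stub driver at boot.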
Code:
# uname -r
6.8.12-11-pve
Code:
# dpkg -l | grep proxmox-headers-$(uname -r | sed 's/-pve//')
ii proxmox-headers-6.8.12-11-pve 6.8.12-11 amd64 Proxmox Kernel Headers
Code:
# mokutil --sb-state
SecureBoot disabled
Code:
# cat /etc/modules-load.d/modules.conf
# Modules required for PCI passthrough
vfio
vfio_iommu_type1
vfio_pci
pci-stub
Code:
# cat /etc/modprobe.d/blacklist-nouveau.conf
blacklist nouveau
options nouveau modeset=0
Code:
# cat /etc/modprobe.d/vfio.conf
options vfio-pci ids=10de:13f2,10de:13f2
Code:
# lspci -nnk | grep -i -E "nvidia|vga|3d|pci-stub"
0b:00.0 VGA compatible controller [0300]: Matrox Electronics Systems Ltd. G200eR2 [102b:0534] (rev 01)
84:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204GL [Tesla M60] [10de:13f2] (rev a1)
Subsystem: NVIDIA Corporation GM204GL [Tesla M60] [10de:115e]
Kernel modules: nvidiafb, nouveau
85:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204GL [Tesla M60] [10de:13f2] (rev a1)
Subsystem: NVIDIA Corporation GM204GL [Tesla M60] [10de:115e]
Kernel modules: nvidiafb, nouveau
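One caveat about the output above: `lspci -k` reports the bound driver on a `Kernel driver in use:` line, and this grep pattern only keeps that line when the driver name itself matches (pci-stub would, vfio-pci would not). Below is a minimal sketch for reading the bound driver directly from sysfs, assuming the standard /sys/bus/pci/devices layout. `bound_drivers` and its optional root argument are hypothetical names introduced for illustration, with the argument there only so the function can be pointed at a test tree.

```shell
#!/bin/sh
# Hypothetical helper: list each NVIDIA PCI function and the kernel
# driver currently bound to it, or "(none)" if no driver is bound.
bound_drivers() {
    root="${1:-/sys/bus/pci/devices}"
    for dev in "$root"/*; do
        [ -r "$dev/vendor" ] || continue
        # 0x10de is NVIDIA's PCI vendor ID (the 10de in 10de:13f2).
        [ "$(cat "$dev/vendor")" = "0x10de" ] || continue
        if [ -L "$dev/driver" ]; then
            # The driver entry is a symlink whose last component is the driver name.
            drv=$(basename "$(readlink "$dev/driver")")
        else
            drv="(none)"
        fi
        printf '%s %s\n' "$(basename "$dev")" "$drv"
    done
}

# Read the live sysfs tree.
bound_drivers
```

If this prints vfio-pci or pci-stub next to 84:00.0 and 85:00.0, those functions are already claimed before nvidia.ko gets a chance to probe them.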
Code:
# ./NVIDIA-Linux-x86_64-535.161.05-vgpu-kvm-custom.run --dkms -m=kernel --no-drm
Code:
-> Kernel module compilation complete.
ERROR: Unable to load the kernel module 'nvidia.ko'. This happens most frequently when this kernel module was built against the wrong or improperly configured kernel sources, with a version of gcc that differs from the one used to build the target kernel, or if another driver, such as nouveau, is present and prevents the NVIDIA kernel module from obtaining ownership of the NVIDIA device(s), or no NVIDIA device installed in this system is supported by this NVIDIA Linux graphics driver release.
Please see the log entries 'Kernel module load error' and 'Kernel messages' at the end of the file '/var/log/nvidia-installer.log' for more information.
-> Kernel module load error: No such device
-> Kernel messages:
[ 189.016041] vmbr0: port 19(veth1009i0) entered forwarding state
[ 307.774150] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 308.051389] nvidia-nvlink: Nvlink Core is being initialized, major device number 235
[ 308.051408] NVRM: The NVIDIA probe routine was not called for 2 device(s).
[ 308.055165] NVRM: This can occur when a driver such as:
NVRM: nouveau, rivafb, nvidiafb or rivatv
NVRM: was loaded and obtained ownership of the NVIDIA device(s).
[ 308.055169] NVRM: Try unloading the conflicting kernel module (and/or
NVRM: reconfigure your kernel without the conflicting
NVRM: driver(s)), then try loading the NVIDIA kernel module
NVRM: again.
[ 308.055171] NVRM: No NVIDIA devices probed.
[ 308.055509] nvidia-nvlink: Unregistered Nvlink Core, major device number 235
[ 2014.294624] perf: interrupt took too long (2506 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
[ 2598.516202] nvidia-nvlink: Nvlink Core is being initialized, major device number 235
[ 2598.516222] NVRM: The NVIDIA probe routine was not called for 2 device(s).
[ 2598.520101] NVRM: This can occur when a driver such as:
NVRM: nouveau, rivafb, nvidiafb or rivatv
NVRM: was loaded and obtained ownership of the NVIDIA device(s).
[ 2598.520105] NVRM: Try unloading the conflicting kernel module (and/or
NVRM: reconfigure your kernel without the conflicting
NVRM: driver(s)), then try loading the NVIDIA kernel module
NVRM: again.
[ 2598.520108] NVRM: No NVIDIA devices probed.
[ 2598.520543] nvidia-nvlink: Unregistered Nvlink Core, major device number 235
ERROR: Installation has failed. Please see the file '/var/log/nvidia-installer.log' for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.
Edit 1:
I tried the script from https://wvthoog.nl/proxmox-vgpu-v3/ with kernel 6.5 pinned, but got the same nvidia.ko error.
Edit 2:
I tried pve-nvidia-vgpu-helper and then ran the installer again, with the same result:
Code:
# pve-nvidia-vgpu-helper setup
You are running the Proxmox kernel 6.8.12-11, searching the associated and newer kernel headers package.
All required packages are already installed.
All done, you can continue with the NVIDIA vGPU driver installation.
Code:
-> Kernel module compilation complete.
ERROR: Unable to load the kernel module 'nvidia.ko'. This happens most frequently when this kernel module was built against the wrong or improperly configured kernel sources, with a version of gcc that differs from the one used to build the target kernel, or if another driver, such as nouveau, is present and prevents the NVIDIA kernel module from obtaining ownership of the NVIDIA device(s), or no NVIDIA device installed in this system is supported by this NVIDIA Linux graphics driver release.
Please see the log entries 'Kernel module load error' and 'Kernel messages' at the end of the file '/var/log/nvidia-installer.log' for more information.
-> Kernel module load error: No such device
-> Kernel messages:
[ 2329.168270] perf: interrupt took too long (3168 > 3143), lowering kernel.perf_event_max_sample_rate to 63000
[ 3178.509846] perf: interrupt took too long (3968 > 3960), lowering kernel.perf_event_max_sample_rate to 50000
[ 4521.853629] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 4522.132360] nvidia-nvlink: Nvlink Core is being initialized, major device number 235
[ 4522.132379] NVRM: The NVIDIA probe routine was not called for 2 device(s).
[ 4522.135882] NVRM: This can occur when a driver such as:
NVRM: nouveau, rivafb, nvidiafb or rivatv
NVRM: was loaded and obtained ownership of the NVIDIA device(s).
[ 4522.135885] NVRM: Try unloading the conflicting kernel module (and/or
NVRM: reconfigure your kernel without the conflicting
NVRM: driver(s)), then try loading the NVIDIA kernel module
NVRM: again.
[ 4522.135887] NVRM: No NVIDIA devices probed.
[ 4522.136289] nvidia-nvlink: Unregistered Nvlink Core, major device number 235
[ 4707.208477] nvidia-nvlink: Nvlink Core is being initialized, major device number 235
[ 4707.208495] NVRM: The NVIDIA probe routine was not called for 2 device(s).
[ 4707.211751] NVRM: This can occur when a driver such as:
NVRM: nouveau, rivafb, nvidiafb or rivatv
NVRM: was loaded and obtained ownership of the NVIDIA device(s).
[ 4707.211754] NVRM: Try unloading the conflicting kernel module (and/or
NVRM: reconfigure your kernel without the conflicting
NVRM: driver(s)), then try loading the NVIDIA kernel module
NVRM: again.
[ 4707.211757] NVRM: No NVIDIA devices probed.
[ 4707.212082] nvidia-nvlink: Unregistered Nvlink Core, major device number 235