Issue with nvidia driver install for passthrough

Jules Peeters

New Member
Apr 6, 2022
Hi,

I've been trying to set up passthrough. I built a new server and had it working before, but it's been a while.
I'm stuck installing the downloaded NVIDIA driver; I keep getting "ERROR: Unable to load the kernel module 'nvidia.ko'".

If anyone could give me an idea of what I should do, that'd be greatly appreciated!


I've tried different versions of the NVIDIA installer and they all failed the same way.

The card is a Quadro P400.
The rest of the hardware is a 3700X with an ASRock X470D4U motherboard.

My PVE version is: pve-manager/7.1-12/b3c09de3 (running kernel: 5.13.19-2-pve)

The kernel headers folder:
root@pve:/usr/src# ls
linux-headers-5.13.19-6-pve

This is the command I've been running:

./NVIDIA-Linux-x86_64-495.44.run --kernel-source-path /usr/src/linux-headers-5.13.19-6-pve/

I confirmed that there are no kernel drivers in use:


2b:00.0 VGA compatible controller: NVIDIA Corporation GP107GL [Quadro P400] (rev a1)
Subsystem: Dell GP107GL [Quadro P400]
Kernel modules: nvidiafb, nouveau


But still the installation errors out with:


ERROR: Unable to load the kernel module 'nvidia.ko'. This happens most frequently when this kernel module was built against the wrong or improperly configured kernel sources, with a version of gcc that differs from the one used to
build the target kernel, or if another driver, such as nouveau, is present and prevents the NVIDIA kernel module from obtaining ownership of the NVIDIA device(s), or no NVIDIA device installed in this system is supported by this
NVIDIA Linux graphics driver release.

Please see the log entries 'Kernel module load error' and 'Kernel messages' at the end of the file '/var/log/nvidia-installer.log' for more information.



This is the last bit of the installer.log:


-> done.
-> Kernel module compilation complete.
ERROR: Unable to load the kernel module 'nvidia.ko'. This happens most frequently when this kernel module was built against the wrong or improperly configured kernel sources, with a version of gcc that differs from the one used to build t>

Please see the log entries 'Kernel module load error' and 'Kernel messages' at the end of the file '/var/log/nvidia-installer.log' for more information.
-> Kernel module load error: Exec format error
-> Kernel messages:
[ 13.251546] igb 0000:23:00.0 enp35s0: igb: enp35s0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[ 13.254410] scsi 12:0:0:0: Direct-Access AMI Virtual Floppy0 1.00 PQ: 0 ANSI: 0 CCS
[ 13.254597] sd 12:0:0:0: Attached scsi generic sg2 type 0
[ 13.299406] sd 12:0:0:0: [sdb] Attached SCSI removable disk
[ 13.359193] vmbr0: port 1(enp35s0) entered blocking state
[ 13.359198] vmbr0: port 1(enp35s0) entered forwarding state
[ 13.359455] IPv6: ADDRCONF(NETDEV_CHANGE): vmbr0: link becomes ready
[ 14.274799] scsi 13:0:0:0: Direct-Access AMI Virtual HDisk0 1.00 PQ: 0 ANSI: 0 CCS
[ 14.274958] sd 13:0:0:0: Attached scsi generic sg3 type 0
[ 14.357962] sd 13:0:0:0: [sdc] Attached SCSI removable disk
[ 179.128839] systemd[1]: systemd 247.3-7 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +ZSTD +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hie>
[ 179.145093] systemd[1]: Detected architecture x86-64.
[ 179.336871] systemd-journald[550]: Received SIGTERM from PID 1 (systemd).
[ 179.336946] systemd[1]: Stopping Journal Service...
[ 179.345427] systemd[1]: systemd-journald.service: Succeeded.
[ 179.345561] systemd[1]: Stopped Journal Service.
[ 179.346781] systemd[1]: Starting Journal Service...
[ 179.361831] systemd[1]: Started Journal Service.
[ 179.365471] systemd-journald[3706]: Received client request to flush runtime journal.
[ 213.700877] kauditd_printk_skb: 8 callbacks suppressed
[ 213.700880] audit: type=1400 audit(1649280885.411:20): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/sbin/chronyd" pid=14159 comm="apparmor_parser"
[ 456.902103] VFIO - User Level meta-driver version: 0.3
[ 456.917023] nvidia: disagrees about version of symbol module_layout
[ 708.261361] nvidia: disagrees about version of symbol module_layout
[ 1418.470654] nvidia: disagrees about version of symbol module_layout
 
I have no experience with this, but: you are running kernel 5.13.19-2-pve and building with linux-headers-5.13.19-6-pve. Maybe the module will load when you (install and) run the kernel version that the module was built for?
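
The log above points the same way: "disagrees about version of symbol module_layout" together with "Exec format error" is typically what the kernel reports when a module was built against headers for a different kernel. A minimal sketch of a pre-install check, assuming the headers live under /usr/src/linux-headers-<release> as in the listing above. The helper name headers_match is made up for illustration and is not part of any Proxmox or NVIDIA tooling:

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the header directory name
# corresponds to the given kernel release string.
headers_match() {
    release="$1"      # e.g. the output of `uname -r`
    headers_dir="$2"  # e.g. /usr/src/linux-headers-5.13.19-6-pve/
    # Drop a trailing slash, then the leading directories, then the prefix.
    name="${headers_dir%/}"
    name="${name##*/}"
    [ "${name#linux-headers-}" = "$release" ]
}

# Values taken from this thread: the host runs 5.13.19-2-pve,
# but the installer was pointed at headers for 5.13.19-6-pve.
if headers_match "5.13.19-2-pve" "/usr/src/linux-headers-5.13.19-6-pve/"; then
    echo "match"
else
    echo "mismatch"
fi
```

With the values from this thread it prints "mismatch". On a live host, pass "$(uname -r)" as the first argument instead of the hard-coded release.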
 
Why do you have drivers and want to do passthrough? Those two are mutually exclusive. If you just want to passthrough hardware, just do it. It does not require any drivers on the PVE side, the hardware will be disconnected anyhow.
 
Regarding drivers and passthrough being mutually exclusive: to my understanding, most guides require installing the NVIDIA drivers on the host for this to work. I'm trying to get it to work inside an LXC container.
 
Regarding the kernel and header mismatch: this is the version it pulled in by itself when I ran apt install pve-headers. I'll see if I can pull that specific version.
 

Instead of downgrading the headers, you should investigate why your kernel is not on that same (5.13.19-6-pve) version. (Assuming your host is up-to-date as of now.)
Maybe your host simply needs a reboot?
What is the output of apt list --installed | grep pve-kernel?
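
If a reboot does bring the host up on the newer kernel, one way to avoid hard-coding the header path is to derive it from the running kernel. A rough sketch, assuming the same /usr/src/linux-headers-<release> layout and the installer file name from the first post. The helper name kernel_source_path is hypothetical, and the script only prints the command instead of running it:

```shell
#!/bin/sh
# Hypothetical helper: build the header path for a given kernel release.
kernel_source_path() {
    printf '/usr/src/linux-headers-%s\n' "$1"
}

# Use the kernel that is actually running right now.
src="$(kernel_source_path "$(uname -r)")"

if [ -d "$src" ]; then
    # Print the installer invocation so it can be reviewed before use.
    echo "./NVIDIA-Linux-x86_64-495.44.run --kernel-source-path $src"
else
    echo "No headers found at $src for the running kernel" >&2
fi
```

Printing rather than executing keeps the check harmless if the headers for the running kernel turn out to be missing.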
 
The output of apt list --installed | grep pve-kernel is:

pve-kernel-5.13.19-2-pve/stable,now 5.13.19-4 amd64 [installed]
pve-kernel-5.13.19-6-pve/stable,now 5.13.19-15 amd64 [installed,automatic]
pve-kernel-5.13/stable,now 7.1-9 all [installed]
pve-kernel-helper/stable,now 7.1-14 all [installed]
 
