Installing doca-ofed on Proxmox 9.0

kevindd992002

So I'm trying to get SR-IOV working on my CX4-LX card on my new PVE 9.0 install. Based on this official Nvidia article, I need to make sure that the OFED drivers are installed on both the host and the guest.

So I installed doca-ofed by following another official Nvidia article. Here are the commands I ran before the reboot:

https://gist.github.com/kevindd992002/b030faffb74ffb34d333552cab65e19a

From what I can see, everything succeeded. After rebooting, though, I get an error message saying dkms.service cannot start. Here are some logs for the service:

https://gist.github.com/kevindd992002/928db5c0ac937f565a6035a885ff0752

The weird thing here is that I can create the VFs without any issues. Do you guys have any ideas on how to fix this?
 
dkms is bombing out because you have two modules targeting the same thing (note the amd64 and x86_64 entries for each module):

Code:
kernel-mft-dkms/4.33.0.169, 6.14.11-3-pve, amd64: installed
kernel-mft-dkms/4.33.0.169, 6.14.11-3-pve, x86_64: built
knem/1.1.4.90mlnx3, 6.14.11-3-pve, amd64: installed
knem/1.1.4.90mlnx3, 6.14.11-3-pve, x86_64: built

Code:
Oct 03 03:19:07 pve dkms[2201]: Module /lib/modules/6.14.11-3-pve/updates/dkms/mst_pci.ko already installed (unversioned module), override by specifying --force
Oct 03 03:19:08 pve dkms[2201]: Module /lib/modules/6.14.11-3-pve/updates/dkms/mst_pciconf.ko already installed (unversioned module), override by specifying --force
Oct 03 03:19:08 pve dkms[2319]: Module /lib/modules/6.14.11-3-pve/updates/dkms/knem.ko already installed at version 1.1.4.90mlnx3, override by specifying --force
 
Why though? I haven't done anything yet aside from running `apt install -y doca-ofed` and `apt install -y mlnx-fw-updater`, which are steps in the article.

If all I want is Ethernet, is it better to just use the in-tree mlx5 driver in Proxmox?
 
It's tough to say, honestly. It's likely either a hodgepodge mess in the Nvidia libraries or an apt config mishap where amd64 and x86_64 are both being requested and somehow both fulfilled. You can fix it by removing one of the two sets from dkms's purview.
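To see which entries are doubled up before removing anything, here is a rough sketch that scans `dkms status`-style lines for the same module and kernel listed under more than one architecture string. The function name `find_dup_dkms` is made up for illustration, and the sample input is just the status lines quoted above; on a real host you would pipe `dkms status` into it instead.

```shell
#!/bin/sh
# Hypothetical helper (not part of dkms): list module/kernel pairs that
# appear under more than one architecture string, e.g. amd64 and x86_64.
find_dup_dkms() {
    # Expected input format, one entry per line:
    #   knem/1.1.4.90mlnx3, 6.14.11-3-pve, amd64: installed
    awk -F', *' '
        NF >= 3 {
            split($3, a, ":")              # a[1] = arch, remainder = state
            key = $1 " " $2                # module/version + kernel
            if (!((key, a[1]) in seen)) {
                seen[key, a[1]] = 1
                count[key]++
                if (key in archs) archs[key] = archs[key] "," a[1]
                else              archs[key] = a[1]
            }
        }
        END {
            for (key in count)
                if (count[key] > 1)
                    print key " " archs[key]
        }
    ' | sort
}

# Sample input taken from the status output quoted in this thread.
find_dup_dkms <<'EOF'
kernel-mft-dkms/4.33.0.169, 6.14.11-3-pve, amd64: installed
kernel-mft-dkms/4.33.0.169, 6.14.11-3-pve, x86_64: built
knem/1.1.4.90mlnx3, 6.14.11-3-pve, amd64: installed
knem/1.1.4.90mlnx3, 6.14.11-3-pve, x86_64: built
EOF
```

Each line it prints is a candidate for `dkms remove <module>/<version> -k <kernel>`. Check `man dkms` on your host for how to target a single architecture before running that, since I haven't verified the exact form on PVE 9.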
 
DOCA 3.1 does not support Debian trixie. I had all sorts of issues, so I reverted to Proxmox 8.4 and am waiting for DOCA 3.2, which is expected to support it. Do not use the Ubuntu drivers; the libraries are not fully supported.