How-to: configure Mellanox ConnectX-3 cards for SR-IOV and VFs

I just tried installing the OFED drivers. I can't get it to build. I cleaned up all the errors caused by missing packages etc, but now it just says it can't build modules. This seems to be related to the newer kernel tree, but the newer drivers don't support the CX3 cards. I could try overriding that, but if they won't control the card, building the drivers isn't useful.

I did get SR-IOV to work with the in-kernel driver on a Threadripper 2950. I just added the module configuration from the OP and assigned it. The IOMMU gives each VF its own group on that machine. The machine I would like it to work on, an Epyc, does not: there the VFs are all in the same group, which explains the issues I had using passthrough. I'm not sure there is anything I can do about that. The BIOS/UEFI appears to be the latest version available from Supermicro.

I suppose I could buy a CX4 or newer, but I'm not sure it's worth the hassle or if it will behave better with the Epyc system.

Update: This helped get the IOMMU groups sorted out. https://www.supermicro.com/support/faqs/faq.cfm?faq=31883 After adding a couple of settings I was missing, the groups broke up and I was able to use SR-IOV.
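
To check whether each VF really gets its own IOMMU group (the difference between the Threadripper and the Epyc described above), the groups can be listed straight from sysfs. This is only a minimal sketch: the function name `list_iommu_groups` and its optional root argument are made up for illustration, and it assumes the usual `/sys/kernel/iommu_groups/<group>/devices/<pci-address>` layout.

```shell
#!/bin/sh
# Sketch: print "group <n>: <pci-address>" for every device in an IOMMU group.
# The optional first argument is a hypothetical override of the sysfs path,
# so the function can also be pointed at a copy of the tree.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        # Skip the unexpanded glob when no groups exist (IOMMU disabled).
        [ -e "$dev" ] || continue
        group="${dev#"$root"/}"   # strip the leading path
        group="${group%%/*}"      # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done
}
```

Running `list_iommu_groups | sort -V` before and after changing the BIOS settings should show whether the VFs moved into separate groups. If nothing is printed, the IOMMU is probably not enabled at all.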
 
Updated instructions for newer versions of everything

* Be careful below: you need specific versions of things. Really for tinkerers only
* Search the install scripts for argument options and parameters -- search term
* This will build ALL debs

1. Go here and download according to your needs: https://network.nvidia.com/products/infiniband-drivers/linux/mlnx_ofed/
2. Accept the EULA and scp the file to the server
3. Untar: tar -xzvf <filename> (xtract ze vuckin file)
4. cd to the folder and run ./mlnxofedinstall --add-kernel-support --skip-distro-check
5. This will produce the output below in the DEBS folder
6. Notice the updated versions of things

including mft_4.30.1-1210_amd64.deb (used in above instructions) and additional tooling and OFED drivers


Code:
fwctl-dkms_24.10.OFED.24.10.3.2.5.1-1_all.deb              libopensm_5.21.12.MLNX20250617.f74e01b8-0.1.2410325_amd64.deb        mlnx-ofed-kernel-utils_24.10.OFED.24.10.3.2.5.1-1_amd64.deb
ibacm_2410mlnx54-1.2410068_amd64.deb                  libopensm-devel_5.21.12.MLNX20250617.f74e01b8-0.1.2410325_amd64.deb  mlnx-ofed-vma_24.10-3.2.5.0_all.deb
ibarr_0.1.3-1.2410068_amd64.deb                      librdmacm1_2410mlnx54-1.2410068_amd64.deb                   mlnx-ofed-vma-eth_24.10-3.2.5.0_all.deb
ibdump_6.0.0-1.2410068_amd64.deb                  librdmacm1-dbg_2410mlnx54-1.2410068_amd64.deb                   mlnx-ofed-vma-eth-user-only_24.10-3.2.5.0_all.deb
ibsim_0.12-1.2410068_amd64.deb                      librdmacm-dev_2410mlnx54-1.2410068_amd64.deb                   mlnx-ofed-vma-user-only_24.10-3.2.5.0_all.deb
ibsim-doc_0.12-1.2410068_all.deb                  libxpmem0_2.7-0.2310055_amd64.deb                       mlnx-ofed-vma-vpi_24.10-3.2.5.0_all.deb
ibutils2_2.1.1-0.21905.MLNX20250604.g53bdce92c.2410325_amd64.deb  libxpmem-dev_2.7-0.2310055_amd64.deb                       mlnx-ofed-vma-vpi-user-only_24.10-3.2.5.0_all.deb
ibverbs-providers_2410mlnx54-1.2410068_amd64.deb          mft_4.30.1-1210_amd64.deb                           mlnx-ofed-xlio_24.10-3.2.5.0_all.deb
ibverbs-utils_2410mlnx54-1.2410068_amd64.deb              mlnx-ethtool_6.9-1.2410068_amd64.deb                       mlnx-ofed-xlio-user-only_24.10-3.2.5.0_all.deb
infiniband-diags_2410mlnx54-1.2410068_amd64.deb              mlnx-fw-updater_24.10-3.2.5.0_amd64.deb                   mlnx-tools_24.10-0.2410068_amd64.deb
iser-dkms_24.10.OFED.24.10.3.2.5.1-1_all.deb              mlnx-iproute2_6.10.0-1.2410325_amd64.deb                   mlx-steering-dump_1.0.0-0.2410068_all.deb
isert-dkms_24.10.OFED.24.10.3.2.5.1-1_all.deb              mlnx-nvme-dkms_24.10.OFED.24.10.3.2.5.1-1_all.deb               ofed-scripts_24.10.OFED.24.10.3.2.5-1_amd64.deb
kernel-mft-dkms_4.30.1.1210-1_all.deb                  mlnx-ofed-all_24.10-3.2.5.0_all.deb                       opensm_5.21.12.MLNX20250617.f74e01b8-0.1.2410325_amd64.deb
knem_1.1.4.90mlnx3-OFED.23.10.0.2.1.1_amd64.deb              mlnx-ofed-all-exact_24.10-3.2.5.0_all.deb                   opensm-doc_5.21.12.MLNX20250617.f74e01b8-0.1.2410325_amd64.deb
knem-dkms_1.1.4.90mlnx3-OFED.23.10.0.2.1.1_all.deb          mlnx-ofed-all-user-only_24.10-3.2.5.0_all.deb                   Packages
libibmad5_2410mlnx54-1.2410068_amd64.deb              mlnx-ofed-basic_24.10-3.2.5.0_all.deb                       Packages.bz2
libibmad5-dbg_2410mlnx54-1.2410068_amd64.deb              mlnx-ofed-basic-exact_24.10-3.2.5.0_all.deb                   perftest_24.10.0-0.95.g370212b.2410325_amd64.deb
libibmad-dev_2410mlnx54-1.2410068_amd64.deb              mlnx-ofed-basic-user-only_24.10-3.2.5.0_all.deb               python3-pyverbs_2410mlnx54-1.2410068_amd64.deb
libibnetdisc5_2410mlnx54-1.2410068_amd64.deb              mlnx-ofed-bluefield_24.10-3.2.5.0_all.deb                   rdmacm-utils_2410mlnx54-1.2410068_amd64.deb
libibnetdisc5-dbg_2410mlnx54-1.2410068_amd64.deb          mlnx-ofed-bluefield-user-only_24.10-3.2.5.0_all.deb               rdma-core_2410mlnx54-1.2410068_amd64.deb
libibnetdisc-dev_2410mlnx54-1.2410068_amd64.deb              mlnx-ofed-dpdk_24.10-3.2.5.0_all.deb                       Release
libibumad3_2410mlnx54-1.2410068_amd64.deb              mlnx-ofed-dpdk-user-only_24.10-3.2.5.0_all.deb               Release.gpg
libibumad3-dbg_2410mlnx54-1.2410068_amd64.deb              mlnx-ofed-eth-only-user-only_24.10-3.2.5.0_all.deb               rshim_2.1.14-0.g0f95837.2410325_amd64.deb
libibumad-dev_2410mlnx54-1.2410068_amd64.deb              mlnx-ofed-hpc_24.10-3.2.5.0_all.deb                       sharp_3.9.1.MLNX20250604.25aad3d5-1.2410325_amd64.deb
libibverbs1_2410mlnx54-1.2410068_amd64.deb              mlnx-ofed-hpc-user-only_24.10-3.2.5.0_all.deb                   sockperf_3.10-0.git5ebd327da983.2410068_amd64.deb
libibverbs1-dbg_2410mlnx54-1.2410068_amd64.deb              mlnx-ofed-kernel-dkms_24.10.OFED.24.10.3.2.5.1-1_all.deb           srp-dkms_24.10.OFED.24.10.3.2.5.1-1_all.deb
libibverbs-dev_2410mlnx54-1.2410068_amd64.deb              mlnx-ofed-kernel-only_24.10-3.2.5.0_all.deb                   srptools_2410mlnx54-1.2410068_amd64.deb
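
The untar and build steps above could be wrapped in a small script, roughly like this. It is a sketch, not a tested installer: the helper names `run` and `build_ofed_debs`, the `DRY_RUN` switch, and the assumption that the archive ends in `.tgz` and unpacks into a folder named after itself are all mine. Only the `tar` and `mlnxofedinstall` command lines come from the steps.

```shell
#!/bin/sh
# run: execute a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# build_ofed_debs <tarball>: unpack the downloaded archive and start the build.
build_ofed_debs() {
    tarball=$1
    # Assumption: foo.tgz unpacks into ./foo
    dir="${tarball%.tgz}"
    run tar -xzvf "$tarball" || return 1
    run cd "$dir" || return 1
    run ./mlnxofedinstall --add-kernel-support --skip-distro-check
}
```

With `DRY_RUN=1 build_ofed_debs <filename>` the three commands are only printed, which is a cheap way to check the paths before letting the build run for real.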




You guys want https://doc.dpdk.org/guides/nics/mlx4.html section 38.4.2.1.

This will also give you two separate pci interfaces to passthrough instead of one, very cool.

Always check the data plane project first when dealing with network controllers.

This looks promising! I'm in a similar boat - trying to remember what I did years ago and trying to sort out what in the world the naming convention is supposed to be.

Do the changes to /etc/modprobe.d/mlx4_core.conf go away then, or can you still issue something like num_vfs=2,2,0 to get additional virtual ports?

What's the best way to change over?
 
What I've done is use a service to turn on VFs upon startup:

Code:
nano /etc/systemd/system/mlx5_vfs.service

I then pasted the following config:
Code:
[Unit]
Description=Enable SR-IOV and detach guest VFs from host
Requires=network-online.target
After=network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
# Create NIC VFs
ExecStart=/usr/bin/bash -c 'echo 8 > /sys/class/net/ens7np0/device/sriov_numvfs'
# Set static MACs for VFs
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens7np0 vf 0 mac 88:e9:a4:4b:d4:10'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens7np0 vf 1 mac 88:e9:a4:4b:d4:11'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens7np0 vf 2 mac 88:e9:a4:4b:d4:12'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens7np0 vf 3 mac 88:e9:a4:4b:d4:13'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens7np0 vf 4 mac 88:e9:a4:4b:d4:14'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens7np0 vf 5 mac 88:e9:a4:4b:d4:15'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens7np0 vf 6 mac 88:e9:a4:4b:d4:16'
ExecStart=/usr/bin/bash -c '/usr/bin/ip link set ens7np0 vf 7 mac 88:e9:a4:4b:d4:17'
# Detach VFs from host
ExecStart=/usr/bin/bash -c 'echo 0000:82:00.1 > /sys/bus/pci/devices/0000\\:82\\:00.1/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:82:00.2 > /sys/bus/pci/devices/0000\\:82\\:00.2/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:82:00.3 > /sys/bus/pci/devices/0000\\:82\\:00.3/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:82:00.4 > /sys/bus/pci/devices/0000\\:82\\:00.4/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:82:00.5 > /sys/bus/pci/devices/0000\\:82\\:00.5/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:82:00.6 > /sys/bus/pci/devices/0000\\:82\\:00.6/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:82:00.7 > /sys/bus/pci/devices/0000\\:82\\:00.7/driver/unbind'
ExecStart=/usr/bin/bash -c 'echo 0000:82:01.0 > /sys/bus/pci/devices/0000\\:82\\:01.0/driver/unbind'

[Install]
WantedBy=multi-user.target

0000:82:00 is my card's address, ens7np0 is the interface name (I'm using a ConnectX-5), and I create 8 VFs by echoing 8 into sriov_numvfs.
Code:
82:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]

Then enable and start the service:
Code:
systemctl enable mlx5_vfs.service
systemctl start mlx5_vfs.service
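
The repeated ExecStart lines in the unit could also be collapsed into a loop in a helper script. The following is a sketch under a few assumptions: the function names and the `SYSFS_ROOT` override are hypothetical, the MAC scheme (a fixed prefix plus a last octet counting up from 0x10) just mirrors the addresses used in the unit, and the VF PCI addresses are read from the `virtfnN` symlinks the kernel creates under the PF's device directory instead of being hard-coded.

```shell
#!/bin/sh
# Hypothetical override so the helpers can be tried against a fake sysfs tree.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# vf_mac <prefix> <index>: first five octets plus a last octet of 0x10 + index.
vf_mac() {
    printf '%s:%02x\n' "$1" "$((16 + $2))"
}

# vf_addrs <iface>: print the PCI address of each VF, one per line.
vf_addrs() {
    for link in "$SYSFS_ROOT/class/net/$1/device"/virtfn*; do
        [ -e "$link" ] || continue
        basename "$(readlink -f "$link")"
    done
}

# setup_vfs <iface> <count> <mac-prefix>
# Assumes no VFs exist yet (fresh boot); needs root.
setup_vfs() {
    iface=$1 count=$2 prefix=$3
    # Create the VFs
    echo "$count" > "$SYSFS_ROOT/class/net/$iface/device/sriov_numvfs"
    # Set static MACs for the VFs
    i=0
    while [ "$i" -lt "$count" ]; do
        ip link set "$iface" vf "$i" mac "$(vf_mac "$prefix" "$i")"
        i=$((i + 1))
    done
    # Detach the VFs from the host driver, if one is bound
    for addr in $(vf_addrs "$iface"); do
        unbind="$SYSFS_ROOT/bus/pci/devices/$addr/driver/unbind"
        [ -e "$unbind" ] && echo "$addr" > "$unbind"
    done
}

# Example call matching the unit above (left commented out on purpose):
# setup_vfs ens7np0 8 88:e9:a4:4b:d4
```

Saved somewhere like /usr/local/sbin (path is just an example) with the last line uncommented, the unit would then need only a single ExecStart pointing at the script.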
 
The original guide worked for me when my primary (Dell R420) PVE server was on Proxmox 8 (8.3, iirc). I've since upgraded that server to PVE 9 and it's still working, but I've noticed some errant behavior that might be related to this. Since then, I've been beating my head against getting a second server (also a Dell R420) working with the same model of ConnectX-3 card in it.

Bash:
root@pve2:~# apt-get install gcc make dkms pve-headers
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
gcc is already the newest version (4:14.2.0-1).
make is already the newest version (4.4.1-2).
dkms is already the newest version (3.2.2-1~deb13u1).
pve-headers is already the newest version (9.1.0).
0 upgraded, 0 newly installed, 0 to remove and 1 not upgraded.
1 not fully installed or removed.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n] y
Setting up kernel-mft-dkms (4.22.1-11) ...
Removing old kernel-mft-dkms-4.22.1 DKMS files...
Deleting module kernel-mft-dkms/4.22.1 completely from the DKMS tree.
Loading new kernel-mft-dkms-4.22.1 DKMS files...
First Installation: checking all kernels...
Building only for 6.17.4-2-pve
Building for architecture amd64
Building initial module for 6.17.4-2-pve

Error! Bad return status for module build on kernel: 6.17.4-2-pve (amd64)
Consult /var/lib/dkms/kernel-mft-dkms/4.22.1/build/make.log for more information.
dpkg: error processing package kernel-mft-dkms (--configure):
 installed kernel-mft-dkms package post-installation script subprocess returned error exit status 10
Errors were encountered while processing:
 kernel-mft-dkms
E: Sub-process /usr/bin/dpkg returned an error code (1)

I've been poking at this but I'm not sure what to do from here. I believe it's telling me something isn't compatible with the kernel, maybe? The "not fully installed or removed" state seems to need resolving, but how do I go about cleaning that up so it can proceed?