Mellanox OFED (MLNX_OFED) Software with PVE 7.0-2 and/or 6.4-4

MoreDakka

Active Member
May 2, 2019
Hey,

So we have some Mellanox 40g cards in our eventual 8-node Proxmox cluster (4 compute, 4 storage).
I'm working with one node to try to get everything running before moving on to the other 7, and I'm getting stuck at the 40g card and the Mellanox drivers.
It seems that development of those drivers stopped at Debian 10.8.

PVE 6.4-4 is Debian 10.9
PVE 7.0-2 is Debian 11

Code:
root@pve1-cpu1:~# lspci | grep Mellanox
05:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
root@pve1-cpu1:~#

How can I get these cards working with a current PVE version?

Thanks!
 
Hi,

root@pve1-cpu1:~# lspci | grep Mellanox
05:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
There's mainline-kernel support for this card family in general via the mlx4 driver, or doesn't that work out?

FWIW, the mlx5 driver for the 100G card family works well for us here.
Code:
# lspci |grep -i mellanox
41:00.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
41:00.1 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
81:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
81:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
# lsmod | grep mlx
mlx5_ib               331776  0
ib_uverbs             147456  1 mlx5_ib
ib_core               360448  6 rdma_cm,iw_cm,ib_iser,ib_uverbs,mlx5_ib,ib_cm
mlx5_core            1126400  1 mlx5_ib
mlxfw                  32768  1 mlx5_core
tls                    94208  2 bonding,mlx5_core
pci_hyperv_intf        16384  1 mlx5_core

And the kernel code mentions support for your card family in mlx4:
https://git.proxmox.com/?p=mirror_u...t/ethernet/mellanox/mlx4/main.c;hb=HEAD#l4273

Otherwise, I figure you mean the following driver:
https://www.mellanox.com/products/ethernet-drivers/linux/mlnx_en

Here it may be good to know that Proxmox VE bases its kernel off the Ubuntu kernel, so it may be more suitable to select that when downloading the driver.
For Proxmox VE 6.4 that would be the 5.4-based kernel of Ubuntu 20.04.
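
To quickly check whether the in-tree mlx4 driver already picked the card up, something like the following should do (the interface name enp5s0 is just a placeholder for whatever your ConnectX-3 shows up as):
Code:
# confirm the mlx4 modules are loaded
lsmod | grep mlx4
# if the Ethernet part is missing, load it explicitly
modprobe mlx4_en
# which driver and firmware the interface reports
ethtool -i enp5s0
# kernel messages from the mlx4 driver
dmesg | grep -i mlx4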
 
Sorry for the delayed response here, and I appreciate you helping me out.
I did get Proxmox on Debian 11 to recognize the 40g adapters, but they are painfully slow (we have 8 nodes with dual 40g adapters on separate switches):

Code:
# iperf -s 192.168.1.84
iperf: ignoring extra argument -- 192.168.1.84
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 128 KByte (default)
------------------------------------------------------------
[ 4] local 192.168.1.84 port 5001 connected with 192.168.1.83 port 60770
[ ID] Interval Transfer Bandwidth
[ 4] 0.0000-10.0029 sec 8.18 GBytes 7.02 Gbits/sec
[ 5] local 192.168.1.84 port 5001 connected with 192.168.1.83 port 60772
[ ID] Interval Transfer Bandwidth
[ 5] 0.0000-60.0037 sec 51.8 GBytes 7.41 Gbits/sec

I thought I would try multiple parallel streams:

Code:
# iperf -c 192.168.1.84 -t 60 -P 4
------------------------------------------------------------
Client connecting to 192.168.1.84, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[ 4] local 192.168.1.91 port 34460 connected with 192.168.1.84 port 5001
[ 5] local 192.168.1.91 port 34462 connected with 192.168.1.84 port 5001
[ 3] local 192.168.1.91 port 34458 connected with 192.168.1.84 port 5001
[ 6] local 192.168.1.91 port 34464 connected with 192.168.1.84 port 5001
[ ID] Interval Transfer Bandwidth
[ 4] 0.0000-60.0002 sec 11.6 GBytes 1.66 Gbits/sec
[ 5] 0.0000-60.0101 sec 7.58 GBytes 1.08 Gbits/sec
[ 6] 0.0000-60.0006 sec 11.3 GBytes 1.61 Gbits/sec
[ 3] 0.0000-60.0004 sec 11.7 GBytes 1.68 Gbits/sec
[SUM] 0.0000-60.0005 sec 42.2 GBytes 6.04 Gbits/sec
[ CT] final connect times (min/avg/max/stdev) = 0.912/257.390/1026.660/512.857 ms (tot/err) = 4/0

Just for the fun of it, and to make sure there wasn't something odd with the OS, I tested the 10g adapter (each node has dual GigE, dual 10g and dual 40g):

Code:
# iperf -s 10.10.1.84
iperf: ignoring extra argument -- 10.10.1.84
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 128 KByte (default)
------------------------------------------------------------
[ 4] local 10.10.1.84 port 5001 connected with 10.10.1.83 port 40006
[ ID] Interval Transfer Bandwidth
[ 4] 0.0000-60.0027 sec 65.8 GBytes 9.41 Gbits/sec

40g NICs:
Code:
*-network
description: interface
product: MT27500 Family [ConnectX-3]
vendor: Mellanox Technologies

10g NICs:
Code:
*-network:1
description: Ethernet interface
product: 82599ES 10-Gigabit SFI/SFP+ Network Connection

I tried to change the 40g cards to eth mode, but they wouldn't accept any IP addresses afterwards (echo eth > /sys/bus/pci/devices/0000\:06\:00.0/mlx4_port1).
I'm guessing that having MLNX_OFED installed, so we can change to eth mode properly, might help. Same with proper drivers for Debian 11.
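
For reference, this is roughly what I was attempting (runtime switch via sysfs), plus what I understand would be the firmware-persistent way, assuming the mstflint package is installed; I haven't verified the exact values, so treat it as a sketch:
Code:
# runtime switch via sysfs (mlx4 in-tree driver), lost again when the module reloads
echo eth > /sys/bus/pci/devices/0000:06:00.0/mlx4_port1
echo eth > /sys/bus/pci/devices/0000:06:00.0/mlx4_port2

# persistent switch in the card firmware (2 = Ethernet), reboot afterwards
mstconfig -d 06:00.0 set LINK_TYPE_P1=2 LINK_TYPE_P2=2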

I dunno, I'm grasping at straws to get this Ceph network running at 40g speeds. Do I need to downgrade to Debian 10.8 to get the drivers? A manual install of Proxmox 7?

Thanks!
 
Any ideas whether installing Proxmox on Debian 10.8 would be the best course of action for getting Ceph working at 40g instead of 7Gb/s?
Proxmox 7 has a bunch of nice features that I would like to keep using; will it work with Debian 10.8?
 
We had the same problem occur when upgrading to 6.4 with different host machines, all using 40Gb Mellanox cards. Ceph was constantly losing OSDs, VMs were not responding, etc.

We got it working by changing the InfiniBand mode from connected mode to datagram mode on the VM machines (because we can't update the cards in those old Sun servers). For the Ceph machines we changed to 56Gb cards, which are working fine.

Haven't moved to 7 yet though.
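
If it helps anyone, the connected-to-datagram switch is roughly the following on the host side; the interface name ib0 and the address are only placeholders, so adapt to your setup:
Code:
# check the current IPoIB mode
cat /sys/class/net/ib0/mode

# switch to datagram mode at runtime
echo datagram > /sys/class/net/ib0/mode

# persistent variant in /etc/network/interfaces
auto ib0
iface ib0 inet static
    address 192.168.1.84/24
    pre-up echo datagram > /sys/class/net/ib0/mode
    # datagram mode caps the IPoIB MTU (typically 2044, up to 4092 with a 4K IB MTU)
    mtu 2044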
 
What's your network setup? Are you running in Ethernet or InfiniBand mode? What's the hardware (exact card models & switches) and what firmware versions do you run?

I've had similar issues when I switched a network from QDR InfiniBand (which was working just fine) to 40GE, which initially showed really poor performance.
 
Any luck with installing Mellanox OFED on Proxmox 7?

We have many ConnectX-4 25Gb cards that perform poorly on Proxmox 7...
 
We have many ConnectX-4 25Gb cards that perform poorly on Proxmox 7...
With Mellanox cards, I would always check if there are new firmware versions available. https://www.mellanox.com/support/firmware/mlxup-mft

If you choose Linux -> x86 you get the link directly to the binary. After verifying the checksum, you will have to make it executable. It should detect the cards automatically and, if you let it, download and install the new firmware versions. A reboot is needed afterwards.
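
Roughly, the steps look like this once the binary is downloaded (the filename mlxup and the --query flag are from memory, so double-check against the tool's --help):
Code:
# compare against the checksum published on the download page
sha256sum mlxup
chmod +x mlxup
# list detected Mellanox devices and their firmware versions
./mlxup --query
# interactively download and flash newer firmware where available
./mlxup
reboot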
 
With Mellanox cards, I would always check if there are new firmware versions available. https://www.mellanox.com/support/firmware/mlxup-mft

If you choose Linux -> x86 you get the link directly to the binary. After verifying the checksum, you will have to make it executable. It should detect the cards automatically and, if you let it, download and install the new firmware versions. A reboot is needed afterwards.
Thank you for the quick reply!

I'll check the firmware, but would also like to know if it's currently possible to install OFED drivers on Proxmox 7. Would OFED for Ubuntu 21 work?
 
If you're using the cards just for regular Ethernet stuff, in-tree drivers generally work OK. OFED seems to be too distribution/version-specific, and maintaining it through OS upgrades can be quite a pita. I always prefer the in-tree drivers for simplicity, unless I need some fancy feature.

@ectoplasmosis check what Mellanox requires for your specific case / cards. In my case (again, really poor performance with 40GE) I didn't have flow control enabled on the switches. Just enabling it solved all the performance issues I'd had.
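
For the host side, checking and enabling pause frames is just ethtool; enp5s0 is a placeholder for your 40g interface, and the switch ports obviously have to be configured to match:
Code:
# show the current pause-frame (flow control) settings
ethtool -a enp5s0
# enable RX/TX pause frames on the NIC
ethtool -A enp5s0 rx on tx on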
 
If you're using the cards just for regular Ethernet stuff, in-tree drivers generally work OK. OFED seems to be too distribution/version-specific, and maintaining it through OS upgrades can be quite a pita. I always prefer the in-tree drivers for simplicity, unless I need some fancy feature.

@ectoplasmosis check what Mellanox requires for your specific case / cards. In my case (again, really poor performance with 40GE) I didn't have flow control enabled on the switches. Just enabling it solved all the performance issues I'd had.
Thanks for the tips.

We make heavy use of SR-IOV, which seems to work better when using current OFED drivers.

With regards to performance, pause-frame-based flow control and PFC are disabled across our network, so we are keen to experiment with drivers.
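
For what it's worth, the in-tree mlx5 driver can also spawn VFs without OFED; a rough sketch, assuming the 25Gb port shows up as enp65s0f0 and SR-IOV is already enabled in the card firmware (SRIOV_EN / NUM_OF_VFS via mlxconfig):
Code:
# how many VFs the device supports
cat /sys/class/net/enp65s0f0/device/sriov_totalvfs
# create 4 virtual functions
echo 4 > /sys/class/net/enp65s0f0/device/sriov_numvfs
# the VFs appear as additional PCI functions
lspci | grep -i mellanox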
 
What I've found out is that these cards, or at least the 40GE CX-3 Pro, require flow control, no matter which drivers you use.

Regarding OFED: in my experience MLNX_OFED works if it has support for the running kernel version. You have to use --skip-distro-check (or something to this effect), but it does compile against PVE, which OS-wise is Debian with an Ubuntu kernel :)
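
The install itself is roughly the following from inside the unpacked MLNX_OFED bundle; flag names can differ between OFED releases, so check ./mlnxofedinstall --help first:
Code:
# build and install against the running (Ubuntu-based PVE) kernel
./mlnxofedinstall --skip-distro-check --add-kernel-support
update-initramfs -u
# reload the stack with the new modules (or simply reboot)
/etc/init.d/openibd restart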
 
For me, I just used the drivers built into the OS. However, I had to change the card to Eth mode instead, plus tweaked the MTU to 65520 for all the NICs. I got only half the speed of our 40Gb NICs thanks to the overhead of Ethernet, but it's better than 7Gb/s in IB.
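
In case it helps, the MTU tweak lives in /etc/network/interfaces on a Proxmox host; the interface/bridge names, address and the value below are only placeholders, since the usable maximum depends on the card mode and the switch:
Code:
auto enp5s0
iface enp5s0 inet manual
    mtu 9000

auto vmbr1
iface vmbr1 inet static
    address 192.168.1.84/24
    bridge-ports enp5s0
    bridge-stp off
    bridge-fd 0
    mtu 9000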
 