40 Gb/s Mellanox InfiniBand

Proper

Member
Jun 9, 2016
11
0
6
34
I did my best to find similar issues, but I did not see anything that helped me solve this one.

I am trying to set up 40 Gb/s Mellanox connections directly to a storage server. I purchased a few of these cards and give the model number below.
After doing a lot of research I got the device working on the FreeNAS side, and now I need to do the same for Proxmox.

Devices I am using
MCX354A-FCBT MELLANOX CONNECTX-3 INFINIBAND 40GBE/56GBE DUAL QSFP+FDR CX354A

Proxmox Version 6.0-4

Proxmox shows the network adapter in the list, but it remains inactive ("Active" shows "No").

I created a bridge in PX and have the device physically connected to the FreeNAS box, but it remains inactive.

I do not have a lot of experience with InfiniBand, so I think I am missing something. On FreeNAS these devices need to be initialized as Ethernet.

Another possibility is the cabling: I am using a QSFP+ to QSFP+ passive DAC, 1 meter, AWG30.

I will update this thread with the solution if I find it. Help is very much appreciated.
 

Proper

I have been digging further into this one and found that although the device is detected, some supporting services are not working properly.

Running the following command on the host shows the device:

lspci | grep Mellanox
31:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]

I then ran the command to start the services:

mst start
Starting MST (Mellanox Software Tools) driver set
Loading MST PCI modulemodprobe: FATAL: Module mst_pci not found in directory /lib/modules/5.0.15-1-pve
- Failure: 1
Loading MST PCI configuration modulemodprobe: FATAL: Module mst_pciconf not found in directory /lib/modules/5.0.15-1-pve
- Failure: 1
Create devices

mst_pci driver not found
Unloading MST PCI module (unused)modprobe: FATAL: Module mst_pci not found.
- Failure: 1
Unloading MST PCI configuration module (unused)modprobe: FATAL: Module mst_pciconf not found.
- Failure: 1



So it seems drivers need to be installed.
Here is a guide for installing the Mellanox drivers:
https://community.mellanox.com/s/article/how-to-install-mellanox-ofed-on-linux--rev-4-4-2-0-7-0-x

Drivers are available for Debian 9.6 on the website: https://www.mellanox.com/page/products_dyn?product_family=26&mtag=linux_sw_drivers&ssn=evjop174k1jlkgdbjg3midk001

The installation fails right at the start, saying the driver is incompatible with the OS version. In my case I was testing on PX 6.


So at this point I am unable to install the drivers, as I think PX 6 is based on Debian 10.

This means I cannot start the mst service, and without it the interface cannot be initialized.

If anyone can offer advice, it will be appreciated. I will keep updating this thread as I work on this further.
 

Proper

It seems there is no driver available for Debian 10.

I was able to install the Ubuntu driver in a VM and will try PCI passthrough to see if the cards work.

If anyone has ideas on how to make these work, please share.
 

Proper

I am a bit discouraged that there has been no response of any kind. I hope my post will help someone else get this up and running.

I was able to get things working and will share the steps below.

Although there is no driver for Debian 10, PX v6 seems to have the modules needed to run these Mellanox devices, so you do not need the official driver to make this work.

After you connect the device, it should be detected and show up in the list of network interfaces.
- If you do not see it in the list, try moving the device to another PCIe slot; in some cases slot configurations cause issues for these cards.

Although the device has been detected by the host OS, the controller itself runs in "InfiniBand mode" by default, not "Ethernet mode". We can change that using the following command:

connectx_port_config
It will produce the following output:

Code:
ConnectX PCI devices :
|----------------------------|
| 1             0000:XX:00.0 |
|----------------------------|

Before port change:
Ib

|----------------------------|
| Possible port modes:       |
| 1: Infiniband              |
| 2: Ethernet                |
| 3: AutoSense               |
|----------------------------|
Select mode for port 1 (1,2,3):


It shows that device 1 in this case is working in "Ib" mode, and we can switch it to "Ethernet" mode by selecting option 2.

A one-line version of the same change is below; make sure to replace the PCIe address with your device's address:

Code:
echo eth >/sys/bus/pci/devices/0000:XX:00.0/mlx4_port1
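
On a dual-port card each port has its own mlx4_portN file, so the one-liner has to be repeated per port. A minimal sketch of a loop that does this, assuming the in-kernel mlx4 driver exposes those files as in the command above. SYSFS_PCI_ROOT and set_mlx4_ports_eth are names made up for this example, added so the loop can be tried against a scratch directory first.

```shell
#!/bin/sh
# Sketch: switch every mlx4 port under one PCI device directory to Ethernet.
# SYSFS_PCI_ROOT is an assumption added for illustration; it defaults to the
# real sysfs location but can point at a scratch directory for a dry run.
SYSFS_PCI_ROOT="${SYSFS_PCI_ROOT:-/sys/bus/pci/devices}"

set_mlx4_ports_eth() {
    dev_dir="$SYSFS_PCI_ROOT/$1"
    found=1
    for port_file in "$dev_dir"/mlx4_port*; do
        # Skip the unexpanded glob when the device has no mlx4_port files.
        [ -e "$port_file" ] || continue
        echo eth > "$port_file" || return 1
        found=0
    done
    # Returns 0 if at least one port file was written, 1 otherwise.
    return "$found"
}

# Usage (replace with your own PCI address from lspci):
# set_mlx4_ports_eth 0000:XX:00.0
```

The function returns non-zero when no port files exist, which makes a wrong PCI address easy to spot.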

Although the interface is now properly configured, Proxmox will not show it as active in the list of network interfaces until you restart networking. To do so, run:

Code:
/etc/init.d/networking stop && /etc/init.d/networking start

You should now see the Mellanox interface as active.

The next issue is that this change will not persist through a reboot, and the configuration file changes I tried had no effect, probably because the driver service does not run, likely because the driver package is not installed.

So your network interface will default back to "Ib" on reboot, and even after you change it you still have to restart networking.


For now I created @reboot crontab entries that reconfigure the interface and then restart networking.

Code:
@reboot echo eth >/sys/bus/pci/devices/0000:XX:00.0/mlx4_port1
@reboot /etc/init.d/networking stop && /etc/init.d/networking start


If someone has a better way to do this, please post it. I would love for the controller config to be set before the network interfaces are started.

I think it is possible to change the firmware on the device to keep it in "eth" mode, and I may look into that.
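
One possible way to get the mode set before networking starts, not tested in this thread, is the port_type_array parameter of the mlx4_core kernel module (1 = InfiniBand, 2 = Ethernet, one entry per port). The sketch below only writes the option line; the file name mlx4-eth.conf and the helper name are assumptions made for this example, and whether the PVE kernel honours the option on a given card needs to be verified.

```shell
#!/bin/sh
# Sketch: ask mlx4_core to bring ports up as Ethernet at module load time.
# port_type_array values: 1 = InfiniBand, 2 = Ethernet (one entry per port).
# MODPROBE_CONF is a hypothetical file name chosen for this example.
MODPROBE_CONF="${MODPROBE_CONF:-/etc/modprobe.d/mlx4-eth.conf}"

write_mlx4_eth_option() {
    # $1 is the comma-separated list of port types, e.g. "2,2".
    printf 'options mlx4_core port_type_array=%s\n' "$1" > "$MODPROBE_CONF"
}

# Usage for a dual-port card (as root), then rebuild the initramfs and reboot:
# write_mlx4_eth_option 2,2
# update-initramfs -u
```

Because the option is applied when the module loads, no networking restart should be needed afterwards, assuming it takes effect.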
 

jw6677

Member
Oct 19, 2019
41
2
8
29
I just wanted to chime in and say thank you.

I see you still never had anyone reach out to help or confirm your thread, but this solution worked for me with a ConnectX-2 as well.

I spent a lot of time searching before I came across this thread, but it is exactly what I needed.


The only thing I got hung up on in this thread was that switching from IB to Ethernet is indeed what I needed to do, even though my instinct was "But I am trying to set up InfiniBand, that can't be right for me". But it was.

JW
 

iconvergence

Active Member
Aug 21, 2013
38
0
26
Not sure if you still need any help, but I have 3 Ceph clusters on IB with Connect-IB, ConnectX-2 or ConnectX-3 cards and many different switch fabrics (Mellanox, Voltaire, IBM blade) with CX4 and QSFP interfaces.

It seems you need to dig more into OFED and look at compiling it yourself to fix your issue on buster, as there is no ready-made version. If needed, I can post help here on how to do it.
Once you have done that, it will fix most or all of your issues.

Regarding your boot issue, I don't have it, but you could add the command to your interface configuration, as a post-up line for example.

Also, you do not seem to have an opensm running on your network.
 

SourCheeks

New Member
Sep 25, 2019
13
1
3
31
I run these same network cards in my PVE cluster. I actually have 4 of them (MCX354A-FCBT). You should download the Mellanox Firmware Tools (MFT) instead of the drivers from their website; then you can permanently configure the card as Ethernet only.

Code:
#https://www.mellanox.com/page/management_tools
#http://www.mellanox.com/downloads/MFT/mft-4.13.0-104-x86_64-deb.tgz

#starts mst service
mst start

#gets device ID to use in next step
mst status

#change both ports to Ethernet only, use device ID from above
mlxconfig -d /dev/mst/mt4099_pci_cr0 set LINK_TYPE_P1=2 LINK_TYPE_P2=2

I got this information from another forum, but I'm not sure whether it's against the rules to post links to other forums, so PM me if you want the link.
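
The numeric values given to LINK_TYPE_P1 and LINK_TYPE_P2 above select the port protocol: on ConnectX-3, mlxconfig uses 1 for IB, 2 for ETH and 3 for VPI (auto-sense). A small sketch that builds the command from a readable name and only prints it; the helper names are made up for this example, and the device path must come from mst status on your own system.

```shell
#!/bin/sh
# Sketch: build the mlxconfig arguments for a given link type name.
# Mapping used by mlxconfig on ConnectX-3: 1 = IB, 2 = ETH, 3 = VPI.
link_type_value() {
    case "$1" in
        ib)  echo 1 ;;
        eth) echo 2 ;;
        vpi) echo 3 ;;
        *)   echo "unknown link type: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the command for a dual-port card. The device path
# comes from `mst status` and will differ between systems.
print_link_type_cmd() {
    value=$(link_type_value "$2") || return 1
    echo "mlxconfig -d $1 set LINK_TYPE_P1=$value LINK_TYPE_P2=$value"
}

# Usage:
# print_link_type_cmd /dev/mst/mt4099_pci_cr0 eth
```

Printing the command first gives a chance to check the device path before anything is written to the card.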
 

jw6677

Wow, this actually went sooo smoothly. I am not sure why I was struggling soo badly.

So, I actually have no Ethernet ports, only InfiniBand, on my cards. As such, I assume switching to Ethernet mode is not required between two of these cards (on separate physical machines), right?

In fact, ethernet mode is probably problematic?


EDIT:

Scratch that, back down the rabbit hole. I think my device is soo old it's unsupported?

root@server:/etc# mlxconfig -d /dev/mst/mt26428_pci_cr1 query
-E- Unsupported device

:'(
 

SourCheeks

If your card is infiniband only then this won't work, but you may be able to flash different firmware on your device. I'm not an expert on that though. The command to change ports to ethernet mode works if you have a MCX354A-FCBT.
 

jw6677

Thanks SourCheeks, Yeah, Infiniband only.

iconvergence said:
"It seems you need to dig more into OFED and look at compiling it yourself to fix your issue on buster, as there is no ready-made version. If needed, I can post help here on how to do it.
Once you have done that, it will fix most or all of your issues.

Also, you do not seem to have an opensm running on your network."

I'll work on this and see if I can figure it out. I could use some more specific guidance though, for sure.

OpenSM I thought was built into the cards, that's my mistake (still learning). I had been trying to connect two machines directly to each other, but get the impression that maybe I need to fork out for an infiniband switch?

I have three machines, two (old-ish) servers and a desktop machine I was hoping to include. (I.e. Desktop machine that is old, but still very useful, lots of old crappy ones that I am just letting collect dust for now.)
The best server has two of the connectx-2 cards in it, each with a single infiniband port, the other two machines have a single card each of the same type.

Admittedly, proxmox, and networking, are two areas that I hope to learn better through this process, so I am fairly novice when it comes to actual data center type solutions like infiniband.


I will dig into OFED, and any help with my approach on this would be much appreciated.
 

jw6677

Ok, So, I managed to get ibstatus to report the link as being up between two devices!



Infiniband device 'mlx4_0' port 1 status:
default gid: fe80:0000:0000:0000:0002:c903:000c:4991
base lid: 0x3
sm lid: 0x3
state: 2: INIT
phys state: 5: LinkUp
rate: 40 Gb/sec (4X QDR)
link_layer: InfiniBand

Infiniband device 'mlx4_1' port 1 status:
default gid: fe80:0000:0000:0000:0002:c903:000c:4585
base lid: 0x0
sm lid: 0x0
state: 2: INIT
phys state: 5: LinkUp
rate: 40 Gb/sec (4X QDR)
link_layer: InfiniBand



Which is probably the most exciting thing to happen all week.

However, like the "rogue man on a mission" that I am, I have a goal, but no map.
I don't actually know what I am meant to do from here to ensure these proxmox devices communicate with each other over these connections.

I see the devices in proxmox's networks section, but I am unclear as to the setup.

Here's my network map, ignoring all the Ethernet stuff (wow, that looks funny in a thumbnail):

[Attached image: 1573166824397.png]
 

jw6677

Well, since I've accidentally turned this into a work log, I figure I'll keep going.

opensm --create-config /etc/opensm/opensm.conf
created me a config file to work against.

Inside, I set the GUID to match the port GUID I got from ibstatus

started `opensm`
and boom!

Infiniband device 'mlx4_0' port 1 status:
default gid: fe80:0000:0000:0000:0002:c903:000c:4991
base lid: 0x3
sm lid: 0x3
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 40 Gb/sec (4X QDR)
link_layer: InfiniBand

So, making progress, though I clearly do not know what I am doing, so that's cool.

I got both devices working by making an extra config file, following this: https://lists.openfabrics.org/pipermail/ewg/2015-February/018341.html (if you're reading this in the future and it's gone, check the Wayback Machine).

Now I need to figure out how to get proxmox to recognize the connections...
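
In the ibstatus output above, "state: 2: INIT" means the physical link is up but no subnet manager has configured the port yet; once opensm is running, the port moves to "4: ACTIVE". A rough sketch of a filter that reduces ibstatus output to one "device port state" line per port; ib_port_states is a name made up for this example, and it reads from stdin so it can be tried on saved output.

```shell
#!/bin/sh
# Sketch: summarise `ibstatus` output as "device port state" lines.
# Reads ibstatus text on stdin so it can be tried on saved output.
ib_port_states() {
    awk '
        /^Infiniband device/ {
            gsub("\047", "", $3)   # strip the quotes around the device name
            dev = $3
            port = $5
        }
        $1 == "state:" { print dev, port, $3 }
    '
}

# Usage on a live system:
# ibstatus | ib_port_states
```

Any port still reported as INIT is a hint that no subnet manager is reaching that port.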
 

Proper

I am surprised to see a lot more discussion here.

The approach I posted back in Aug worked fine for me: a simple crontab entry that runs on boot, on any machine that has the device, gets it to present like any other network adapter.

I used this for a direct connection to a storage server for a few months, and it worked without any issues once setup was complete.

Note of caution: if you want a switch, make sure you buy a managed one, as some of them only work in InfiniBand mode and do not work in Ethernet mode.
 

hsv

New Member
Dec 26, 2019
3
0
1
52
Hi,
I have just downloaded and installed 6.1.1 and would like to use a Mellanox ConnectX-4 dual-port card (MCX454A-FCAT).
But I cannot find MST on the system, so when I run mst start I get the reply "mst: command not found".

Then I downloaded the MFT package (which provides mst) from Mellanox and tried to install it, but with no luck. I get this response:

root@pve:~/mft-4.13.3-6-x86_64-deb# ./install.sh
-E- There are missing packages that are required for installation of MFT.
-I- You can install missing packages using: apt-get install gcc dkms linux-headers linux-headers-generic

When I try to install them as suggested, I cannot get them to download and install. I got gcc installed, but not the rest.

Has Proxmox removed support for Mellanox in 6.1.1, or is it because I do not have a subscription?
 

jw6677

Oh, I ran into this issue as well, try this:

Code:
# NOT recommended for production use, might be redundant, but I have access to dkms, so assume it must be here.
echo "deb http://download.proxmox.com/debian/pve buster pve-no-subscription" >> /etc/apt/sources.list

# update package list
apt update

# install the missing packages - Notice the last package is different for proxmox compared to what MFT is calling for.
apt install gcc dkms pve-headers
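
Note that the echo ... >> line above appends a duplicate entry every time it is re-run. A minimal sketch of an append that checks for the line first; add_apt_source is a hypothetical helper name used only for this example.

```shell
#!/bin/sh
# Sketch: append an APT source line only if it is not already present.
# add_apt_source is a hypothetical helper name used for this example.
add_apt_source() {
    line=$1
    file=$2
    # -x matches whole lines, -F treats the line as a fixed string.
    grep -qxF "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# Usage (as root):
# add_apt_source "deb http://download.proxmox.com/debian/pve buster pve-no-subscription" /etc/apt/sources.list
```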
 

hsv

Hi
Thanks for your very quick reply.
It removed some of the errors, so now I am down to
apt-get install linux-headers linux-headers-generic

I then found out that, to install these header files, I needed to run this command:
apt-get install pve-headers-5.3.10-1-pve
You need to run uname -r to find the right version.
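
The version lookup can be folded into the install command. A small sketch, assuming the pve-headers-<kernel release> naming shown above; pve_headers_pkg is a helper name invented for this example.

```shell
#!/bin/sh
# Sketch: derive the Proxmox headers package name for a kernel release.
# With no argument it uses the running kernel as reported by `uname -r`.
pve_headers_pkg() {
    printf 'pve-headers-%s\n' "${1:-$(uname -r)}"
}

# Usage (as root, on the Proxmox host):
# apt-get install "$(pve_headers_pkg)"
```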

And mst now runs successfully:

root@pve:~/mft-4.13.3-6-x86_64-deb# mst start
Starting MST (Mellanox Software Tools) driver set
Loading MST PCI module - Success
Loading MST PCI configuration module - Success
Create devices
Unloading MST PCI module (unused) - Success

Now I will try to install the driver. But that is for tomorrow.

I find it strange that they have removed support for Mellanox; I thought they wrote that it was for enterprise systems?
 

jw6677

I think that InfiniBand still has drivers, but as a hobbyist I (also) had trouble getting anything working without the various Mellanox-specific software and firmware.

You will have noticed that just about anything you installed from Mellanox first wanted to uninstall existing packages; those were likely the native Proxmox solutions. But yeah, without the actual software (mst start, etc.), and with the only InfiniBand resource I found on the Proxmox site (https://pve.proxmox.com/wiki/Infiniband) not addressing these issues, it is tricky.


With regard to drivers specifically, I never got OFED to install completely, but I do believe Proxmox already had much of what was required. There are a bunch of modules I needed to activate, and I never needed OFED? I think this is the Proxmox support that is referred to. (Here's the list of modules I currently use, at the bottom. I'm not sure whether any are junk, but it's what I ended up with:)

Code:
#modules I am using, no guarantee that these are right, or what you need, or even work, just where I ended up.
mst_pciconf
mlx4_core
mlx4_ib
mlx4_en
ib_ipoib
ipoib_helper
ib_addr
ib_core
ib_sa
ib_mad
ib_umad
ib_uverbs
ib_cm
ib_ipoib
ib_iser
ib_isert
ib_ucm
ib_sdp
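
Since it is unclear which names in the list above actually exist on a given kernel, a quick check with modinfo can separate the modules the running kernel provides from the ones it does not. A minimal sketch; the MODINFO variable and the check_modules name are assumptions added so the loop can be stubbed and tried without the hardware.

```shell
#!/bin/sh
# Sketch: report which module names from a list the running kernel provides.
# MODINFO is an assumption added so the check can be stubbed when testing.
MODINFO="${MODINFO:-modinfo}"

check_modules() {
    for mod in "$@"; do
        if "$MODINFO" "$mod" >/dev/null 2>&1; then
            echo "$mod available"
        else
            echo "$mod missing"
        fi
    done
}

# Usage with a few names from the list above:
# check_modules mlx4_core mlx4_ib mlx4_en ib_ipoib ib_umad ib_sdp
```

Names reported as missing are either not built for that kernel or not separate modules there, so they can likely be dropped from the list.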

Oh yeah, and you'll need:

apt install opensm

Which you need for each Infiniband subnet you want to run (I think... Seriously take anything coming from me with a grain of salt and do some research, I am sharing to help, but really really do not know this stuff well, verify everything.)
 

iconvergence

I have several InfiniBand clusters running with OFED, and they work well.
If your switch doesn't have an SM, install opensmd on at least 2 nodes (one master, one slave).
I prefer to use the Mellanox opensm on the switch, but I also use it on Proxmox nodes for some clusters.

Mellanox released the latest driver recently, so you should be fine with Proxmox 6.1.
 

ndroftheline

Member
Jun 17, 2017
33
10
13
33
Hi all, I'm struggling a little with this as well. I'm glad a couple of people have got this to work with the instructions provided up there, but I haven't. I'm leery of installing the OFED drivers because of everything they do to the system (auto-uninstalling a bunch of seemingly critical packages like pve-manager?), so I am quite curious about this:

iconvergence said:
"dig more into OFED and look at compiling it yourself to fix your issue on buster, as there is no ready-made version. If needed, I can post help here on how to do it.
Once you have done that, it will fix most or all of your issues."

I am keen to know more about this process, as I am using ConnectX-2 VPI cards (dual QSFP, 40 Gb IB and 10 Gb Ethernet) that I bought very cheaply on eBay, and I'd like to keep them going for a while.

jw6677 said:
"I see you still never had anyone reach out to help or confirm your thread, but this solution worked for me with a ConnectX-2 as well."

I am specifically interested in knowing how this worked, because for me: a. the drivers wouldn't install, and it appeared that if they did install they would hose a bunch of critical packages anyway; b. connectx_port_config doesn't exist without the drivers installed?; and c. mlxconfig, which comes with MFT (which I successfully installed) and is an alternative way of setting the mode, refused with the error "Unsupported device", just as jw6677 experienced.
 
