Mellanox MCX653106A-ECAT Support

dan.ger

Hello,

we have 3 nodes that use 24 NVMe drives (8 drives per node) with Ceph and bonded 2x Intel 10 GbE adapters, and we plan to buy the Mellanox MCX653106A-ECAT-SP (connected as a full mesh with DAC cables for 200 GbE).

- Are these cards supported by Proxmox with the Debian MLNX_OFED driver?
- Are there any known problems with Mellanox cards, especially when upgrading Proxmox to a new version?

Sorry for the stupid questions, but I do not want to waste the money.

Kind regards,
Daniel
 
I cannot answer the first question from experience.

The second one though I can answer to some degree: sometimes there can be issues with newer kernels. The normal procedure in that situation would be to check for firmware updates. Updating the firmware on Mellanox cards is one of the easiest I have experienced.
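For example, a rough sketch of a firmware check/update with mstflint (not an official procedure; the PCI address and image file name are placeholders you would take from lspci and the NVIDIA/Mellanox download page):

Code:
# <pci-address> is what `lspci | grep Mellanox` reports, e.g. 3b:00.0
mstflint -d <pci-address> query                          # show the currently flashed firmware version
mstflint -d <pci-address> -i <firmware-image.bin> burn   # burn a downloaded firmware image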
 
Thanks a lot for answering the questions. We bought 3 cards, one for each server, and will give the Mellanox cards a try :)
 
Hello,

I am trying to bring up the interfaces, but when I follow the guidelines for a routed full mesh in InfiniBand mode, I cannot set the commented-out lines:

Node1:
Code:
auto ib0
iface ib0 inet static
    address 10.10.20.1/24
    pre-up modprobe ib_ipoib
#    pre-up echo connected > /sys/class/net/ib0/mode
#    mtu 65520
    up ip route add 10.10.20.2/32 dev ib0
    down ip route del 10.10.20.2/32

auto ib1
iface ib1 inet static
    address 10.10.20.1/24
    pre-up modprobe ib_ipoib
#    pre-up echo connected > /sys/class/net/ib1/mode
#    mtu 65520
    up ip route add 10.10.20.3/32 dev ib1
    down ip route del 10.10.20.3/32

Node2:
Code:
auto ib0
iface ib0 inet static
    address 10.10.20.2/24
    pre-up modprobe ib_ipoib
#    pre-up echo connected > /sys/class/net/ib0/mode
#    mtu 65520
    up ip route add 10.10.20.1/32 dev ib0
    down ip route del 10.10.20.1/32

auto ib1
iface ib1 inet static
    address 10.10.20.2/24
    pre-up modprobe ib_ipoib
#    pre-up echo connected > /sys/class/net/ib1/mode
#    mtu 65520
    up ip route add 10.10.20.3/32 dev ib1
    down ip route del 10.10.20.3/32

Node 3:
Code:
auto ib0
iface ib0 inet static
    address 10.10.20.3/24
    pre-up modprobe ib_ipoib
#    pre-up echo connected > /sys/class/net/ib0/mode
#    mtu 65520
    up ip route add 10.10.20.1/32 dev ib0
    down ip route del 10.10.20.1/32

auto ib1
iface ib1 inet static
    address 10.10.20.3/24
    pre-up modprobe ib_ipoib
#    pre-up echo connected > /sys/class/net/ib1/mode
#    mtu 65520
    up ip route add 10.10.20.2/32 dev ib1
    down ip route del 10.10.20.2/32

ifconfig ib0:
Code:
ib0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 4092
        inet 10.10.20.3  netmask 255.255.255.0  broadcast 10.10.20.255
        unspec 00-00-01-82-FE-80-00-00-00-00-00-00-00-00-00-00  txqueuelen 256  (UNSPEC)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
ifconfig ib1:
Code:
ib1: flags=4099<UP,BROADCAST,MULTICAST>  mtu 4092
        inet 10.10.20.3  netmask 255.255.255.0  broadcast 10.10.20.255
        unspec 00-00-02-95-FE-80-00-00-00-00-00-00-00-00-00-00  txqueuelen 256  (UNSPEC)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

mst status:
Code:
MST modules:
------------
    MST PCI module is not loaded
    MST PCI configuration module loaded

MST devices:
------------
/dev/mst/mt4123_pciconf0         - PCI configuration cycles access.
                                   domain:bus:dev.fn=0000:3b:00.0 addr.reg=88 data.reg=92 cr_bar.gw_offset=-1
                                   Chip revision is: 00

ip route:
Code:
10.10.20.0/24 dev ib0 proto kernel scope link src 10.10.20.3 linkdown
10.10.20.0/24 dev ib1 proto kernel scope link src 10.10.20.3 linkdown
10.10.20.1 dev ib0 scope link linkdown
10.10.20.2 dev ib1 scope link linkdown

Any idea what I have misconfigured? Also, I cannot ping the other hosts (10.10.20.1-3), only the host itself.
 
Very stupid question, but did you plug in the cables to the right NIC? What happens if you switch the routes to the other interface or switch the cables?
 
Hello,
this is not a stupid question. I check with ibstatus whether the links are up when I reboot a server. While a server is rebooting, the link is shown as unplugged/down, so I think the connection works and the cables are plugged in.

ibstatus output:
Code:
Infiniband device 'mlx5_0' port 1 status:
        default gid:     fe80:0000:0000:0000:b8ce:f603:005d:42ae
        base lid:        0xffff
        sm lid:          0x0
        state:           2: INIT
        phys state:      5: LinkUp
        rate:            56 Gb/sec (4X FDR)
        link_layer:      InfiniBand

Infiniband device 'mlx5_1' port 1 status:
        default gid:     fe80:0000:0000:0000:b8ce:f603:005d:42af
        base lid:        0xffff
        sm lid:          0x0
        state:           2: INIT
        phys state:      5: LinkUp
        rate:            56 Gb/sec (4X FDR)
        link_layer:      InfiniBand

The cards are in InfiniBand mode (not Ethernet!).

loaded modules:
Code:
mlx5_ib               376832  0
ib_uverbs             135168  10 rdma_ucm,mlx5_ib
ib_core               315392  9 rdma_cm,ib_ipoib,iw_cm,ib_iser,ib_umad,rdma_ucm,ib_uverbs,mlx5_ib,ib_cm
mlx5_core            1368064  1 mlx5_ib
pci_hyperv_intf        16384  1 mlx5_core
mdev                   24576  2 vfio_mdev,mlx5_core
tls                    73728  1 mlx5_core
mlxfw                  28672  1 mlx5_core
psample                20480  1 mlx5_core
mlx_compat             65536  11 rdma_cm,ib_ipoib,iw_cm,ib_iser,ib_umad,ib_core,rdma_ucm,ib_uverbs,mlx5_ib,ib_cm,mlx5_core
 
If I call ibhosts with port 0 and port 1, what should be expected? I get:

Host 1 ibhosts -P 0
Code:
Ca      : 0xb8cef603005d458f ports 1 "pve-03 HCA-2"
Ca      : 0xb8cef603005d403e ports 1 "pve-01 HCA-1"

Host 1 ibhosts -P 1
Code:
Ca      : 0xb8cef603005d458f ports 1 "pve-03 HCA-2" 
Ca      : 0xb8cef603005d403e ports 1 "pve-01 HCA-1"

Should the output not show the topology, for example:
Host1 => Host2
Host1 => Host3
 
I installed the cards on each host with these steps:

Code:
1. check if mellanox is present:
lspci | grep Mellanox

2. install pve-headers:
aptitude install pve-headers

3. reboot system
reboot

4. create mellanox repo:
cd /etc/apt/sources.list.d/
wget https://linux.mellanox.com/public/repo/mlnx_ofed/latest/debian10.5/mellanox_mlnx_ofed.list
wget -qO - https://www.mellanox.com/downloads/ofed/RPM-GPG-KEY-Mellanox | sudo apt-key add -

5. install driver:
aptitude install mlnx-ofed-basic

6. install firmwareupdater
aptitude install mlnx-fw-updater
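
After the installation, a quick sanity check could look like this (just a sketch using standard MLNX_OFED / infiniband-diags tools):

Code:
ofed_info -s          # print the installed MLNX_OFED version
ibstat                # list HCAs, firmware and port states
lsmod | grep mlx5     # confirm mlx5_core / mlx5_ib are loaded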
 
I just got mode and MTU working without any issues, but ip route still shows the links as down. To use connected mode and the larger MTU, do the following:

1. Disable ipoib_enhanced in /etc/modprobe.d/ib_ipoib.conf:
Code:
options ib_ipoib .... ipoib_enhanced=0 ....

2. Restart the openibd service:
Code:
service openibd restart
/etc/init.d/openibd restart

3. Check that ipoib_enhanced is disabled:
Code:
cat /sys/module/ib_ipoib/parameters/ipoib_enhanced

4. Check that the mode is set:
Code:
cat /sys/class/net/ib*/mode
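
Once this works, the mode/MTU lines that were commented out in the interfaces config earlier in the thread should work again, e.g. (excerpt):

Code:
    pre-up echo connected > /sys/class/net/ib0/mode
    mtu 65520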
 
So I just noticed that the RUNNING flag is missing on ib0/ib1:
Code:
ib0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 65520
        inet 10.10.20.3  netmask 255.255.255.0  broadcast 10.10.20.255
        unspec 80-00-02-46-FE-80-00-00-00-00-00-00-00-00-00-00  txqueuelen 256  (UNSPEC)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ib1: flags=4099<UP,BROADCAST,MULTICAST>  mtu 65520
        inet 10.10.20.3  netmask 255.255.255.0  broadcast 10.10.20.255
        unspec 80-00-02-44-FE-80-00-00-00-00-00-00-00-00-00-00  txqueuelen 256  (UNSPEC)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

I bought the cards from Dell with their branded Dell QSFP28 cables (which should be the original Mellanox cables). Were the wrong cables delivered?
 
Have you tried to reduce the complexity to see if the network connection itself will work?

E.g. configure and plug in only one port without any special routes or anything. Just a simple interface with one IP address. Same on the other server that you connect to.

This should help to avoid any problems that might be there because of the meshed setup at first. Once that works, you can take the next step and try to get it working in a meshed setup.
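
For example, something minimal like this on the first node, with a matching .2 address on the second node (addresses are only illustrative):

Code:
auto ib0
iface ib0 inet static
    address 10.10.99.1/24
    pre-up modprobe ib_ipoib

Then check whether a plain ping 10.10.99.2 between the two hosts works before adding the mesh routes back.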

I unfortunately don't have much experience with IP over InfiniBand :-/
 
I tried that, but same result... However, if I start opensm -g {Port-GUID} --daemon, the interfaces show the RUNNING flag and ip route shows that the links are up, but I still cannot ping the hosts...
 
I created an opensm.conf file for each node and port:

pve-01 /etc/opensm/opensm.ib0.conf
Code:
guid {{PortGuid0}}
daemon TRUE
log_file /var/log/opensm.ib0.log
dump_files_dir /var/log/opensm/ib0

pve-01 /etc/opensm/opensm.ib1.conf
Code:
guid {{PortGuid1}}
daemon TRUE
log_file /var/log/opensm.ib1.log
dump_files_dir /var/log/opensm/ib1

For pve-02 and 03, I also created the same configs for ib0 and ib1 (replace the {{PortGuid}} placeholders with your GUIDs).
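
The per-port GUIDs can be read with ibstat, for example:

Code:
ibstat mlx5_0 1 | grep "Port GUID"
ibstat mlx5_1 1 | grep "Port GUID"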

Then I start opensm for each port on each node:
Code:
opensm --config /etc/opensm/opensm.ib0.conf
opensm --config /etc/opensm/opensm.ib1.conf

Then I have 3 subnets, one for each direct link. After that I was able to ping the directly connected node on each subnet via ibping.

I assigned all ports IPv4 addresses in the new per-link subnets, like 10.10.1.1/24 => 10.10.1.2, 10.10.2.1 10.10.2.3, and so on, but I was not able to ping over a direct connection with regular ping. ib_ipoib is loaded as a module, so I think something is missing.
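
As an illustration of that scheme (addresses purely as an example, one /24 per direct link), pve-01 would look something like:

Code:
# pve-01 - one subnet per direct link
auto ib0
iface ib0 inet static
    address 10.10.1.1/24    # direct link to pve-02
    pre-up modprobe ib_ipoib

auto ib1
iface ib1 inet static
    address 10.10.2.1/24    # direct link to pve-03
    pre-up modprobe ib_ipoib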
 
Fixed :)
The firewall needed a rule to permit the local LAN. After configuring the datacenter firewall rule, everything works fine. I also switched to Ethernet mode with RoCE, and it runs very smoothly.
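
For reference, switching the ConnectX-6 ports from InfiniBand to Ethernet mode is typically done with mlxconfig (a sketch using the mst device shown earlier; the new link type only takes effect after a reboot/power cycle):

Code:
mst start
mlxconfig -d /dev/mst/mt4123_pciconf0 query | grep LINK_TYPE        # 1 = IB, 2 = ETH
mlxconfig -d /dev/mst/mt4123_pciconf0 set LINK_TYPE_P1=2 LINK_TYPE_P2=2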
 
