[SOLVED] Slow 40GBit Infiniband on Proxmox 8.1.4

Hello folks,

I have a 4-node cluster built on Dell R730 and R740 servers.
I wanted to improve Ceph performance by installing faster network cards, so I bought Mellanox CX314A cards and a Mellanox SX6036 managed switch.
Everything is connected and running, but it is terribly slow: slower than the 10Gbit Ethernet I had on my Intel cards.
All I can get from iperf3 is 7.6Gbit.
I've read a lot of posts but cannot find anything useful. Some say I should switch to Ethernet, but at the moment I don't have a VPI license on my switch, so only InfiniBand works, not Ethernet.
Does anyone have a properly working InfiniBand configuration and could share some experience, please?

EDIT: I've changed the MTU to the maximum of 65520; after that I get transfers around 9.6Gbit, still far away from 40.
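For context, this is roughly how the IPoIB interface is set up in /etc/network/interfaces (the interface name and address are just placeholders for my setup; IPoIB connected mode is what allows an MTU above the ~4k datagram limit):

Code:
auto ibp1s0
iface ibp1s0 inet static
        address 10.10.10.1/24
        # connected mode is required for an IPoIB MTU above ~4k
        pre-up echo connected > /sys/class/net/ibp1s0/mode
        mtu 65520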
 

Make sure iperf3 is the same version on both sides, and try -R for the reverse direction and '-P #' for # parallel streams.
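For example (the IP is just a placeholder; run the server on one node and the client on another):

Code:
# node A: start the server
iperf3 -s
# node B: 4 parallel streams, then repeat with -R for the reverse direction
iperf3 -c 10.10.10.1 -P 4
iperf3 -c 10.10.10.1 -P 4 -R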
 
I replicated this setup, and on the first go-around I ran into the same numbers no matter how many parallel streams I threw at it. That indicated to me that I had made a mistake in the config.

Code:
[SUM]   0.00-10.00  sec  9.42 GBytes  8.09 Gbits/sec  57324             sender

ibstatus showed no issues at the phy level:
Code:
Infiniband device 'ibp1s0' port 1 status:
        default gid:     fe80:0000:0000:0000:f452:1403:0091:0061
        base lid:        0x5
        sm lid:          0x3
        state:           4: ACTIVE
        phys state:      5: LinkUp
        rate:            40 Gb/sec (4X QDR)
        link_layer:      InfiniBand

Finally I ran ibdiagnet and spotted my issue:

Code:
-I---------------------------------------------------
-I- IPoIB Subnets Check
-I---------------------------------------------------
-I- Subnet: IPv4 PKey:0x7fff QKey:0x00000b1b MTU:2048Byte rate:10Gbps SL:0x00
-W- Suboptimal rate for group. Lowest member rate:40Gbps > group-rate:10Gbps

Oops, a minor mistake; I corrected it and checked again. SAME issue. Next I checked ibportstate:

Code:
ibportstate 5 1
CA/RT PortInfo:
# Port info: Lid 5 port 1
LinkState:.......................Active
PhysLinkState:...................LinkUp
Lid:.............................5
SMLid:...........................4
LMC:.............................0
LinkWidthSupported:..............1X or 4X
LinkWidthEnabled:................1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedActive:.................10.0 Gbps
LinkSpeedExtSupported:...........0
LinkSpeedExtEnabled:.............0
LinkSpeedExtActive:..............No Extended Speed
Mkey:............................<not displayed>
MkeyLeasePeriod:.................0
ProtectBits:.....................0
# MLNX ext Port info: Lid 5 port 1
StateChangeEnable:...............0x00
LinkSpeedSupported:..............0x00
LinkSpeedEnabled:................0x00
LinkSpeedActive:.................0x00

From here, however, I'm not sure what to do to fix this, since it shows as 40Gbps on the switch and in the OS (via ethtool).
 
I ended up buying the ETH license for the switch and changing the configuration to Ethernet. Now it's faster and iperf3 shows an average of 33Gbps, which I think is pretty much all I can get from ConnectX-3 cards.
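For anyone wanting to do the same on the NIC side, the ConnectX-3 ports also have to be switched from IB to ETH mode. A rough sketch with the Mellanox firmware tools (the mst device path below is just an example, check mst status for yours):

Code:
mst start
mst status
# 2 = ETH, 1 = IB; set both ports of the card
mlxconfig -d /dev/mst/mt4099_pci_cr0 set LINK_TYPE_P1=2 LINK_TYPE_P2=2
# reboot (or reload the mlx4 driver) for the new link type to take effect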
That works. One thing I forgot to mention is that you set the MTU on the cards to 65520, but that won't work with your switch: the SX6036 has a maximum MTU size of 4096. If you correct that, you might be able to get a little more performance out of it once it's no longer fragmenting.
 
I believe that value applies to InfiniBand only. I've lowered the MTU on both the switch and the NICs to 9000. Also, my ConnectX-3 CX314A cards are dual-port, so I'm thinking about an LACP LAG, roughly like the config below.
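A minimal sketch of what I mean in /etc/network/interfaces (the port names and address are placeholders, and the switch would need a matching LACP port-channel):

Code:
auto bond0
iface bond0 inet static
        address 10.10.10.1/24
        bond-slaves enp1s0 enp1s0d1
        bond-mode 802.3ad
        bond-xmit-hash-policy layer3+4
        bond-miimon 100
        mtu 9000

Keep in mind that LACP balances per connection, so a single iperf3 stream would still top out at one port's speed.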

Please correct me if I'm wrong
 
Yeah, the CX-3 with IPoIB doesn't seem to be working right, or there's some tuning I'm not doing correctly. I plugged in CX-4 EDR cards and they are doing great at 40Gbps with IPoIB out of the box.

Probably best to leave it on Ethernet since that's working, and mark it solved.
 
