I have two potential explanations for the low throughput of my 40/56 Gbps IPoIB setup:
- There might be a bottleneck in the PCIe slots that my IB cards sit in (I have relatively small servers with only a few PCIe lanes, and the available slots may have too few lanes to saturate the connections)
- It might be an (artificial) limit imposed by the subnet manager. Among other things, ibdiagnet gives this output:
Code:
-I---------------------------------------------------
-I- IPoIB Subnets Check
-I---------------------------------------------------
-I- Subnet: IPv4 PKey:0x7fff QKey:0x00000b1b MTU:2048Byte rate:10Gbps SL:0x00
-W- Suboptimal rate for group. Lowest member rate:40Gbps > group-rate:10Gbps
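To check for this warning across several ibdiagnet runs, the two rates can be pulled out of the log with a short script. This is only a sketch: the regular expression is based solely on the two lines shown above, and the names `WARNING_RE` and `find_rate_mismatches` are made up for illustration. Other ibdiagnet versions may word the warning differently.

```python
import re

# Pattern derived from the single warning line quoted above (an assumption,
# not a documented ibdiagnet format):
# "-W- Suboptimal rate for group. Lowest member rate:40Gbps > group-rate:10Gbps"
WARNING_RE = re.compile(
    r"Lowest member rate:(?P<member>\d+(?:\.\d+)?)Gbps\s*>\s*"
    r"group-rate:(?P<group>\d+(?:\.\d+)?)Gbps"
)


def find_rate_mismatches(report: str):
    """Return a (member_gbps, group_gbps) tuple for every suboptimal-rate warning."""
    return [
        (float(m["member"]), float(m["group"]))
        for m in WARNING_RE.finditer(report)
    ]


SAMPLE = """\
-I- Subnet: IPv4 PKey:0x7fff QKey:0x00000b1b MTU:2048Byte rate:10Gbps SL:0x00
-W- Suboptimal rate for group. Lowest member rate:40Gbps > group-rate:10Gbps
"""

if __name__ == "__main__":
    for member, group in find_rate_mismatches(SAMPLE):
        print(f"group rate {group:g} Gbps is below slowest member rate {member:g} Gbps")
```

An empty result means no group is rate-limited below its slowest member, at least as far as this pattern can tell.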
So the individual nodes are connected at 40 Gbps but, as a group, are still limited to 10 Gbps. I found this thread on the internet:
https://forums.developer.nvidia.com/t/how-do-i-set-the-subnet-rate/206724/2
It details how to set the subnet rate, but only for opensm. If I understand correctly, opensm is a subnet manager one can run on a node when the switch doesn't have its own. I chose my SX6036 specifically because it does have its own SM, so I would prefer to use that one. Now I need to figure out whether my SX6036's SM defaults to 10 Gbps and, if it does, how I can change that.
It seems that I cannot change the rate of the default subnet, but I can create a new subnet and specify a rate of, say, 56 Gbps. The question then is how to get the nodes/cards to use that new subnet instead of the default one. Probably via the PKey parameter. Let's see...
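As for the first explanation (the PCIe slots), a rough back-of-the-envelope check is possible from the slot's PCIe generation and lane count alone. The sketch below uses the standard per-lane transfer rates and line encodings for PCIe Gen1 to Gen3. The function names are made up for illustration, and the comparison is against the nominal 40/56 Gbps link rates. Real throughput is lower still because of PCIe packet overhead, so treat the numbers as upper bounds.

```python
# Upper bound on PCIe bandwidth per direction, after line encoding but
# before packet/protocol overhead (which reduces it further).

# generation -> (transfer rate per lane in GT/s, encoding efficiency)
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),     # 8b/10b encoding
    2: (5.0, 8 / 10),     # 8b/10b encoding
    3: (8.0, 128 / 130),  # 128b/130b encoding
}


def pcie_gbps(generation: int, lanes: int) -> float:
    """Encoded-payload bandwidth in Gbps for a slot of the given generation and width."""
    rate, efficiency = PCIE_GENERATIONS[generation]
    return rate * efficiency * lanes


def can_carry(generation: int, lanes: int, link_gbps: float) -> bool:
    """True if the slot's upper bound is at least the nominal link rate."""
    return pcie_gbps(generation, lanes) >= link_gbps


if __name__ == "__main__":
    for gen, lanes in [(2, 4), (2, 8), (3, 4), (3, 8)]:
        bw = pcie_gbps(gen, lanes)
        print(
            f"PCIe Gen{gen} x{lanes}: {bw:5.1f} Gbps  "
            f"40G ok: {can_carry(gen, lanes, 40)}  "
            f"56G ok: {can_carry(gen, lanes, 56)}"
        )
```

By this estimate, a Gen2 x8 slot tops out at 32 Gbps and a Gen3 x4 slot at about 31.5 Gbps, so only a Gen3 x8 (or wider) slot leaves room for the nominal 40 or 56 Gbps. On Linux, the negotiated values can be read from `/sys/bus/pci/devices/<address>/current_link_speed` and `current_link_width`.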