InfiniBand only 10 Gbps: group rate warning

bonkersdeluxe

Member
Jan 20, 2014
27
1
23
Hi all,

I have an issue and I don't know how to solve it.
I connected a Mellanox 40 Gbit InfiniBand card to an InfiniBand switch using 40 Gbit cables.

Output of ibstatus:
Code:
root@vsrv2:~# ibstatus
Infiniband device 'mlx4_0' port 1 status:
    default gid:     fe80:0000:0000:0000:0002:c903:000a:60e9
    base lid:     0x2
    sm lid:         0x1
    state:         4: ACTIVE
    phys state:     5: LinkUp
    rate:         40 Gb/sec (4X QDR)
    link_layer:     InfiniBand

Infiniband device 'mlx4_0' port 2 status:
    default gid:     fe80:0000:0000:0000:0002:c903:000a:60ea
    base lid:     0x3
    sm lid:         0x1
    state:         4: ACTIVE
    phys state:     5: LinkUp
    rate:         40 Gb/sec (4X QDR)
    link_layer:     InfiniBand

Everything looks fine there.
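
For context on the `40 Gb/sec (4X QDR)` rate that ibstatus reports: this is the signaling rate of the link. QDR lanes signal at 10 Gbit/s each and use 8b/10b encoding, so the usable data rate is lower. A minimal sketch of that arithmetic (the function name is only for illustration):

```python
# Rough arithmetic for a 4X QDR InfiniBand link.
# Assumption: 10 Gbit/s signaling per lane and 8b/10b line encoding.

def qdr_data_rate_gbps(lanes: int = 4, lane_signal_gbps: float = 10.0) -> float:
    """Return the usable data rate in Gbit/s after 8b/10b encoding overhead."""
    encoding_efficiency = 8 / 10  # 8 data bits are carried in every 10 line bits
    return lanes * lane_signal_gbps * encoding_efficiency


if __name__ == "__main__":
    print(f"signaling rate: {4 * 10.0:.0f} Gbit/s")
    print(f"data rate:      {qdr_data_rate_gbps():.0f} Gbit/s")
```

So even in the best case a 4X QDR port carries about 32 Gbit/s of payload, not the full 40 Gbit/s.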

My /etc/network/interfaces entries for ib0 and ib1:
Code:
auto ib0
iface ib0 inet static
        address  10.10.15.2
        netmask  255.255.255.0
        pre-up modprobe ib_ipoib
        pre-up echo connected > /sys/class/net/ib0/mode
        mtu 65520

auto ib1
iface ib1 inet static
        address  10.10.15.20
        netmask  255.255.255.0
        pre-up modprobe ib_ipoib
        pre-up echo connected > /sys/class/net/ib1/mode
        mtu 65520
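
To check that the settings above were really applied after boot, the mode and MTU can be read back from sysfs. This is only a sketch: the `mode` file is the one written by the `pre-up` line above, `mtu` is the standard per-interface sysfs file, and the helper name `read_ipoib_settings` is made up for this example.

```python
from pathlib import Path


def read_ipoib_settings(iface: str, sysfs_root: str = "/sys/class/net") -> dict:
    """Read the IPoIB mode and the MTU of one interface from sysfs."""
    base = Path(sysfs_root) / iface
    return {
        "mode": (base / "mode").read_text().strip(),      # "connected" or "datagram"
        "mtu": int((base / "mtu").read_text().strip()),   # 65520 is expected here
    }


if __name__ == "__main__":
    for name in ("ib0", "ib1"):
        try:
            print(name, read_ipoib_settings(name))
        except OSError as exc:
            # Interface missing or not an IPoIB device on this machine.
            print(name, "not readable:", exc)
```

If an interface reports `datagram` or a small MTU, the `pre-up` lines did not take effect for it.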


But ibdiagnet reports a warning:

Code:
ibdiagnet
Loading IBDIAGNET from: /usr/lib/x86_64-linux-gnu/ibdiagnet1.5.7
-W- Topology file is not specified.
    Reports regarding cluster links will use direct routes.
Loading IBDM from: /usr/lib/x86_64-linux-gnu/ibdm1.5.7
-W- A few ports of local device are up.
    Since port-num was not specified (-p option), port 1 of device 1 will be
    used as the local port.
-I- Discovering ... 3 nodes (1 Switches & 2 CA-s) discovered.


-I---------------------------------------------------
-I- Bad Guids/LIDs Info
-I---------------------------------------------------
-I- No bad Guids were found

-I---------------------------------------------------
-I- Links With Logical State = INIT
-I---------------------------------------------------
-I- No bad Links (with logical state = INIT) were found

-I---------------------------------------------------
-I- General Device Info
-I---------------------------------------------------

-I---------------------------------------------------
-I- PM Counters Info
-I---------------------------------------------------
-I- No illegal PM counters values were found

-I---------------------------------------------------
-I- Fabric Partitions Report (see ibdiagnet.pkey for a full hosts list)
-I---------------------------------------------------
-I-    PKey:0x7fff Hosts:4 full:4 limited:0

-I---------------------------------------------------
-I- IPoIB Subnets Check
-I---------------------------------------------------
-I- Subnet: IPv4 PKey:0x7fff QKey:0x00000b1b MTU:2048Byte rate:10Gbps SL:0x00
-W- Suboptimal rate for group. Lowest member rate:40Gbps > group-rate:10Gbps

-I---------------------------------------------------
-I- Bad Links Info
-I- No bad link were found
-I---------------------------------------------------
----------------------------------------------------------------
-I- Stages Status Report:
    STAGE                                    Errors Warnings
    Bad GUIDs/LIDs Check                     0      0    
    Link State Active Check                  0      0    
    General Devices Info Report              0      0    
    Performance Counters Report              0      0    
    Partitions Check                         0      0    
    IPoIB Subnets Check                      0      1    

Please see /var/cache/ibutils/ibdiagnet.log for complete log
----------------------------------------------------------------
 
-I- Done. Run time was 1 seconds.

This is the warning:
-I- Subnet: IPv4 PKey:0x7fff QKey:0x00000b1b MTU:2048Byte rate:10Gbps SL:0x00
-W- Suboptimal rate for group. Lowest member rate:40Gbps > group-rate:10Gbps
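
To make clear what the two lines compare, here is a small sketch that pulls the numbers out of them. The input strings are copied from the ibdiagnet output above; the function names are hypothetical and this is not part of ibdiagnet itself.

```python
import re

# Both lines are copied verbatim from the ibdiagnet output in this post.
SUBNET_LINE = "-I- Subnet: IPv4 PKey:0x7fff QKey:0x00000b1b MTU:2048Byte rate:10Gbps SL:0x00"
WARNING_LINE = "-W- Suboptimal rate for group. Lowest member rate:40Gbps > group-rate:10Gbps"


def parse_subnet_fields(line: str) -> dict[str, str]:
    """Collect the key:value pairs (PKey, QKey, MTU, rate, SL) of the -I- line."""
    return dict(re.findall(r"(\w+):(\S+)", line))


def parse_rates(warning: str) -> tuple[int, int]:
    """Return (lowest_member_rate, group_rate) in Gbps from the -W- line."""
    match = re.search(r"member rate:(\d+)Gbps > group-rate:(\d+)Gbps", warning)
    if match is None:
        raise ValueError("not a suboptimal-rate warning line")
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    member, group = parse_rates(WARNING_LINE)
    print(parse_subnet_fields(SUBNET_LINE))
    print(f"slowest member: {member} Gbps, IPoIB group: {group} Gbps")
```

In other words, the slowest port in the group could do 40 Gbps, but the IPoIB group itself is set to a rate of 10 Gbps and an MTU of 2048 bytes.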

How can I solve it? I think the problem is the group rate: 10 Gbps instead of 40 Gbps.
My switch is a Voltaire 4036 InfiniBand switch.
The subnet manager on the switch is enabled and is the master.

Output of sm-info show:

Code:
4036-5A04# sm-info show
subnet manager info is:
         sweep_interval:               15
         max_wire_smps:                16
         lmc:                          0
         max_op_vls:                   5
         transaction_timeout:          150
         head_of_queue_lifetime:       16
         leaf_head_of_queue_lifetime:  16
         packet_life_time:             18
         sminfo_polling_timeout:       5000
         polling_retry_number:         12
         reassign_lids:                disable
         babbling_port_policy:         disable
         routing_engine_names:         minhop
         log_flags:                    7
         force_link_speed:             0
         polling_rate:                 30
         mode:                         enable
         state:                        master
         sm_priority:                  15

When I test with iperf, I only get 8.66 Gbits/sec:


Code:
Client connecting to 10.10.15.21, TCP port 5001
TCP window size: 2.50 MByte (default)
------------------------------------------------------------
[  3] local 10.10.15.2 port 41372 connected with 10.10.15.21 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  10.1 GBytes  8.66 Gbits/sec
root@vsrv2:~#
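
To put the iperf number in proportion, this sketch compares it with the 10 Gbps group rate and the 40 Gbps link rate from the outputs above. It only computes ratios and says nothing about what actually limits the throughput; the helper names are made up for the example.

```python
import re

# Result line copied from the iperf output in this post.
IPERF_LINE = "[  3]  0.0-10.0 sec  10.1 GBytes  8.66 Gbits/sec"


def parse_bandwidth_gbps(line: str) -> float:
    """Extract the bandwidth in Gbits/sec from an iperf result line."""
    match = re.search(r"([\d.]+) Gbits/sec", line)
    if match is None:
        raise ValueError("no Gbits/sec value found")
    return float(match.group(1))


def fraction_of(measured_gbps: float, reference_gbps: float) -> float:
    """Return the measured rate as a fraction of a reference rate."""
    return measured_gbps / reference_gbps


if __name__ == "__main__":
    measured = parse_bandwidth_gbps(IPERF_LINE)
    for label, rate in (("group rate", 10.0), ("link rate", 40.0)):
        print(f"{label} {rate:.0f} Gbps: {fraction_of(measured, rate):.1%}")
```

The measured value is about 87% of 10 Gbps but only about 22% of 40 Gbps.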


I hope somebody can help me. I am stuck at this point and it is giving me a headache.

Thank you!

Sincerely Bonkersdeluxe
 

bonkersdeluxe

So I guess I found it.

lspci | grep Mellanox
05:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev a0)
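
The bracketed part of that lspci description packs several things into one string. As a reading aid, this sketch splits it at the ` - ` and ` / ` separators; how the pieces should be interpreted is my assumption, and the helper name is hypothetical.

```python
import re

# Device line copied from the lspci output in this post.
LSPCI_LINE = ("05:00.0 InfiniBand: Mellanox Technologies MT26428 "
              "[ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev a0)")


def bracket_fields(line: str) -> list[str]:
    """Split the [...] part of an lspci description at ' - ' and ' / '."""
    match = re.search(r"\[(.*?)\]", line)
    if match is None:
        return []
    return [part.strip() for part in re.split(r" - | / ", match.group(1))]


if __name__ == "__main__":
    for field in bracket_fields(LSPCI_LINE):
        print(field)
```

The pieces are `ConnectX VPI PCIe 2.0 5GT/s`, `IB QDR` and `10GigE`, which appear to name the bus interface and two separate port types rather than a single speed.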

With ib_ipoib, this card's speed is 10 GbE. How can I use the 40 Gbit InfiniBand protocol with Ceph?
Thank you!

Sincerely Bonkersdeluxe
 
