Huge InfiniBand network throughput degradation with the latest 4.2.8-1-pve kernel

Hello everybody,

Below is a simple diagram of my network:

[ PVE host ib0 ] <----> [ InfiniBand switch ] <----> [ ib0 gate ix0 ] <----> [ ix0 FreeNAS box ]

(I have to use an IB-to-10GbE gateway due to the lack of InfiniBand support in FreeNAS 9.)

pve ib0: 172.16.253.15/24 (PVE network: IP over InfiniBand, connected mode)

Code:
auto ib0
iface ib0 inet static
        address  172.16.253.15
        netmask  255.255.255.0
        pre-up echo connected > /sys/class/net/ib0/mode
        mtu 9000
        up route add -host 10.10.10.254 gw 172.16.253.2
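
In case it helps, this is roughly how I verify that connected mode and the jumbo MTU are actually applied after boot (interface name as above):

Code:
# IPoIB transport mode - should print "connected", not "datagram"
cat /sys/class/net/ib0/mode
# effective MTU on the interface - should show mtu 9000
ip link show ib0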

gate ib0: 172.16.253.2
gate ix0: 10.10.10.2
FreeNAS ix0: 10.10.10.254

On each PVE host there is a static route:

Code:
Destination     Gateway         Genmask          Flags Metric Ref    Use Iface
10.10.10.254    172.16.253.2    255.255.255.255  UGH   0      0        0 ib0
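
For reference, the equivalent check (and manual re-creation) with iproute2 looks something like this, using the same addresses as above:

Code:
# which route and interface are used to reach the FreeNAS box
ip route get 10.10.10.254
# re-add the host route by hand if it is missing
ip route add 10.10.10.254/32 via 172.16.253.2 dev ib0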

This setup had been working fine for almost 2 years until the latest kernel: pve-kernel-4.2.8-1-pve.

With any previous kernel I got:
Code:
Linux pve02A 4.2.6-1-pve #1 SMP Thu Jan 28 11:25:08 CET 2016 x86_64 GNU/Linux
root@pve01C:~# iperf -c 10.10.10.254
------------------------------------------------------------
Client connecting to 10.10.10.254, TCP port 5001
TCP window size:  325 KByte (default)
------------------------------------------------------------
[  3] local 172.16.253.15 port 57150 connected with 10.10.10.254 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  9.01 GBytes  7.74 Gbits/sec

With the current kernel:
Code:
Linux pve02A 4.2.8-1-pve #1 SMP Sat Mar 19 10:44:29 CET 2016 x86_64 GNU/Linux
root@pve02A:~# iperf -c 10.10.10.254
------------------------------------------------------------
Client connecting to 10.10.10.254, TCP port 5001
TCP window size:  325 KByte (default)
------------------------------------------------------------
[  3] local 172.16.253.15 port 39912 connected with 10.10.10.254 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.2 sec  14.0 MBytes  11.5 Mbits/sec
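
For what it's worth, these are roughly the per-interface settings I compare between the fast and the slow kernel (offload features can change between kernel versions):

Code:
# offload features currently enabled on ib0
ethtool -k ib0 | grep -E 'segmentation|receive-offload'
# MTU plus RX/TX error and drop counters
ip -s link show ib0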

I've tried and checked (the exact commands are sketched below):
- IB driver versions on all the hosts and the gate (all the same)
- Rebooted the IB switch (just in case)
- Rebooted the FreeNAS box (just in case)
- Changed the MTU on the IB interface
- Checked on 3 different nodes (Supermicro/Dell)
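
Roughly, the driver-side checks from the list above look like this on each node:

Code:
uname -r                        # confirm which kernel is booted
modinfo ib_ipoib | head -n 5    # IPoIB module version and source
dmesg | grep -iE 'ib0|ipoib'    # IB-related kernel messages since boot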

When I downgrade the kernel to a previous version, network throughput returns to normal values.
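
For now I just keep the known-good kernel installed alongside the new one and pick it in the GRUB menu at boot (package name assumed from the usual pve-kernel naming):

Code:
# keep the working kernel available next to 4.2.8-1-pve
apt-get install pve-kernel-4.2.6-1-pve
# confirm both kernel packages are present
dpkg -l | grep pve-kernel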

Any ideas on what could be the reason for this behavior? I would very much appreciate any assistance!

Kind regards,