Hello everybody,
Below is a simple diagram of my network:
[ pve host ib0 ] <-----> [infiniband switch] <----> [ ib0 gate ix0 ] <----> [ix0 FreeNas box]
(I have to use an IB-to-10GbE gateway because FreeNAS 9 lacks InfiniBand support.)
pve ib0: 172.16.253.15/24 (PVE network: ethernet over infiniband)
Code:
auto ib0
iface ib0 inet static
    address 172.16.253.15
    netmask 255.255.255.0
    pre-up echo connected > /sys/class/net/ib0/mode
    mtu 9000
    up route add -host 10.10.10.254 gw 172.16.253.2
gate ib0: 172.16.253.2
gate ix0: 10.10.10.2
FreeNas ix0: 10.10.10.254
On each PVE host there is a static route:
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.10.10.254    172.16.253.2    255.255.255.255 UGH   0      0        0 ib0
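For reference, the same host route can be expressed with iproute2 instead of the legacy `route` tool. This is only a sketch: the helper name `host_route_cmd` is hypothetical (it is not part of my setup) and it only prints the command rather than running it, since adding a route needs root.

```shell
#!/bin/sh
# Hypothetical helper: prints the iproute2 equivalent of
#   route add -host <target> gw <gateway>
# It does not modify the routing table.
host_route_cmd() {
    target="$1"
    gateway="$2"
    iface="$3"
    # A host route is a /32 prefix reached via the gateway.
    echo "ip route add ${target}/32 via ${gateway} dev ${iface}"
}

host_route_cmd 10.10.10.254 172.16.253.2 ib0
```

Running `ip route get 10.10.10.254` on a PVE host is a read-only way to confirm that traffic to the FreeNAS box really leaves through ib0 via 172.16.253.2.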
This setup had been working fine for almost 2 years, until the latest kernel: pve-kernel-4.2.8-1-pve
With any previous kernel I got:
Code:
Linux pve02A 4.2.6-1-pve #1 SMP Thu Jan 28 11:25:08 CET 2016 x86_64 GNU/Linux
root@pve01C:~# iperf -c 10.10.10.254
------------------------------------------------------------
Client connecting to 10.10.10.254, TCP port 5001
TCP window size: 325 KByte (default)
------------------------------------------------------------
[ 3] local 172.16.253.15 port 57150 connected with 10.10.10.254 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 9.01 GBytes 7.74 Gbits/sec
With the current kernel:
Code:
Linux pve02A 4.2.8-1-pve #1 SMP Sat Mar 19 10:44:29 CET 2016 x86_64 GNU/Linux
root@pve02A:~# iperf -c 10.10.10.254
------------------------------------------------------------
Client connecting to 10.10.10.254, TCP port 5001
TCP window size: 325 KByte (default)
------------------------------------------------------------
[ 3] local 172.16.253.15 port 39912 connected with 10.10.10.254 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.2 sec 14.0 MBytes 11.5 Mbits/sec
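To put the two results side by side, here is a small sketch that normalizes the bandwidth field of the two iperf summary lines above to Mbits/sec and prints the ratio. The function name `iperf_mbits` is just an illustrative helper, and it assumes the bandwidth is the last two fields of the line, as in the output shown.

```shell
#!/bin/sh
# Illustrative helper: read an iperf summary line on stdin and print its
# bandwidth in Mbits/sec. Assumes the last two fields are "<value> <unit>".
iperf_mbits() {
    awk '/bits\/sec/ {
        value = $(NF-1); unit = $NF
        if (unit == "Gbits/sec") value *= 1000
        else if (unit == "Kbits/sec") value /= 1000
        printf "%.1f\n", value
    }'
}

old=$(echo "[ 3] 0.0-10.0 sec 9.01 GBytes 7.74 Gbits/sec" | iperf_mbits)
new=$(echo "[ 3] 0.0-10.2 sec 14.0 MBytes 11.5 Mbits/sec" | iperf_mbits)

# Ratio between the 4.2.6 and 4.2.8 results.
awk -v a="$old" -v b="$new" 'BEGIN { printf "slowdown: ~%.0fx\n", a / b }'
```

With the numbers above this prints `slowdown: ~673x`, so the drop is far beyond what normal run-to-run variation would explain.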
Here is what I have tried and checked so far:
- IB driver versions on all the hosts and the gateway (all the same)
- Rebooted the IB switch (just in case)
- Rebooted the FreeNAS box (just in case)
- Changed the MTU on the IB interface
- Tested on 3 different nodes (Supermicro/Dell)
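When checking the interface, it is worth confirming that the mode and MTU from the ib0 config are actually in effect after booting a given kernel. A minimal sketch, assuming the standard sysfs layout: the function name `show_ib_settings` and the `SYSFS_ROOT` override are hypothetical additions for illustration, and the override exists only so the function can be pointed at a test directory.

```shell
#!/bin/sh
# Hypothetical helper: print the IPoIB mode and MTU of an interface as the
# kernel reports them. SYSFS_ROOT is an assumed override, default /sys.
show_ib_settings() {
    iface="$1"
    root="${SYSFS_ROOT:-/sys}"
    for attr in mode mtu; do
        file="$root/class/net/$iface/$attr"
        if [ -r "$file" ]; then
            echo "$iface $attr: $(cat "$file")"
        else
            # Interface missing, or attribute not exposed by this driver.
            echo "$iface $attr: unavailable"
        fi
    done
}

show_ib_settings ib0
```

On a node configured as above, I would expect this to report `connected` and `9000` under both kernels; a different value under 4.2.8 would point at the interface setup rather than at the gateway or FreeNAS.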
When I downgrade the kernel to a previous version, network throughput returns to normal values.
Any ideas on what could be causing this behavior? I would very much appreciate any assistance!
Kind regards,