infiniband configuration questions

RobFantini

We're following http://pve.proxmox.com/wiki/Infiniband#Without_Bonding on a pair of machines used in PVE as NFS storage.

After bringing up the interface, we get this from dmesg:
Code:
[ 9685.078436] ib0: enabling connected mode will cause multicast packet drops

[ 9685.082542] ib0: mtu > 2044 will cause multicast packet drops.

[ 9685.083261] ADDRCONF(NETDEV_UP): ib0: link is not ready

Are the multicast messages something we should deal with before proceeding?
 
Re: infiniband : "enabling connected mode will cause multicast packet drops"

Commenting out these 2 lines from the /etc/network/interfaces suggested by the wiki eliminated the multicast warnings:
Code:
#        pre-up echo connected > /sys/class/net/ib0/mode
#        mtu 65520

I'm not sure if that causes other issues...
 
Re: infiniband : "enabling connected mode will cause multicast packet drops"

Connected mode is much faster with IPoIB. If I remember right, without connected mode I only got about 1 Gbit/s of throughput.
A larger MTU is also much faster than an MTU of 1500.

If the only multicast application you have on the Infiniband network is Proxmox itself, then do not worry about the warning:
Code:
[ 9685.082542] ib0: mtu > 2044 will cause multicast packet drops.
The reason is that totem has its own MTU setting, which defaults to 1500 and works fine with IPoIB in connected mode.

If you are worried about it, see my notes here:
http://pve.proxmox.com/wiki/Multicast_notes#Multicast_with_Infiniband
As mentioned there, it is still untested, as I have never seen a need to change the netmtu for totem.
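
To see which mode and MTU an interface is actually running with, something like the following may help. It is only a rough sketch: the function name and the optional second argument are made up for illustration, the 2044 threshold is simply the number from the dmesg line above, and the `mode` and `mtu` files are the ones Linux exposes under /sys/class/net for an IPoIB interface.

```shell
# Threshold taken from the dmesg warning "mtu > 2044 will cause multicast packet drops".
IPOIB_MCAST_MTU=2044

# Print the IPoIB mode and MTU of an interface and say whether the
# multicast warning applies.
# $1: interface name, e.g. ib0
# $2: sysfs root (hypothetical extra argument, only useful for testing)
ipoib_report() {
    iface="$1"
    sysfs_root="${2:-/sys/class/net}"
    mode=$(cat "$sysfs_root/$iface/mode") || return 1
    mtu=$(cat "$sysfs_root/$iface/mtu") || return 1
    if [ "$mtu" -gt "$IPOIB_MCAST_MTU" ]; then
        verdict="multicast packets over $IPOIB_MCAST_MTU bytes may be dropped"
    else
        verdict="mtu within multicast limit"
    fi
    echo "$iface: mode=$mode mtu=$mtu ($verdict)"
}
```

Running `ipoib_report ib0` on a node prints one summary line. With totem at its default netmtu of 1500, the "may be dropped" verdict should not matter for the cluster traffic itself.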

I currently have 20 nodes in a single Proxmox cluster using IPoIB in connected mode.
Proxmox is using the IPoIB network for its multicast clustering communications.
16 of those nodes are also replicating data using DRBD on the IPoIB network.

Works perfectly fine for me.

In a little over one year we have only had one problem with Infiniband.
The master subnet manager on one of our four switches stopped allowing topology changes for some odd reason.
Since every server had bonded connections to redundant switches, I decided to just power cycle the bad switch. The problem went away, and the bonded connections failed over to the other switch just fine with no disruption.
 
Re: infiniband : "enabling connected mode will cause multicast packet drops"

e100 - thank you for the response.

what type of infiniband switch do you use?
 
Re: infiniband : "enabling connected mode will cause multicast packet drops"

e100- thank you for your help so far... IB would have been impossible to get done without your help.

We have a Topspin 120 with 6 systems attached. Ping and ssh work using the InfiniBand [ib] IP addresses.

I have a couple of questions.

1- We have another Topspin 120 on order from eBay. How do you connect them together? I had assumed you just connect a cable to each switch, but the hardware manual says to use an 'InfiniBand cable inter-switch link (ISL)'.

2- To connect to the LAN and WAN, do you use the IB network and routing, or just use Ethernet NICs and networking on the PVE hosts?



Anyone else who knows about Topspin / InfiniBand in a PVE setup, please chime in.
 
Hi Rob, always glad to help especially when I can share knowledge on a subject that few others have dealt with. You will be helping someone in the future and all of our knowledge will grow.

Wish I had more free time to post here. I have been far too busy lately, and Proxmox has been running flawlessly, so nothing to complain about. ;-)

I turned on the subnet manager on all switches, then just connected them with the same cables you use for servers. They will elect a master and just work. Wish Ethernet was this easy!

Use two or more cables for redundancy. No setup is required, just plug them in.

We have four switches, two per cabinet.
Switch A and B in one cabinet
Switch C and D in other
List of every connection between them:
2 links A - B
2 links C - D
1 link A - C
1 link A - D
1 link B - C
1 link B - D
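
As a quick sanity check on the list above, here is a minimal sketch that tallies how many ports each switch spends on inter-switch cables. The function names are made up for illustration; the link counts are copied straight from the list.

```shell
# Inter-switch links from the list above, one "count endpoint endpoint" per line.
isl_links() {
    cat <<'EOF'
2 A B
2 C D
1 A C
1 A D
1 B C
1 B D
EOF
}

# Read "count endpoint endpoint" lines on stdin and print, per switch,
# how many ports are used for inter-switch cables.
isl_ports_per_switch() {
    awk '{ ports[$2] += $1; ports[$3] += $1 }
         END { for (s in ports) print s, ports[s] }' | sort
}

# Total number of inter-switch cables.
isl_cable_count() {
    awk '{ n += $1 } END { print n }'
}
```

`isl_links | isl_ports_per_switch` reports 4 ports on each of A, B, C and D, and `isl_links | isl_cable_count` reports 8 cables in total. Every switch has a direct link to each of the other three, so losing any one switch or cable still leaves the rest connected.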

Every node has dual IB ports with each port connected to a different switch.

Switch failures, link failures, it just keeps on working.

We had the master subnet manager get into some weird state once. We power cycled that switch and there was not even a blip in connectivity.

It is the subnet manager that makes it all work, detecting redundant paths and such.

IB can be more complex, with multiple subnets and ports assigned to subnets, but with a single subnet it is super simple.

My IB network is isolated and carries only Proxmox cluster and DRBD traffic. Because these switches are EOL and will not get any software updates, it could be risky to make them Internet accessible.

I use bonded gig Ethernet to connect to everything else.

Did you get the latest firmware on your switch? All of mine came with older firmware.

There are also some java based utilities that you might find handy.
 
OK, I see it is good to have storage and PVE on an InfiniBand network.

For PVE https and services running in VMs, like ssh, imap and smtp, we'd use the Ethernet network to connect to the LAN and WAN. Do I have that correct?
 
e100 : I've a couple more questions.

Is firmware available without a Cisco support contract? I've spent 2 hours trying to get it, and it looks like we have to purchase support.

Subnet manager - I've read the docs and see where to add it, but could you suggest settings that work?

Thank you.
 
e100 - thanks, the firmware is all set. I'll add info to the wiki later... for now I have a lot of notes on our internal one.

Next question :D

Subnet manager - I've set it up and connected the two switches with a cable.

Both show as 'Oper-Status' standby

and do not show up on 'Topology'.

I'll continue to check the manual and search.
But how do you tell when the subnet is working OK, so that we have an active and a standby connection from each NFS and PVE host?
 
Here is the config from one of my switches, they are all the same other than ip addresses/names:
Code:
Topspin-120 #1> show config
!   TopspinOS-2.9.0/build170
!   Sat Aug  3 00:11:14 2013
enable
config terminal
!
boot-config primary-image-source TopspinOS-2.9.0/build170
!
ib sm subnet-prefix fe:80:00:00:00:00:00:00 priority 0
!
interface mgmt-ethernet
 ip address x.x.x.x 255.255.255.0
 gateway x.x.x.x
 no shutdown
!
location "Cabinet #2"
!
ntp server-one x.x.x.x
ntp server-two x.x.x.x
!
username admin password 7 xxxxxxx
username admin community-string xxxxxxx
username guest password 7 xxxxxxx
username guest community-string xxxxxx
username super password 7 xxxxxxxx
username super community-string xxxxxxx
!
!
snmp-server contact "x@x.com"
hostname "Topspin-120 #1"
!

Looking for the master subnet manager is not as simple as it seems: the command outputs different data depending on whether the switch you run it on is master or standby.

From a standby you will only see the master:
Code:
Topspin-120 #2> show ib sm sm-info subnet-prefix fe:80:00:00:00:00:00:00




================================================================================
                      Discovered Subnet Managers in Fabric
================================================================================
            subnet-prefix : fe:80:00:00:00:00:00:00
                port-guid : 00:05:ad:00:00:02:xx:xx
                 priority : 0
                 sm-state : master
                   sm-key : 00:00:00:00:00:00:00:00
                act-count : 1091199584

From a master you only see the standbys:
Code:
Topspin-120 #1# show ib sm sm-info subnet-prefix fe:80:00:00:00:00:00:00




================================================================================
                      Discovered Subnet Managers in Fabric
================================================================================
            subnet-prefix : fe:80:00:00:00:00:00:00
                port-guid : 00:05:ad:00:00:02:xx:xx
                 priority : 0
                 sm-state : standby
                   sm-key : 00:00:00:00:00:00:00:00
                act-count : 40572476


            subnet-prefix : fe:80:00:00:00:00:00:00
                port-guid : 00:05:ad:00:00:02:xx:xx
                 priority : 0
                 sm-state : standby
                   sm-key : 00:00:00:00:00:00:00:00
                act-count : 27459063


            subnet-prefix : fe:80:00:00:00:00:00:00
                port-guid : 00:05:ad:00:00:02:xx:xx
                 priority : 0
                 sm-state : standby
                   sm-key : 00:00:00:00:00:00:00:00
                act-count : 25564059

As you can see this is really simple to configure once you know how to do it.
That ISO also has some utilities that can be useful in understanding how it all works. They connect to the switches using SNMP, so you will need to set the community strings like I have in my config above.
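
The master/standby rule above can be turned into a small check on saved `show ib sm sm-info` output. This is only a sketch based on the behaviour described in this post (TopspinOS 2.9.0); the function name is made up, and it assumes the output looks like the samples above.

```shell
# Read "show ib sm sm-info" output on stdin and guess the role of the
# switch the command was run on:
#   a master is listed   -> we were on a standby
#   only standbys listed -> we were on the master
#   nothing listed       -> unknown
sm_local_role() {
    states=$(sed -n 's/^[[:space:]]*sm-state[[:space:]]*:[[:space:]]*\([a-z]*\).*$/\1/p')
    case "$states" in
        *master*)  echo standby ;;
        *standby*) echo master ;;
        *)         echo unknown ;;
    esac
}
```

For example, save the command output from a switch to a file and run `sm_local_role < sm-info.txt` (the file name is just an example).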
 
e100 - I've tried to install the topspin 'element manager' program to win7 and wheezy... both failed. Which operating system do you run em from?
 
I've installed this on Ubuntu many times with no problem, and I bet it will work on Debian just as well.
If you use x64 it is more difficult, as they only provide 32-bit or Itanium binaries, and I do not know anyone using Itanium for workstations...

I recently rebuilt my machine using Ubuntu 13.04 and needed to re-install this anyway, so I documented it. I am quite confident that this procedure will work on other versions of Ubuntu/Debian too.

Code:
cd /media/username/CDROM/em/Linux
cp install_linux_x86.bin ~/
cd ~
chmod 755 install_linux_x86.bin        
./install_linux_x86.bin
rm install_linux_x86.bin

To Run:
~/Cisco_SFS_EM

Enter in hostname/ip
Enter in community string you want to use (should have previously set this when configuring the switches)

Once open go to Infiniband -> Topology View
Add the ip/name/community string for all switches in this topology

NOTE for x64 systems: I already had ia32-libs-multiarch and some other things installed, so I am not sure whether this is the only library you will need to install to run this:
Code:
sudo apt-get install ia32-libs-multiarch
 
This is the contents of the Launcher I created:

Code:
$ cat IB_Manager.desktop
[Desktop Entry]
Name=Cisco IB Manager
Comment=Cisco IB Manager
Exec=/home/username/CiscoSFSElementManager/Cisco_SFS_EM
Icon=/home/username/CiscoSFSElementManager/help/EMhelp/images/FC3429.jpg
Terminal=false
Type=Application
 
On Wheezy, the installer failed with this error:
Code:
t430u  ~ $ ./install_linux_x86.bin
Preparing to install...
Extracting the JRE from the installer archive...
Unpacking the JRE...
Extracting the installation resources from the installer archive...
Configuring the installer for this system's environment...
strings: '/lib/libc.so.6': No such file


Launching installer...


Invocation of this Java Application has caused an InvocationTargetException. This application will now exit. (LAX)


Stack Trace:
java.lang.UnsatisfiedLinkError: /tmp/install.dir.7573/Linux/resource/jre/lib/i386/libawt.so: libXp.so.6: cannot open shared object file: No such file or directory
        at java.lang.ClassLoader$NativeLibrary.load(Native Method)
        at java.lang.ClassLoader.loadLibrary0(Unknown Source)
        at java.lang.ClassLoader.loadLibrary(Unknown Source)
        at java.lang.Runtime.loadLibrary0(Unknown Source)
        at java.lang.System.loadLibrary(Unknown Source)
        at sun.security.action.LoadLibraryAction.run(Unknown Source)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.awt.NativeLibLoader.loadLibraries(Unknown Source)
        at sun.awt.DebugHelper.<clinit>(Unknown Source)
        at java.awt.Component.<clinit>(Unknown Source)
        at com.zerog.ia.installer.LifeCycleManager.f(DashoA8113)
        at com.zerog.ia.installer.LifeCycleManager.g(DashoA8113)
        at com.zerog.ia.installer.LifeCycleManager.a(DashoA8113)
        at com.zerog.ia.installer.Main.main(DashoA8113)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at com.zerog.lax.LAX.launch(DashoA8113)
        at com.zerog.lax.LAX.main(DashoA8113)
This Application has Unexpectedly Quit: Invocation of this Java Application has caused an InvocationTargetException. This application will now exit. (LAX)

apt-file search shows:
Code:
apt-file search libXp.so.6
libxp6: /usr/lib/i386-linux-gnu/libXp.so.6
libxp6: /usr/lib/i386-linux-gnu/libXp.so.6.2.0

So the above was fixed, and the Element Manager installs, after:
Code:
aptitude install libxp6
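
The same steps (find the missing library in the stack trace, then look up which package ships it) can be sketched as two small filters. The function names are made up for illustration, and the patterns only assume the message formats shown above.

```shell
# Read installer output on stdin and print the shared libraries that
# the loader could not find ("...: libXp.so.6: cannot open shared object file").
missing_libs_from_trace() {
    sed -n 's/.*: \([^: ]*\.so[.0-9]*\): cannot open shared object file.*/\1/p' | sort -u
}

# Read "apt-file search" output on stdin ("package: /path/to/file")
# and print the unique package names.
packages_from_apt_file() {
    cut -d: -f1 | sort -u
}
```

With the installer output saved to a file (install.log is just an example name), `missing_libs_from_trace < install.log` prints the library name, and `apt-file search libXp.so.6 | packages_from_apt_file` prints the package to install.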


I still need to test the program after I configure the snmp stuff.
 
e100 - until now I've never used SNMP, so I need to learn about the SNMP server and the Topspin SNMP client.

I'm setting up an SNMP server on Wheezy. This is in /etc/snmp/snmpd.conf:
Code:
#  AGENT BEHAVIOUR
agentAddress udp:161,udp6:[::1]:161

#  ACCESS CONTROL
rocommunity  admin   10.0.0.0/8

I've been trying to follow the Topspin CLI manual, referencing the config you posted above.

So from the CLI on the Topspin switch:
Code:
snmp-server host x.x.x.x
snmp-server user admin enable
snmp-server user admin privilege unrestricted-rw

Is the above heading in the correct direction?
 