Infiniband Partition in Proxmox

I have a working IPoIB setup here. By partition do you mean P_Keys?

Yep, P_Keys is what I mean. I do have ib_ipoib working just fine in my Proxmox+Ceph cluster. No issues, although I am not getting the full 40Gbps due to ib_ipoib.
But I would like to set up P_Keys, similar to VLANs on Ethernet, so I can put multiple subnets on the same IB fabric.
 

I did not read the exact documentation you posted, but I followed a similar how-to.
I was able to successfully configure child IB interfaces locally on two nodes. They even responded to IB host ping, but there was no response when pinged by IP address. It also took down my IB switch; the switch needed a power cycle to restore IB connectivity. I am thinking I broke something with the subnet manager. My subnet manager runs on one of the Proxmox nodes using opensm. I made changes to /etc/opensm/partitions.conf and then restarted opensm.

smpquery even showed the new P_Key 0x8010 along with the default one, 0xffff. Do you have partitions configured fine, mir? Instead of restarting opensm, should I just reboot the node?
 
Finally!!! The partitions are up and running on our Infiniband QDR!!
For testing I created 3 partitions/vlans with separate subnets. For the last several hours they have been communicating on the IB fabric without issue. Here are the steps I used:
Code:
Partition configuration is stored in:
/etc/opensm/partitions.conf

If the file does not exist, create it and add the following lines:
Default=0xffff, ipoib : ALL=full;
vlan1=0x8001, ipoib : ALL=full;  # ALL can also be set to 'limited'
vlan2=0x8002, ipoib : ALL=full;  # ALL can also be set to 'limited'

Restart opensm : # service opensm restart

Check the partition keys : # smpquery pkeys 4

To get the LID of the node where OpenSM is running : # sminfo

Add child Interface for IB: # echo 1 > /sys/class/net/ib0/create_child

Check that the interface was created : # ifconfig ib0.8001

Configure the interface in : # nano /etc/network/interfaces

Start the IB interface : #ifup ib0.8001
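
In case it helps, the stanza in /etc/network/interfaces for the child interface can look something like this (the 10.10.1.0/24 addressing is only an example, use your own subnet):

Code:
# Child interface for partition 0x8001; the subnet below is only an example
auto ib0.8001
iface ib0.8001 inet static
        address 10.10.1.11
        netmask 255.255.255.0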

The reason it was not working for me before is that I was using a different hex ID. For example, I was trying to create vlan10 with key 0x000a. I even tried 0xa, but that did not work. The only way everything worked was when I used the 0x8001 format. When I tried to create a sub-interface with key 000a it was created as ib0.800a. I am not sure why and I have not found any info on that, but even the 0x8000 series gives quite a few vlan/partition IDs, so all good!
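
My best guess is that the top bit of a P_Key (0x8000) marks full membership and the IPoIB driver always sets it when creating a child interface, which would explain why asking for 0x000a ends up as ib0.800a. For example (the pkey value here is only illustrative):

Code:
# Ask for pkey 0x000a; the driver ORs in the full-membership bit (0x8000),
# so the child interface shows up as ib0.800a
echo 0x000a > /sys/class/net/ib0/create_child
ip link show ib0.800a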
 
It appears that I have celebrated prematurely. Although the IB partitions between nodes work just fine, I don't seem to be able to create a virtual bridge with these additional IB partitioned ports. I also tried Open vSwitch and that did not work either. What I have found so far from Googling is not good: there may not be any way to create a virtual bridge with IB.

Anybody got an idea whether creating a bridge with IB ports is out of the question or not?
What would be the best way to connect the IB to a VM?
 
Anybody got an idea whether creating a bridge with IB ports is out of the question or not?
What would be the best way to connect the IB to a VM?

I looked into this a couple of years ago and my conclusion was that it's not possible.
The only way to get IB into a guest that I can think of is to pass the IB card directly to the guest, but that's not much of a solution.
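
For anyone who wants to try the passthrough route anyway, it would look roughly like this on Proxmox (the VM ID 100 and the PCI address are made up, and IOMMU has to be enabled on the host):

Code:
# Find the HCA's PCI address (the address below is only an example)
lspci | grep -i -e infiniband -e mellanox

# Hand the whole card to VM 100
qm set 100 -hostpci0 02:00.0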

Maybe something like this could solve the problem: http://vde.sourceforge.net/
KVM -> VDE over IPoIB? I've searched and found nothing on using VDE over Infiniband.

Or maybe using EoIB (Ethernet over IB) instead of IPoIB would open up new doors? http://www.mellanox.com/related-docs/prod_gateway_systems/EoIB_README-1.5.1-1.3.6.txt
 
If I understand correctly, we can't use Infiniband for the VMs (guests)?

I have successfully created Ceph and it's communicating fine over the Infiniband NIC (ib0).

I have another Infiniband switch which I would like to use for the VMs to push backups out to dedicated backup servers. But as symmcom has pointed out, there isn't a way to create a bridge with ib0.

So this means even if I install a separate Infiniband NIC (ib1), I will not be able to use it with the VMs on a separate switch?
 
That is correct. Infiniband simply can't be served directly to a VM. Backup was the primary reason I wanted to create vlan-like partitions, to push out backups. But it did not work. To address this, what we did is set up a couple of Proxmox nodes in the same cluster with ZFS and Gluster on top and attach the Gluster cluster to Proxmox storage. This way backups are done on a separate IB device on separate storage. Worked out fine.

 
To address this, what we did is set up a couple of Proxmox nodes in the same cluster with ZFS and Gluster on top and attach the Gluster cluster to Proxmox storage.

Thank you very much! That helps a lot.

Could you please give details on how to set up Proxmox nodes in the same cluster with ZFS and Gluster on top and then attach the cluster to Proxmox storage?
 
Could you please give details on how to set up Proxmox nodes in the same cluster with ZFS and Gluster on top and then attach the cluster to Proxmox storage?
There are no tricks to this really. I simply followed the Proxmox wiki to set up a Proxmox node with ZFS and Gluster. Below are the simplified steps:
1. Set up 2 nodes with an equal number of HDDs. I used one SSD for the OS and four 4TB SATA drives in each node.
2. Install Proxmox on both nodes.
3. Create a ZFS pool on each node using the following command:
#zpool create -f -o ashift=12 <pool_name> /dev/sdb /dev/sdc /dev/sdd /dev/sde
4. Set the mountpoint and mount the ZFS pool using the following commands:
#zfs set mountpoint=/mnt/zfspool <pool_name>
#zfs mount -a
5. Install Gluster using the following documentation:
http://www.gluster.org/community/documentation/index.php/Getting_started_install
6. Run the following commands on each node to attach them to the Gluster cluster:
On node 1: #gluster peer probe <node2_IP>
On node 2: #gluster peer probe <node1_IP>
7. Run the following command on node 1 to create the Gluster volume:
#gluster volume create <name> replica 2 transport tcp <node1_IP>:/mnt/zfspool/<name> <node2_IP>:/mnt/zfspool/<name>
8. Start the Gluster volume: #gluster volume start <name>
9. Set the Gluster volume authorization to a specific subnet:
#gluster volume set <name> auth.allow 127.0.0.1,<pmx.cluster.subnet.*>
10. Attach the gluster storage to Proxmox using the GUI and gluster storage plugin.
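
If you prefer editing the config directly instead of using the GUI, the resulting entry in /etc/pve/storage.cfg looks something like this (the storage ID 'glusterbackup' is just an example name):

Code:
glusterfs: glusterbackup
        server <node1_IP>
        server2 <node2_IP>
        volume <name>
        content backup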
 
Thank you very much.

1) I am assuming these two nodes are on a separate Infiniband switch. Correct?

2) Are these two nodes on their own Proxmox cluster? Do you install Ceph on them?

3) Since these are on a separate Infiniband switch, how do you get this cluster to communicate with the other Ceph cluster or Proxmox cluster to back up the VMs?

Thanks in advance for your help.
 
1) I am assuming these two nodes are on a separate Infiniband switch. Correct?
You can set it up either way: with a separate switch on separate IB interface cards, or coexisting with your existing IB network for Ceph if you have one. If you are using 10+Gbps IB, the backup will not consume the entire bandwidth since HDD write speed is the limiting factor. We have both setups and the performance difference is not that noticeable. If you are not sharing the existing IB switch, then you have to drop extra IB cards into all the nodes in the cluster, connected through the separate switch.
2) Are these two nodes on their own Proxmox cluster? Do you install Ceph on them?
The nodes do not need to be on their own cluster at all. You also do not need to install Ceph on them, since their only duty is to serve Gluster on ZFS for backup storage. The nodes can be part of the existing Proxmox cluster so you can monitor them through the same GUI. In that case, the nodes will have one interface (Gigabit) for Proxmox cluster communication while the IB interface is used for backup.
3) Since these are on a separate Infiniband switch, how do you get this cluster to communicate with the other Ceph cluster or Proxmox cluster to back up the VMs?
By installing separate IB cards in each node. Let's assume in your existing environment each node has one Gigabit NIC used for Proxmox cluster communication and one IB NIC used for the Ceph network. For your backup network you drop another IB NIC into each node and configure them with a new subnet. You are basically creating a new network for backups.
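
For example, the extra IB card in each node could get a stanza like this in /etc/network/interfaces (the ib1 name and the 10.10.20.0/24 subnet are assumptions, adjust to your environment):

Code:
# Load the IPoIB module and give the backup NIC its own subnet
auto ib1
iface ib1 inet static
        address 10.10.20.11
        netmask 255.255.255.0
        pre-up modprobe ib_ipoib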

Hope this makes sense and I did not make it any harder.
 
