[TUTORIAL] ConnectX VPI & Proxmox 6.2

DrillSgtErnst

Active Member
Jun 29, 2020
Hello Fellas,
Since there are many, many posts about people trying to get cheap Mellanox ConnectX VPI 40 Gb/s cards working on Proxmox, and I really had a hard time getting mine to work, I want to share my insights and hopefully make some people's day.

First off: I am running a cluster of 4 Supermicro X9 SuperServers with 2 E5 CPUs each. There are 4 Micron 5200 MAX SSDs attached to an HBA for Ceph. And yeah, it is working great so far.
Please note: I am using a Voltaire VLT-4036 InfiniBand switch. Therefore I did not have to configure OpenSM on one or more nodes of the cluster; the subnet manager runs on the switch itself.

So what did I do?

Firstly I downloaded and dd'ed the Proxmox ISO.
I installed Proxmox normally on a single SSD with XFS. I don't think this is relevant, but I am mentioning it just to be sure.

I ran
lspci | grep -i Mell
to be sure the Mellanox cards are visible, and it listed my card.
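If the card shows up but you want to check the driver side as well, a quick sanity check (assuming the mlx4 driver stack, which these first-gen ConnectX cards use) is:

lspci -k | grep -iA 3 mell
lsmod | grep -E 'mlx4|ib_'

The first command shows which kernel driver is bound to the card, the second whether the mlx4/InfiniBand modules are loaded.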


I am running my setup behind an HTTP proxy (Squid), therefore my next steps aren't necessarily needed for you. NTP is not passed through either. Because of that I put them in a spoiler.
nano /etc/environment

#HTTP:
http_proxy=http://10.100.3.210:8080
#HTTPS:
https_proxy=https://10.100.3.210:8080
#FTP:
ftp_proxy=ftp://10.100.3.210:2121

no_proxy=localhost,*.contonso.local,<your local subnet>/<your subnet mask>
soap_use_proxy=on
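Note that /etc/environment is only read at login, so to verify the variables actually took effect I just logged out, logged back in and ran:

env | grep -i proxy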


Buuuut apt will not respect that, so we have to configure apt itself
nano /etc/apt/apt.conf.d/70debconf

Acquire::http
{
Proxy "http://Proxy-IP:proxy-Port";
}
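If your repositories are reachable over HTTPS you may need a matching https block as well; apt understands a separate Acquire::https option in the same file (just an assumption that you need it for your mirrors):

Acquire::https
{
Proxy "http://Proxy-IP:proxy-Port";
}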


Yeah, and I don't let NTP out for security reasons, so my firewall acts as the NTP server.

nano /etc/systemd/timesyncd.conf

NTP=<NTP server IP> (e.g. the local firewall)

systemctl restart systemd-timesyncd
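To check that the sync against the firewall actually works:

timedatectl
timedatectl timesync-status

The first should show "System clock synchronized: yes", the second shows which server is actually being used (the timesync-status subcommand is available on Buster's systemd, as far as I know).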

I added the Proxmox repo, pve-no-subscription, for test purposes only.
nano /etc/apt/sources.list

# NOT recommended for production use
deb http://download.proxmox.com/debian/pve buster pve-no-subscription

Installing updates is kinda generic, but I also install the pve-headers, reboot, and after startup load the IP over InfiniBand module:
apt update
apt dist-upgrade -y
apt install -y pve-headers
reboot
modprobe ib_ipoib
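If you want ib_ipoib loaded on every boot independently of the network config (the pre-up lines further down handle that as well, so this is optional), the usual Debian way is:

echo ib_ipoib >> /etc/modules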

Then I have to add the network interfaces manually, because Proxmox doesn't recognize them automatically.
I need the interface names Linux assigned for this.
Looking up the interface names:

ip a

The list shows ibp2s0 and ibp2s0d1.

Adding them to the interface list:

nano /etc/network/interfaces

auto ibp2s0
iface ibp2s0 inet static
    address 10.99.0.1 (Ceph cluster network IP)
    netmask 255.255.255.0
    pre-up modprobe ib_ipoib
    pre-up echo connected > /sys/class/net/ibp2s0/mode
    mtu 65520

auto ibp2s0d1
iface ibp2s0d1 inet static
    address 10.98.0.1 (Ceph public network IP)
    netmask 255.255.255.0
    pre-up modprobe ib_ipoib
    pre-up echo connected > /sys/class/net/ibp2s0d1/mode
    mtu 65520

Bringing up the adapters:

ifup ibp2s0
ifup ibp2s0d1
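Just as a sanity check (not required), you can verify that connected mode and the big MTU really took effect:

cat /sys/class/net/ibp2s0/mode
ip link show ibp2s0 | grep mtu

The first should print "connected", the second should show mtu 65520 and state UP.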
Well, to my surprise the interfaces were named exactly the same on all hosts, so these steps were identical on every node in my case.

Now the adapters do show up with an "unknown" connection type in the GUI.

The Adapters do survive reboots.

I tried a first test with pings. Well, don't do that.
Just stubbornly go ahead and configure Ceph.
For me it worked that way.
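In case it helps anyone, on the CLI this boils down to roughly the following (a sketch using the subnets from above, 10.98.0.0/24 as public and 10.99.0.0/24 as cluster network; exact pveceph subcommands may differ between PVE versions):

pveceph install
pveceph init --network 10.98.0.0/24 --cluster-network 10.99.0.0/24
pveceph mon create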


Well that ends my little TED-Talk.

Hope I could help some Proxmox Beginners like me with my verbose explanations.
 
Hi,

thank you for your contribution

Just some side notes.

Proxmox VE does not support IPoIB. If you have three nodes, it is better to use a full-mesh network instead of a switched one; five nodes is also practicable. [1]

The Mellanox NICs work in IP mode out of the box and no extra setup is needed.

With IPoIB you lose about 50% of the bandwidth and increase the latency. With a full mesh, you use the full speed of the NIC and cut out the extra latency introduced by the switch.

Also, update the Mellanox NIC firmware to ensure full driver compatibility. [2]

1.) https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server
2.) https://www.mellanox.com/support/firmware/firmware-downloads
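Regarding the firmware note above, a sketch for checking (and, if needed, flashing) the firmware with mstflint, assuming the PCI address reported by lspci is 02:00.0:

apt install -y mstflint
mstflint -d 02:00.0 query
mstflint -d 02:00.0 -i fw-image.bin burn

The query shows the current firmware version and the PSID; only burn an image that matches that exact PSID.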
 
Thank you for your reply.

I did some reading on full mesh, but I wanted to start with 4 nodes and be able to add some more afterwards.

I really liked the version with a switch, and I think the switch itself will not be too much of a handbrake.

The NIC itself only does VPI, since it is a first-generation ConnectX VPI, an MHQH29-XTC.

The reason I wanted to share this is that I got the Voltaire switch for 200€ and the VPI cards for 39€ each. I think that's reasonable, even if I really only get 20 Gbit/s out of it.


But thanks for that reply. I will get some Mellanox cards with Ethernet as well (I got a cheap Ethernet switch, too), set them up in Ethernet mode, and share my benchmarks with you later.
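In the meantime, a simple way to reproduce throughput numbers over the IPoIB addresses is plain iperf3 between two nodes (just a sketch, nothing fancy):

apt install -y iperf3
iperf3 -s
iperf3 -c 10.99.0.1 -P 4 -t 30

Run the -s server on the first node and the -c client with a few parallel streams on the second one, against the first node's cluster IP.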
 
Hi.

I configured the InfiniBand cards as described in this post and they show up in "ip a", but I can't ping my nodes.

All nodes are connected to an InfiniBand switch.
What do I have to configure on the switch in order to make it work?

Thanks!
 
Well, that post is 1.5 years old. Please make a new thread. I am currently starting a new project with VPI cards on PVE 7 (this thread references PVE 6); maybe then I will have some more insights.
 
