Hello Fellas,
There are many, many posts from people trying to get cheap Mellanox ConnectX VPI 40 Gb/s cards working on Proxmox, and I had a really hard time getting mine to work, so I want to share what I learned and hopefully make some people's day.
First, my setup: I am running a cluster of 4 Supermicro X9 SuperServers with 2 E5 CPUs each. There are 4 Micron 5200 MAX SSDs attached to an HBA for Ceph. And yeah, it is working great so far.
Please note: I am using a Voltaire VLT-4036 InfiniBand switch. Therefore I did not have to configure opensm on one or more nodes of the cluster; I configured the subnet manager on the switch itself.
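Once the drivers are loaded (see further down), a quick way to see whether the subnet manager has actually picked up a port is to look at the port state in sysfs: a port normally only goes to ACTIVE after a subnet manager has configured it. The following is just a sketch under that assumption; the helper name `ib_port_states` is made up for this post, and the optional directory argument only exists so it can be tried against a fake tree.

```shell
#!/bin/sh
# ib_port_states [IB_SYSFS_DIR]: hypothetical helper, prints
# "<device> port <n>: <state>" for every InfiniBand port found in sysfs.
ib_port_states() {
    ib_dir="${1:-/sys/class/infiniband}"
    for state_file in "$ib_dir"/*/ports/*/state; do
        # Skip the unexpanded glob when no IB device exists.
        [ -r "$state_file" ] || continue
        port_dir=$(dirname "$state_file")
        dev=$(basename "$(dirname "$(dirname "$port_dir")")")
        echo "$dev port $(basename "$port_dir"): $(cat "$state_file")"
    done
}

# Prints nothing on a machine without InfiniBand devices.
ib_port_states
```

A port that stays in INIT usually means no subnet manager is reachable on that link.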
So what did I do?
First I downloaded and dd-ed the Proxmox ISO.
I did a normal Proxmox install on a single SSD with XFS. I don't think this is relevant, but I am mentioning it just to be sure.
I ran
lspci | grep -i Mell
to be sure the Mellanox cards are visible, and it showed me my card.
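If you want to run the same check on every node from a script, something like this could do. It is only a sketch: `find_mellanox` is a made-up helper name, and it simply filters whatever `lspci` prints.

```shell
#!/bin/sh
# find_mellanox: hypothetical helper, filters lspci-style output on stdin
# for Mellanox devices (case-insensitive).
find_mellanox() {
    grep -i 'mellanox'
}

# Only call lspci if it is installed (it comes with the pciutils package).
if command -v lspci >/dev/null 2>&1; then
    lspci | find_mellanox || echo "no Mellanox card found" >&2
fi
```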
I am running my setup behind an HTTP proxy (Squid), so the next steps aren't necessarily needed for you. NTP is not passed through either. Because of that I put them in a spoiler.
nano /etc/environment
#HTTP:
http_proxy=http://10.100.3.210:8080
#HTTPS:
https_proxy=https://10.100.3.210:8080
#FTP:
ftp_proxy=ftp://10.100.3.210:2121
no_proxy=localhost,*.contonso.local,yourlocalsubnet/Your Subnetmask
soap_use_proxy=on
But apt will not respect that, so we have to configure apt itself
nano /etc/apt/apt.conf.d/70debconf
Acquire::http
{
Proxy "http://Proxy-IP:Proxy-Port";
}
And I don't let NTP through for security reasons, so my firewall acts as the NTP server
nano /etc/systemd/timesyncd.conf
NTP=<NTP server IP, e.g. your local firewall>
systemctl restart systemd-timesyncd
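To double-check that apt really picked up the proxy, you can look at what `apt-config dump` reports. The helper below is a sketch only: `apt_http_proxy` is a made-up name, and it assumes the dump prints the setting as `Acquire::http::Proxy "...";` on one line.

```shell
#!/bin/sh
# apt_http_proxy: hypothetical helper, reads `apt-config dump` output on stdin
# and prints the configured HTTP proxy URL (nothing if none is set).
apt_http_proxy() {
    sed -n 's/^Acquire::http::Proxy "\(.*\)";$/\1/p'
}

# Only run against the real configuration if apt-config is available.
if command -v apt-config >/dev/null 2>&1; then
    apt-config dump | apt_http_proxy
fi
```

If this prints nothing, apt is not using a proxy and the file from the step above is worth another look.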
I added the Proxmox repo, pve-no-subscription, for test purposes only
nano /etc/apt/sources.list
# NOT recommended for production use
deb http://download.proxmox.com/debian/pve buster pve-no-subscription
Installing updates is pretty generic, but I also install pve-headers, reboot, and after startup load the IP over InfiniBand kernel module
apt update
apt dist-upgrade -y
apt install -y pve-headers
reboot
modprobe ib_ipoib
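To confirm the module really got loaded, you can look for it in `/proc/modules` (that is the same list `lsmod` shows). This is a sketch; `module_loaded` is a made-up helper name, and it would not see a driver that is built into the kernel instead of being loaded as a module.

```shell
#!/bin/sh
# module_loaded NAME: hypothetical helper, succeeds if NAME is listed in
# /proc/modules-style input on stdin (first column is the module name).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

if [ -r /proc/modules ]; then
    if module_loaded ib_ipoib < /proc/modules; then
        echo "ib_ipoib is loaded"
    else
        echo "ib_ipoib is NOT loaded, try: modprobe ib_ipoib" >&2
    fi
fi
```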
Then I have to add the network interfaces manually, because Proxmox doesn't recognize them automatically.
I need the interface names as Linux sees them for this
Looking up interface names
ip a
The list shows
ibp2s0
ibp2s0d1
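If you don't want to pick the names out of the `ip` output by eye, the IPoIB interfaces can also be found through sysfs, because their link type is 32 (ARPHRD_INFINIBAND). A small sketch, with `list_ib_interfaces` being a name I made up and the directory argument only there for testing:

```shell
#!/bin/sh
# list_ib_interfaces [SYSFS_NET_DIR]: hypothetical helper, prints every
# network interface whose link type is 32 (ARPHRD_INFINIBAND).
list_ib_interfaces() {
    net_dir="${1:-/sys/class/net}"
    for dev in "$net_dir"/*; do
        [ -r "$dev/type" ] || continue
        if [ "$(cat "$dev/type")" = "32" ]; then
            basename "$dev"
        fi
    done
}

# On my nodes this would be expected to list ibp2s0 and ibp2s0d1.
list_ib_interfaces
```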
adding them to the Interface List
nano /etc/network/interfaces
auto ibp2s0
iface ibp2s0 inet static
# Ceph cluster network IP
address 10.99.0.1
netmask 255.255.255.0
pre-up modprobe ib_ipoib
pre-up echo connected > /sys/class/net/ibp2s0/mode
mtu 65520
auto ibp2s0d1
iface ibp2s0d1 inet static
# Ceph public network IP
address 10.98.0.1
netmask 255.255.255.0
pre-up modprobe ib_ipoib
pre-up echo connected > /sys/class/net/ibp2s0d1/mode
mtu 65520
bringing up the adapters
ifup ibp2s0
ifup ibp2s0d1
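After bringing the adapters up it is worth checking that connected mode and the big MTU really stuck, since both are set by the pre-up lines above. A sketch of such a check, assuming the interface names from my hosts; `check_ipoib` is a made-up helper and the second argument only exists so it can be pointed at a fake sysfs tree:

```shell
#!/bin/sh
# check_ipoib IFACE [SYSFS_NET_DIR]: hypothetical helper, reports whether
# IFACE is in connected mode with an MTU of 65520.
check_ipoib() {
    iface="$1"
    net_dir="${2:-/sys/class/net}"
    mode=$(cat "$net_dir/$iface/mode" 2>/dev/null || true)
    mtu=$(cat "$net_dir/$iface/mtu" 2>/dev/null || true)
    if [ "$mode" = "connected" ] && [ "$mtu" = "65520" ]; then
        echo "$iface: ok (mode=$mode mtu=$mtu)"
    else
        echo "$iface: check config (mode=${mode:-?} mtu=${mtu:-?})"
        return 1
    fi
}

# Interface names as they appear on my nodes; adjust to yours.
for i in ibp2s0 ibp2s0d1; do
    check_ipoib "$i" || true
done
```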
To my surprise the interfaces were named exactly the same on all hosts, so this guide applies to every node in my case.
Now the adapters show up with connection type "unknown" in the GUI.
The adapters do survive reboots.
My first test was with pings. Well, don't do that.
Just stubbornly configure Ceph.
For me it worked that way.
Well, that ends my little TED talk.
I hope my verbose explanations help some Proxmox beginners like me.