Connect-x 3 ROCe Configuration

sirebral

Member
Feb 12, 2022
49
10
13
Oregon, USA
Hey all!

I've been looking for a step by step to configure RoCE on the latest Proxmox. I am using Connectx-3 dual port cards and an interconnect cable that supports both InfiniBand and Ethernet natively, no switch involved. Likewise, I have is set up for Ethernet on both ends, and it works fine for both 40 gig connections, yet I want to take advantage of the capability to skip the CPU for my shares between nodes. These shares will only be used for files, not for virtual instances, lots of media. I'll be testing out both NFS 4.2 and SMB after this is functional to see what works best on my rigs.

My Google-foo hasn't turned up much, I did get the instructions below from ChatGPT4, yet I tend not to trust AI quite yet to be 100% correct. I'm looking for a similar, yet confirmed step-by-step. After looking for hours at 50 different sites, I found 50 different answers. I'd really appreciate an assist, so I'm not further messing up my configs trying inaccurate how-to's.

From ChatGPT4:

//-------

RoCE (RDMA over Converged Ethernet) allows for efficient communication between nodes in a network using Remote Direct Memory Access (RDMA) over Ethernet. In order to set up RoCE on Proxmox 7.4 using ConnectX-3 cards without a switch, you will need to follow these steps:

1. Install the necessary drivers:
- ConnectX-3 cards require the `mlx4_en` driver to be installed. You can install it using the following command:

```
apt-get install -y linux-modules-extra-$(uname -r) mlx4_en
```

- Verify that the driver has been loaded using the following command:

```
lsmod | grep mlx4_en
```

2. Configure the network interface:
- Identify the network interface that corresponds to the ConnectX-3 card using the following command:

```
ip a
```

- Edit the network interface configuration file `/etc/network/interfaces` and add the following lines to configure the interface:

```
auto enp6s0f0
iface enp6s0f0 inet static
address <IP_ADDRESS>
netmask <NETMASK>
```

Replace `enp6s0f0` with the name of the network interface identified in the previous step, and replace `<IP_ADDRESS>` and `<NETMASK>` with the desired IP address and netmask for the interface.

- Restart the networking service using the following command:

```
systemctl restart networking
```

3. Configure RoCE:
- Edit the RoCE configuration file `/etc/modprobe.d/mlx4.conf` and add the following line:

```
options mlx4_core roce_mode=1
```

- Restart the `rdma` service using the following command:

```
systemctl restart rdma
```

4. Verify the RoCE connection:
- Use the `ibstat` command to verify that the ConnectX-3 card is connected to the RoCE network:

```
ibstat
```

The output should show a RoCE interface with a valid state and link layer.

- Use the `ibping` command to verify that the RoCE connection is working:

```
ibping -c 10 <IP_ADDRESS>
```

Replace `<IP_ADDRESS>` with the IP address of another node in the RoCE network. The output should show successful pings with low latency and jitter.

By following these steps, you should be able to set up RoCE on Proxmox 7.4 using ConnectX-3 cards without a switch.

------\\

Thanks in advance for the help!

Keith
 
I hadn't tried this one, yet it is missing steps, hence why I asked for help. I played with ib today with little success, going back to RoCE, much more in my wheelhouse.
 
There's an issue with RoCE, it's not an option for the Connectx-3 cards. NVIDIA has stopped including it in their newer builds. The latest build that does support it, referred to as LTS, supports Ubuntu 20.04 and won't install. They promised a refresh, yet it's about a year late. So, if you're looking at this card note it will work well as a cheap interconnect at 40 gig, yet you're not (at the moment) going to get the advantage of RDMA/RoCE.
 
There's an issue with RoCE, it's not an option for the Connectx-3 cards. NVIDIA has stopped including it in their newer builds. The latest build that does support it, referred to as LTS, supports Ubuntu 20.04 and won't install. They promised a refresh, yet it's about a year late. So, if you're looking at this card note it will work well as a cheap interconnect at 40 gig, yet you're not (at the moment) going to get the advantage of RDMA/RoCE.

Hi, How about Connectx-3 PRO cards? I take that is still O.K?
As far as I know the standard ones could do RoCE and PRO support RoCEv2.
What are the options? As I have the PRO and at the moment looking to start using ROCE and NVMe/RDMA or TPC.
Thanks.
 

About

The Proxmox community has been around for many years and offers help and support for Proxmox VE, Proxmox Backup Server, and Proxmox Mail Gateway.
We think our community is one of the best thanks to people like you!

Get your subscription!

The Proxmox team works very hard to make sure you are running the best software and getting stable updates and security enhancements, as well as quick enterprise support. Tens of thousands of happy customers have a Proxmox subscription. Get yours easily in our online shop.

Buy now!