Hey all!
I've been looking for a step-by-step guide to configuring RoCE on the latest Proxmox. I am using ConnectX-3 dual-port cards and an interconnect cable that supports both InfiniBand and Ethernet natively, with no switch involved. I have it set up for Ethernet on both ends, and both 40-gig connections work fine, but I want to take advantage of RDMA's ability to bypass the CPU for my shares between nodes. These shares will only be used for files (lots of media), not for virtual instances. Once this is working, I'll test both NFS 4.2 and SMB to see what works best on my rigs.
My Google-fu hasn't turned up much. I did get the instructions below from ChatGPT4, but I don't trust AI to be 100% correct yet. I'm looking for a similar but confirmed step-by-step. After hours of looking at 50 different sites, I found 50 different answers. I'd really appreciate an assist so I don't mess up my configs any further by trying inaccurate how-tos.
From ChatGPT4:
//-------
RoCE (RDMA over Converged Ethernet) allows for efficient communication between nodes in a network using Remote Direct Memory Access (RDMA) over Ethernet. In order to set up RoCE on Proxmox 7.4 using ConnectX-3 cards without a switch, you will need to follow these steps:
1. Install the necessary drivers:
- ConnectX-3 cards require the `mlx4_en` driver to be installed. You can install it using the following command:
```
apt-get install -y linux-modules-extra-$(uname -r) mlx4_en
```
- Verify that the driver has been loaded using the following command:
```
lsmod | grep mlx4_en
```
2. Configure the network interface:
- Identify the network interface that corresponds to the ConnectX-3 card using the following command:
```
ip a
```
- Edit the network interface configuration file `/etc/network/interfaces` and add the following lines to configure the interface:
```
auto enp6s0f0
iface enp6s0f0 inet static
address <IP_ADDRESS>
netmask <NETMASK>
```
Replace `enp6s0f0` with the name of the network interface identified in the previous step, and replace `<IP_ADDRESS>` and `<NETMASK>` with the desired IP address and netmask for the interface.
- Restart the networking service using the following command:
```
systemctl restart networking
```
3. Configure RoCE:
- Edit the RoCE configuration file `/etc/modprobe.d/mlx4.conf` and add the following line:
```
options mlx4_core roce_mode=1
```
- Restart the `rdma` service using the following command:
```
systemctl restart rdma
```
4. Verify the RoCE connection:
- Use the `ibstat` command to verify that the ConnectX-3 card is connected to the RoCE network:
```
ibstat
```
The output should show a RoCE interface with a valid state and link layer.
- Use the `ibping` command to verify that the RoCE connection is working:
```
ibping -c 10 <IP_ADDRESS>
```
Replace `<IP_ADDRESS>` with the IP address of another node in the RoCE network. The output should show successful pings with low latency and jitter.
By following these steps, you should be able to set up RoCE on Proxmox 7.4 using ConnectX-3 cards without a switch.
------\\
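For anyone wanting to sanity-check step 4 independently of the tools named above, here is a minimal sketch that reads the link layer of each RDMA port straight from sysfs. It assumes the kernel exposes the cards under `/sys/class/infiniband/<device>/ports/<port>/link_layer`, which is the standard location once an RDMA driver such as `mlx4_ib` is loaded. The function name `list_rdma_link_layers` and its optional root-directory argument are made up for illustration, not part of any package.

```shell
#!/bin/sh
# Hypothetical helper: print "<device> port <n>: <link layer>" for every
# RDMA port found under the given sysfs root (default: the real kernel path).
# Returns non-zero when no RDMA ports are found at all.
list_rdma_link_layers() {
    root="${1:-/sys/class/infiniband}"
    found=0
    for f in "$root"/*/ports/*/link_layer; do
        # If nothing matched, the glob stays literal and is not readable.
        [ -r "$f" ] || continue
        port_dir=$(dirname "$f")                      # .../<dev>/ports/<n>
        port=$(basename "$port_dir")                  # <n>
        dev=$(basename "$(dirname "$(dirname "$f")")")
        dev=$(basename "$(dirname "$(dirname "$port_dir")")")  # <dev>
        printf '%s port %s: %s\n' "$dev" "$port" "$(cat "$f")"
        found=1
    done
    [ "$found" -eq 1 ]
}
```

Calling `list_rdma_link_layers` with no argument on each node should, if RoCE is available, list the ConnectX-3 ports with a link layer of `Ethernet` rather than `InfiniBand`. As far as I can tell, `ibping` addresses peers by LID or GUID rather than by IP, so it may not be the right test over RoCE; `rping` (in Debian's `rdmacm-utils` package, I believe) takes an IP address with `-a` and looks like a better fit, though I haven't confirmed that on Proxmox.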
Thanks in advance for the help!
Keith