Hello Everyone,
I'd just like to post a general query about implementing RoCE (RDMA over Converged Ethernet) on Proxmox VE.
I have a 4-node cluster running Proxmox VE 9 with Ceph enabled, and each node has 5 SSD drives.
There are two networks: one for general traffic with breakout (named KVM) and another dedicated to storage traffic (named STO).
The Ceph configuration follows the public/cluster network split, so the KVM network is the public network and the STO network is the dedicated backend (cluster) network.
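For reference, the relevant part of the ceph.conf looks roughly like this (the subnets below are placeholders, not my real ranges):

    [global]
        # public network: monitor/client traffic, carried on the KVM bond
        public_network  = 10.10.10.0/24
        # cluster network: OSD replication and heartbeat traffic, carried on the STO bond
        cluster_network = 10.10.20.0/24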
Quite a few VMs run on the Ceph-backed storage pool; I've tested all the migration functionality and it works great!
I then endeavoured to make Ceph RoCE-aware, since I've read that this can further improve performance for migration tasks.
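From what I've gathered so far (not applied yet), making Ceph RoCE-aware boils down to switching the async messenger to the RDMA transport in ceph.conf, something along these lines; the device name bnxt_re0 is just my assumption of what the Broadcom RoCE function would show up as:

    [global]
        # use RDMA only for the backend (cluster) traffic, keep TCP on the public side
        ms_cluster_type           = async+rdma
        ms_async_rdma_device_name = bnxt_re0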
Each node has 2x Broadcom BCM57414 NetXtreme-E 10Gb/25Gb RDMA Ethernet Controllers (rev 01), so 4 ports in total, and these are cross-bonded:
bond0 (KVM Network):
NIC A Port 1 + NIC B Port 1
And so on for the other bond; a rough sketch of the bond definition follows below.
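For completeness, this is roughly how one bond is defined in /etc/network/interfaces (interface names and bond mode here are examples rather than my exact values):

    auto bond0
    iface bond0 inet manual
        # NIC A port 1 + NIC B port 1 (placeholder interface names)
        bond-slaves enp65s0f0np0 enp66s0f0np0
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer3+4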
My understanding was that a special driver needs to be in place in order to use the RDMA functionality on these cards, so I decided to sort that out before even making Ceph RoCE-aware.
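In hindsight, before reaching for a vendor driver I could have checked whether the in-kernel bnxt_en/bnxt_re stack already exposes RDMA, with something like this (tools from the ibverbs-utils and iproute2 packages):

    # bnxt_en is the base Ethernet driver, bnxt_re provides the RoCE function
    modprobe bnxt_re
    # list RDMA-capable devices and their link state
    ibv_devices
    rdma link show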
I obtained a driver package from Broadcom and ran the installation script it provides (install.sh).
Unfortunately, this basically took out the Proxmox node. I eventually had to remove the drivers and switch the kernel modules back to what they were before in order to get it back online. The installation script does warn that the OS environment is not supported, but I figured the backend is Debian and that's supported, so what could go wrong? Oh boy, was I mistaken.
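For anyone hitting the same thing, the gist of the recovery was roughly the following; exact paths and module names depend on what install.sh left behind, so treat this as a sketch rather than an exact transcript:

    # locate any bnxt modules the installer dropped outside the stock kernel tree
    find /lib/modules/$(uname -r) -name 'bnxt_*.ko*'
    # unload the vendor modules
    rmmod bnxt_re bnxt_en
    # after deleting the vendor copies (wherever install.sh placed them), rebuild deps
    depmod -a
    # reload the in-kernel driver shipped with the Proxmox kernel
    modprobe bnxt_en
    update-initramfs -u -k all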
I used these links for the driver install:
https://techdocs.broadcom.com/us/en...ce-drivers-using-the-automated-installer.html
https://www.broadcom.com/products/ethernet-connectivity/network-adapters/bcm57414-50g-ic
Just an open-ended question about all of this: is this actually supported in Proxmox?
And if I got it to work, would it persist across future node upgrades?