Hardware Support

CTCcloud

We have been grappling for months with moving our hyperconverged Ceph cluster from 10Gb copper to 25Gb optical fiber, and so far we have been entirely unsuccessful. Is there a hardware compatibility list that shows whether the 25Gb network cards we are trying to use are compatible with Proxmox 6.x?

Is it possible the cards we are using will be compatible with Proxmox 7.x and its newer kernel?

The cards we are using for 25Gb are as follows:

Nvidia chipset based - Mellanox Connect X4 25Gb MCX-4121A-ACAT

These have not worked for us at all so far, and we have found a bug in the Proxmox networking stack with them: when using ifupdown2, the assigned IP addresses are not removed from the previous 10Gb interface, so the same addresses end up on both the 10Gb and 25Gb interfaces after clicking "Apply configuration" in the Proxmox GUI.
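One way to confirm the duplicate-address symptom described above is to list every IPv4 address that is currently assigned to more than one interface. This is only a minimal sketch: the function name find_dup_addrs is made up for illustration, and it assumes the one-line-per-address output of `ip -o -4 addr show`, where the second field is the interface name and the fourth is the address with its prefix.

```shell
# Hypothetical helper (name is illustrative, not part of Proxmox or iproute2).
# Reads `ip -o -4 addr show` output on stdin and prints each address that
# appears on more than one interface, followed by those interface names.
find_dup_addrs() {
    awk '{
        addr = $4                      # e.g. 10.10.10.11/24
        count[addr]++
        ifaces[addr] = (ifaces[addr] == "" ? $2 : ifaces[addr] " " $2)
    }
    END {
        for (addr in count)
            if (count[addr] > 1) print addr, ifaces[addr]
    }'
}

# Usage on a node (not executed here):
#   ip -o -4 addr show | find_dup_addrs
```

If this prints nothing after "Apply configuration", no IPv4 address is shared between interfaces. Any line it does print names the address and the interfaces that both still carry it.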

We also purchased Intel 25Gb fiber NICs to test, and those seem to work, but we would like to know whether they are officially proven to work on Proxmox 6.x or 7.x, or whether both versions support these NICs well.

The Intel NICs are as follows:

Intel XXV710 Dual Port 25GbE SFP28/SFP+ PCIe Adapter

We were able to install these in 2 nodes and copy a VM backup between them over 25Gb as a test. This was over SCP, so it wasn't particularly fast, but it worked nevertheless.
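Since an SCP copy says little about the link itself, it may help to check what speed the port actually negotiated. The sketch below pulls the "Speed:" line out of `ethtool <interface>` output; the function name link_speed and the interface name ens1f0 in the usage comment are placeholders, not names taken from our setup.

```shell
# Hypothetical helper: reads `ethtool <interface>` output on stdin and
# prints only the negotiated link speed (the value after "Speed:").
link_speed() {
    awk '$1 == "Speed:" { print $2 }'
}

# Usage on a node (interface name is a placeholder, not executed here):
#   ethtool ens1f0 | link_speed
```

A port that has linked at 25Gb should report 25000Mb/s here. A lower value would point at the transceiver, cable, or switch port rather than at SCP.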

Can someone from Proxmox please confirm whether one of these models is indeed well supported? Also, is there something extra that needs to be done to make the Mellanox cards work properly, or is it that we CAN use Mellanox but a different model is supported instead of the one mentioned above?

Thanks in advance,

CTC
 
Mellanox cards usually work well with Linux. If there is some issue with a newer kernel, the first step you should try is to update the firmware of the NICs.
This is very easy with the mlxup tool provided by Mellanox. You can download it from their website (choose the Linux -> x64 variant): https://www.mellanox.com/support/firmware/mlxup-mft

Make the file executable: chmod +x mlxup and then run it with ./mlxup. It will automatically detect any Mellanox NIC, check the firmware version installed and prompt you if there is a newer one available to download and flash.
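The two steps above can be wrapped in a small function, sketched here under a few assumptions: the name run_mlxup and the directory argument are made up for illustration, the downloaded file is assumed to be called mlxup, and the tool will most likely need to be run as root to reach the NICs.

```shell
# Hypothetical wrapper around the two steps described above.
# $1 is the directory the mlxup binary was downloaded to (defaults to ".").
run_mlxup() {
    dir="${1:-.}"
    if [ ! -f "$dir/mlxup" ]; then
        echo "mlxup not found in $dir" >&2
        return 1
    fi
    # Make the download executable, then start it with no extra options,
    # exactly as in the manual steps.
    chmod +x "$dir/mlxup" && "$dir/mlxup"
}
```

Calling `run_mlxup` in the download directory does the same as typing the two commands by hand. It just refuses to continue if the file is not where it is expected.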

While not exactly the same model, we do have the following ones in use here ourselves:

Connect X4:
MCX456A-ECA_Ax (100Gbit)
MCX4421A-ACQ_Ax (25Gbit)

Connect X5:
MCX556A-ECA_Ax (100Gbit)

These are the part numbers as reported by the mlxup CLI utility.
 
That's not exactly what I asked, but thank you Aaron

We had a network pro test the Mellanox cards on Debian Linux and they worked well .. once installed in Proxmox, they did not. That doesn't sound like a need for a firmware upgrade. We also know that Proxmox switched to using the Ubuntu kernel years back although the base system is Debian. Is it possible there is an issue with these Mellanox cards and the Ubuntu kernels?

Bear in mind as I explained earlier, when using the Intel based NICs things worked instantly ...

As I said we've been working on switching over to 25Gb for months now and this has turned into a nightmare. Obviously, we can't switch over to a networking technology that is not entirely stable because this cluster has been in production for years and has grown significantly so we can't have things just break and have the customers left hanging for obvious reasons. We need to be sure that the hardware we use is well supported on Proxmox not just Linux in general.

Many distributions compile their kernels with different options and different drivers both as separate modules and as part of the main kernel compiled binary. Since this is the case, you get different behavior with different distributions. I have personally used more than a dozen different Linux distros and can tell you that hardware is definitely not treated the same on each .. whether properly recognized or not, properly installed or not, or properly configured or not.

What we are looking for is some confirmation that the cards we've chosen are "known good" or not. I will try out the firmware update utility you mentioned as that may still prove useful. ANY extra guidance here would be very well received and MUCH appreciated.

Thanks in advance
 
Hello,
I think Aaron is right.
I often have problems on the network, especially at 10GBit or more, when firmware and drivers do not match.
I suspect the Debian test was with an older kernel and probably an older driver.
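To compare the Debian test machine with a Proxmox node, it can help to write down the driver, driver version, and firmware version on each. A minimal sketch that summarizes `ethtool -i <interface>` output follows; the function name nic_driver_info and the interface name in the usage comment are placeholders.

```shell
# Hypothetical helper: reads `ethtool -i <interface>` output on stdin and
# prints driver, driver version, and firmware version on one line.
nic_driver_info() {
    awk -F': ' '
        $1 == "driver"           { driver = $2 }
        $1 == "version"          { version = $2 }
        $1 == "firmware-version" { fw = $2 }
        END { printf "driver=%s version=%s firmware=%s\n", driver, version, fw }
    '
}

# Usage on each machine (interface name is a placeholder, not executed here):
#   uname -r
#   ethtool -i ens1f0 | nic_driver_info
```

Running this on both systems, before and after the firmware update, shows whether the kernel, driver, or firmware differs between the setup that worked and the one that did not.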

In my experience, the Mellanox cards are the best at 25GBit or faster and run the most stable. Please test the firmware update.

greetings
Falk
 
Please don't get me wrong .. I'm not trying to say Aaron is wrong .. just trying to get confirmation of solid hardware compatibility

We've spent months and countless hours trying to get Ceph switched over from 10Gb copper to 25Gb fiber to no avail ... so I was trying to make 100% sure that the Mellanox cards we are using are 100% compatible and that we aren't chasing our tails on something that is likely never going to be what we expect

Thanks for the input guys
 
By the way, I did indeed do the firmware update on all cards installed .. on one server we removed the Mellanox card and installed an Intel card for testing purposes but otherwise, all other servers have had the Mellanox firmware updated and the server was rebooted just yesterday 8/17/2021

The Mellanox cards do seem to be working at this point .. I was able to "SCP" a large file from one server to another over the 25Gb network
 
