I recently purchased two Mellanox CX354A cards to install in my two-node Proxmox cluster. The first one worked fine: after I installed it and attached it to my switch, the port showed up in network devices. However, the second card, in the other server, does not appear in network devices. I tried installing the MST tools and manually switching the port to Ethernet mode, but mlxconfig cannot query the card, as shown below.
root@pve1:~# mlxconfig -d /dev/mst/mt4099_pciconf0 q
-E- Failed to open device: /dev/mst/mt4099_pciconf0. Cannot perform operation, Driver might be down.
I did some digging and I believe the issue is that the card does not have a kernel driver bound to it: the lspci output below lists mlx4_core under "Kernel modules" but has no "Kernel driver in use" line.
root@pve1:~# lspci -k
04:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
Subsystem: Mellanox Technologies MT27500 Family [ConnectX-3]
Kernel modules: mlx4_core
05:00.0 VGA compatible controller:
This is the output of the same command on the server with the working card.
root@pve0:~# lspci -k
04:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
Subsystem: Mellanox Technologies MT27500 Family [ConnectX-3]
Kernel driver in use: mlx4_core
Kernel modules: mlx4_core
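To make the comparison between the two servers easier to repeat, here is a minimal sketch of a helper that reads `lspci -k` output and prints the driver bound to one PCI slot, or `none` if there is no "Kernel driver in use" line. The function name `driver_in_use` is made up for illustration, and the slot address 04:00.0 is simply the one from my output above.

```shell
#!/bin/sh
# Hypothetical helper: read `lspci -k` text on stdin and print the kernel
# driver bound to the given PCI slot, or "none" if no driver is in use.
driver_in_use() {
    awk -v slot="$1" '
        # A line that starts with a bus address begins a new device entry.
        /^[0-9a-f]+:[0-9a-f]+[:.]/ { in_dev = (index($0, slot) == 1) }
        # Inside the requested device, pick up the bound driver name.
        in_dev && /Kernel driver in use:/ {
            sub(/.*Kernel driver in use:[ \t]*/, "")
            print
            found = 1
        }
        END { if (!found) print "none" }
    '
}

# Usage (run as root on each node):
#   lspci -k | driver_in_use 04:00.0
```

Based on the outputs above, this should print `none` on pve1 and `mlx4_core` on pve0. If the module is listed but not bound, `dmesg | grep -i mlx4` is probably the next place to look for a probe error, though I have not confirmed that it shows anything useful here.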
This is the output of mlxup, which is identical for the working and non-working cards.
root@pve1:~# ./mlxup
Querying Mellanox devices firmware ...
Device #1:
----------
Device Type: ConnectX3
Part Number: MCX354A-FCB_A2-A5
Description: ConnectX-3 VPI adapter card; dual-port QSFP; FDR IB (56Gb/s) and 40GigE; PCIe3.0 x8 8GT/s; RoHS R6
PSID: MT_1090120019
PCI Device Name: 0000:04:00.0
Port1 MAC: 0010e088fdb5
Port2 MAC: 0010e088fdb6
Versions:    Current      Available
   FW        2.42.5000    2.42.5000
   PXE       3.4.0752     3.4.0752
Status: Up to date
Does anyone know whether the missing kernel driver is the cause of the problem and, if so, how to fix it?