[SOLVED] Problem with Mellanox MCX556A-ECAT on PVE 6.3

Sycoriorz

Hi all,

I have set up servers with MCX556A-ECAT cards, but the cards do not work out of the box.
I cannot see them in the PVE network configuration.
When I run the following, I get this response:

root@pve1:/tools# dmesg |grep -e mlx
[ 4.836995] mlx5_core 0000:41:00.0: firmware version: 16.23.1020
[ 4.837033] mlx5_core 0000:41:00.0: 63.008 Gb/s available PCIe bandwidth, limited by 8 GT/s x8 link at 0000:40:01.1 (capable of 126.016 Gb/s with 8 GT/s x16 link)
[ 5.355487] mlx5_core 0000:41:00.0: Port module event: module 0, Cable unplugged
[ 5.369784] mlx5_core 0000:41:00.1: firmware version: 16.23.1020
[ 5.369845] mlx5_core 0000:41:00.1: 63.008 Gb/s available PCIe bandwidth, limited by 8 GT/s x8 link at 0000:40:01.1 (capable of 126.016 Gb/s with 8 GT/s x16 link)
[ 5.894086] mlx5_core 0000:41:00.1: Port module event: module 1, Cable unplugged
[ 5.908352] mlx5_ib: Mellanox Connect-IB Infiniband driver v5.0-0
[ 5.943216] mlx5_core 0000:41:00.0: cmd_work_handler:887:(pid 863): failed to allocate command entry
[ 5.943229] infiniband mlx5_0: reg_mr_callback:104:(pid 863): async reg mr failed. status -11
[ 5.982186] mlx5_core 0000:41:00.1: cmd_work_handler:887:(pid 865): failed to allocate command entry
[ 5.982495] infiniband mlx5_1: reg_mr_callback:104:(pid 865): async reg mr failed. status -11


If I try to start a firmware update, I get this:

./mlxup -online
Querying Mellanox devices firmware ...

Device #1:
----------

  Device Type:      ConnectX5
  Part Number:      N/A
  Description:
  PSID:             HPE0000000009
  PCI Device Name:  0000:41:00.0
  Base GUID:        b88303ffff7c1e30
  Base MAC:         b883037c1e30
  Versions:         Current        Available
     FW             16.23.1020     N/A
     PXE            3.5.0504       N/A
     UEFI           14.16.0017     N/A

  Status:           No matching image found

root@pve1:/tools# lspci -v | grep Mellanox
41:00.0 Infiniband controller: Mellanox Technologies MT27800 Family [ConnectX-5]
41:00.1 Infiniband controller: Mellanox Technologies MT27800 Family [ConnectX-5]

When I try to install Mellanox's own OFED, the installer wants to remove packages that PVE requires:

./mlnxofedinstall --skip-distro-check --without-depcheck --force
Logs dir: /tmp/MLNX_OFED_LINUX.25969.logs
General log file: /tmp/MLNX_OFED_LINUX.25969.logs/general.log

Below is the list of MLNX_OFED_LINUX packages that you have chosen
(some may have been added by the installer due to package dependencies):

ofed-scripts
mlnx-ofed-kernel-utils
mlnx-ofed-kernel-dkms
iser-dkms
isert-dkms
srp-dkms
rdma-core
libibverbs1
ibverbs-utils
ibverbs-providers
libibverbs-dev
libibverbs1-dbg
libibumad3
libibumad-dev
ibacm
librdmacm1
rdmacm-utils
librdmacm-dev
mstflint
ibdump
libibmad5
libibmad-dev
libopensm
opensm
opensm-doc
libopensm-devel
libibnetdisc5
infiniband-diags
mft
kernel-mft-dkms
perftest
ibutils2
ar-mgr
dump-pr
ibsim
ibsim-doc
ucx
sharp
hcoll
knem-dkms
knem
openmpi
mpitests
libdapl2
dapl2-utils
libdapl-dev
dpcp
srptools
mlnx-ethtool
mlnx-iproute2
neohost-backend
neohost-sdk

This program will install the MLNX_OFED_LINUX package on your machine.
Note that all other Mellanox, OEM, OFED, RDMA or Distribution IB packages will be removed.
Those packages are removed due to conflicts with MLNX_OFED_LINUX, do not reinstall them.

Removing old packages...
Failed command: apt-get remove -y libibverbs1 proxmox-ve pve-manager spiceterm qemu-server pve-ha-manager pve-container pve-qemu-kvm libiscsi7 libpve-guest-common-perl libpve-storage-perl glusterfs-client glusterfs-common librdmacm1 ceph-common python-cephfs python-rbd python-rados librbd1 libradosstriper1 librados2-perl libcephfs2 librados2 ceph-fuse

If I run the installer with options that skip those steps, I get this:

root@pve1:/tools/sw# ./mlnxofedinstall --without-dkms --add-kernel-support --kernel 5.4.73-1-pve --without-fw-update --force
Provide path to the kernel sources for 5.4.73-1-pve kernel.
what is the path to kernel source?


Does anyone have an idea what I am doing wrong?
I have reached the limit of my knowledge here.

Many thanks for your help.

Best regards,

Thomas
 
UPDATE

I updated the cards to the newest HP firmware from a Windows PC.

In the end, the problem was that the card's ports were set to IB (InfiniBand) mode.
I changed them to ETH (Ethernet).
Since then, the cards have been visible in PVE as ordinary network cards.
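
For anyone landing here with the same symptom: the lspci output earlier in this post lists both functions as "Infiniband controller", which matches the ports being in IB mode. The post does not say which tool was used to switch the mode. One common way is mlxconfig from the Mellanox Firmware Tools (MFT), so the sketch below assumes that tool and the usual LINK_TYPE_P1/LINK_TYPE_P2 parameters (1 = IB, 2 = ETH). The function names are made up for illustration. The script only prints a suggested command for each card that lspci shows in IB mode and changes nothing itself.

```shell
#!/bin/sh
# Sketch only: prints suggested commands, does not touch the hardware.

# Read lspci lines on stdin and print the PCI address of function 0
# of every Mellanox device listed as "Infiniband controller".
ib_devices() {
    awk '/Infiniband controller: Mellanox/ && $1 ~ /\.0$/ { print $1 }'
}

# Print the mlxconfig call that would set both ports to Ethernet.
# Assumption: mlxconfig (MFT) is the tool in use; 1 = IB, 2 = ETH.
eth_command() {
    printf 'mlxconfig -d %s set LINK_TYPE_P1=2 LINK_TYPE_P2=2\n' "$1"
}

# Turn lspci lines on stdin into one suggested command per IB-mode card.
suggest() {
    ib_devices | while read -r addr; do
        eth_command "$addr"
    done
}

# Example using the two lspci lines from this post.
suggest <<'EOF'
41:00.0 Infiniband controller: Mellanox Technologies MT27800 Family [ConnectX-5]
41:00.1 Infiniband controller: Mellanox Technologies MT27800 Family [ConnectX-5]
EOF
```

On a real host the input would come from `lspci | suggest` instead of the embedded example. A new link type normally takes effect only after a reboot or a firmware reset, and on an OEM card like this one (PSID HPE...) it is worth checking the current values with `mlxconfig -d 41:00.0 query` before setting anything.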
 
Thanks for reporting your solution and marking the thread as 'SOLVED'.