Ceph BlueStore over RDMA performance gain

elurex

I want to share the following test results with you.

A 4-node PVE cluster with 3 Ceph BlueStore nodes, 36 OSDs in total.
  1. OSD: ST6000NM0034
  2. block.db & block.wal device: Samsung SM961 512GB
  3. NIC: Mellanox ConnectX-3 VPI dual-port 40 Gbps
  4. Switch: Mellanox SX6036T
  5. Network: IPoIB, with separate public and cluster networks
This shows that Ceph over RDMA is successfully enabled:
34278654_10215916886545588_4995711400983658496_n.jpg


Ceph over RDMA - rados bench -p rbd 60 write -b 4M -t 16
34135443_10215917444279531_1554597879899750400_o.jpg

2454.72 MB/s

Standard TCP/IP - rados bench -p rbd 60 write -b 4M -t 16
34276562_10215917451039700_6870683449976422400_o.jpg

2053.9 MB/s

Total performance gain is about 20% (2454.72 MB/s vs. 2053.9 MB/s)

Total pool performance with 4 tests running - rados bench -p rbd 60 write -b 4M -t 16
upload_2018-6-2_20-43-41.png
4856.72 MB/s
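
For anyone who wants to redo the arithmetic, here is a minimal sketch that computes the relative gain from the two single-client rados bench figures quoted above. The function name gain_percent is only illustrative and not part of any tool.

```shell
#!/bin/sh
# Relative gain of a new throughput figure over a baseline, in percent.
gain_percent() {
    # $1 = baseline in MB/s, $2 = new figure in MB/s
    awk -v old="$1" -v cur="$2" 'BEGIN { printf "%.1f\n", (cur - old) / old * 100 }'
}

# Standard TCP/IP result vs. Ceph over RDMA result from the post above
gain_percent 2053.9 2454.72
```

With the numbers from this post, the result comes out just under 20%.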
 
Amazing!

Can you please share the detailed configuration steps for PVE to accomplish this?
I failed to get this running!

Regards, Gerhard
 
Gerhard,

I thought you were the first one to do so. Here are my steps:
  1. Download the Mellanox driver for Debian 9.1. You must use Mellanox's mlnx_add_kernel_support.sh to rebuild it for Debian 9.4. Then go to the DEBS directory, install everything with dpkg -i, and run apt --fix-broken install afterwards.
  2. Follow https://community.mellanox.com/docs/DOC-2721
  3. Skip #1~#8 of that guide (but make sure to use udaddy to verify that RDMA is running between nodes), then stop all Ceph services: systemctl stop ceph\*.service ceph\*.target
  4. Do #9, but skip ms_async_rdma_local_gid.
  5. Skip #10, because ceph.conf is part of the corosync-synchronized cluster configuration, so you can't have a unique ms_async_rdma_local_gid value for each node.
  6. Do #11, but skip ceph-radosgw@.service and ceph-mds@.service.
  7. Repeat #11 on all nodes.
Without the ms_async_rdma_local_gid setting, I found that performance on PVE nodes that also run Ceph OSDs only improves by about 20%. On a pure PVE node without any local Ceph OSD, the improvement is about 25%. So to run Ceph OSDs and PVE on the same server in hyper-converged mode, ms_async_rdma_local_gid is important. (Without it, Ceph over RDMA still copies local data to the local server over the network. With it, the copy could be done within the kernel. Can you imagine how much of a performance boost that would be?)

The Proxmox team will need to figure out how each PVE node can have its own ceph.conf settings to take full advantage of Ceph over RDMA. Moreover, it is also good to separate Ceph monitor nodes and Ceph OSD nodes. (Currently I have removed the symlink created by ln -s /etc/pve/ceph.conf /etc/ceph/ceph.conf.)
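
As a rough illustration of the ceph.conf change in step 4 (with ms_async_rdma_local_gid left out), here is a small sketch that prints a [global] fragment. The helper name rdma_conf_fragment is made up for this example, and the device name mlx4_0 is an assumption taken from the ibv_devinfo output elsewhere in this thread. ms_type and ms_async_rdma_device_name are Ceph messenger options, but please check the exact set of options against the Mellanox guide for your Ceph release.

```shell
#!/bin/sh
# Print a ceph.conf [global] fragment that switches the messenger to RDMA.
# $1 = RDMA device name as reported by ibv_devinfo (assumed: mlx4_0)
rdma_conf_fragment() {
    cat <<EOF
[global]
ms_type = async+rdma
ms_async_rdma_device_name = $1
EOF
}

rdma_conf_fragment mlx4_0
```

The output is only printed so you can review it and merge it into the existing [global] section by hand. Nothing is written to disk.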
 
Can you direct me to a download link for step 1?
The out-of-the-box PVE driver version is this one:

Code:
 strings /lib/modules/4.15.17-2-pve/kernel/drivers/net/ethernet/mellanox/mlx4/mlx4_core.ko|grep -i versio
(Installed FW version is %d.%d.%03d)
This driver version supports only revisions %d to %d
FW version %d.%d.%03d (cmd intf rev %d), max commands %d
slave driver version is not supported by the master
QUERY_FW command failed: could not get FW version
version=4.0-0
srcversion=AD37F2B7771A319AB21A508
vermagic=4.15.17-2-pve SMP mod_unload modversions
mlx4_version
__UNIQUE_ID_version83
__UNIQUE_ID_srcversion45
____versions
mlx4_comm_get_version
__versions

By the way, I prepared 4 ceph.conf files in /etc/pve and linked /etc/pve/ceph.conf.pve01 to /etc/ceph/ceph.conf on pve01,
and did the same for the remaining 3 nodes.
This way I have a central place in /etc/pve to change the configs if necessary, and each node has a valid ms_async_rdma_local_gid.
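
The per-node linking described above could be sketched like this. The function name link_node_conf is hypothetical. The ceph.conf.<node> naming follows the ceph.conf.pve01 example from this post, and using the short hostname as the node name is an assumption.

```shell
#!/bin/sh
# Point a node's ceph.conf at its own per-node file.
# $1 = directory holding the per-node files (e.g. /etc/pve)
# $2 = path of the symlink to create (e.g. /etc/ceph/ceph.conf)
# $3 = node name; defaults to the short hostname
link_node_conf() {
    node="${3:-$(hostname -s)}"
    src="$1/ceph.conf.$node"
    # Refuse to create a dangling link if the per-node file is missing
    [ -f "$src" ] || { echo "missing $src" >&2; return 1; }
    ln -sfn "$src" "$2"
}

# Example (run as root on each node):
# link_node_conf /etc/pve /etc/ceph/ceph.conf
```

Running the same line on every node keeps all files in one place while each node still reads its own settings.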
 
I am not sure which Mellanox NIC you have,
but for a Mellanox EN NIC use the following:
http://content.mellanox.com/ofed/MLNX_EN-4.3-1.0.1.0/mlnx-en-4.3-1.0.1.0-debian9.1-x86_64.tgz

For a Mellanox VPI NIC use the following:
http://content.mellanox.com/ofed/ML...X_OFED_LINUX-4.3-1.0.1.0-debian9.1-x86_64.tgz

If you have done all the steps, all the services run, and you can see that even the OSD services are running over RDMA, then you are all set:
Code:
root@epyc3:~# ceph daemon osd.12 perf dump AsyncMessenger::RDMAWorker-1
{
    "AsyncMessenger::RDMAWorker-1": {
        "tx_no_mem": 0,
        "tx_parital_mem": 0,
        "tx_failed_post": 0,
        "rx_no_registered_mem": 0,
        "tx_chunks": 914,
        "tx_bytes": 1761623,
        "rx_chunks": 908,
        "rx_bytes": 1761937,
        "pending_sent_conns": 0
    }
}
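
If you want to check a single counter from that output in a script, a minimal sketch could look like this. The helper name rdma_counter is made up, and the sed expression only handles the pretty-printed one-key-per-line layout shown above, so treat it as a quick check rather than a real JSON parser.

```shell
#!/bin/sh
# Extract one numeric counter from "ceph daemon ... perf dump" output on stdin.
# $1 = counter name, e.g. tx_bytes
rdma_counter() {
    sed -n "s/^[[:space:]]*\"$1\":[[:space:]]*\([0-9][0-9]*\).*/\1/p"
}

# Example on an OSD host (same command as in the post above):
# ceph daemon osd.12 perf dump AsyncMessenger::RDMAWorker-1 | rdma_counter tx_bytes
```

A tx_bytes or rx_bytes value that keeps growing between two calls suggests the OSD is really moving traffic through the RDMA worker.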
 
ConnectX-3 Pro running in 56 Gbit/s mode, sx1002 switch, and appropriate cables. See also my personal profile footer and my original RDMA thread.


I don't know how to proceed with this driver.
You said to invoke "mlnx_add_kernel_support.sh", compile, and install everything in DEBS.

It would be nice to get the exact steps to take.

Code:
root@pve01:/usr/local/src/mlnx-en-4.3-1.0.1.0-debian9.1-x86_64# ./mlnx_add_kernel_support.sh

        Usage: mlnx_add_kernel_support.sh -m|--mlnx_ofed <path to MLNX_OFED/mlnx-en directory> [--make-iso|--make-tgz]

                [--make-iso]                                            Create MLNX_OFED/mlnx-en ISO image.
                [--make-tgz]                                            Create MLNX_OFED/mlnx-en tarball. (Default)
                [-t|--tmpdir <temp work dir>]                           Temp work directory (Default: /tmp)

                [-k | --kernel] <kernel version>                        Kernel version to use.
                [-s | --kernel-sources] <path to the kernel sources>    Path to kernel headers.
                [--ofed-sources] <path to tgz>                          Path to OFED sources tgz package.
                [-v|--verbose]
                [-n|--name]                                             Name of the package to be created.
                [-y|--yes]                                              Answer "yes" to all questions
                [--force]                                               Force removing packages that depends on MLNX_OFED/mlnx-en
                [--skip-repo]                                           Do not create a repository from MLNX_OFED/mlnx-en rpms.
                [--without-<package>]                                   Do not build/install given package (or module).



root@pve01:/usr/local/src/mlnx-en-4.3-1.0.1.0-debian9.1-x86_64# uname -a
Linux pve01 4.15.17-2-pve #1 SMP PVE 4.15.17-10 (Tue, 22 May 2018 11:15:44 +0200) x86_64 GNU/Linux
 
Thanks, but there are some warnings. Shall I ignore them or specify '--skip-distro-check'?


Also, which firmware do you have?
Code:
# ibv_devinfo
hca_id: mlx4_0
        transport:                      InfiniBand (0)
        fw_ver:                         2.40.7000
        node_guid:                      248a:0703:00e2:6070
        sys_image_guid:                 248a:0703:00e2:6070
        vendor_id:                      0x02c9
        vendor_part_id:                 4103
        hw_ver:                         0x0
        board_id:                       MT_1090111023
        phys_port_cnt:                  2
Code:
root@pve01:/usr/local/src/mlnx-en-4.3-1.0.1.0-debian9.1-x86_64# ./mlnx_add_kernel_support.sh -m ./ --make-tgz
Note: This program will create mlnx-en TGZ for debian9.4 under /tmp directory.
Do you want to continue?[y/N]:y
See log file /tmp/mlnx_ofed_iso.745680.log

WARNING: The current mlnx-en is intended for debian9.1 !
You may need to use the '--skip-distro-check' flag to install the resulting mlnx-en on this system.

Checking if all needed packages are installed...
Building mlnx-en DEBS . Please wait...
 
After building the Mellanox driver, what are the exact steps?
Unpack and install?
It would really be nice to have a working cookbook :)
 
OK will try later on....
Hi,

I just updated the firmware.

The build process produced a debian9.4 package?
Code:
cd /usr/local/src
wget "http://content.mellanox.com/ofed/MLNX_EN-4.3-1.0.1.0/mlnx-en-4.3-1.0.1.0-debian9.1-x86_64.tgz"
tar -xzvf mlnx-en-4.3-1.0.1.0-debian9.1-x86_64.tgz
cd mlnx-en-4.3-1.0.1.0-debian9.1-x86_64/
./mlnx_add_kernel_support.sh -m ./ --make-tgz

Invoking install from the newly built driver:

Code:
root@pve01:/usr/local/src/mlnx-en-4.3-1.0.1.0-debian9.4-x86_64-ext# ./install
Error: The current mlnx-en is intended for debian9.1


root@pve01:/usr/local/src/mlnx-en-4.3-1.0.1.0-debian9.4-x86_64-ext# ./install --skip-distro-check
Logs dir: /tmp/mlnx-en.11881.logs
General log file: /tmp/mlnx-en.11881.logs/general.log

Below is the list of mlnx-en packages that you have chosen
(some may have been added by the installer due to package dependencies):

ofed-scripts
mlnx-en-utils
mlnx-en-modules
mstflint

This program will install the mlnx-en package on your machine.
Note that all other Mellanox, OEM, OFED, RDMA or Distribution IB packages will be removed.
Those packages are removed due to conflicts with mlnx-en, do not reinstall them.

Do you want to continue?[y/N]:n

I'm not sure whether it is OK to answer "y".

How did you accomplish this task?

Regards Gerhard
 
I do not use the mlnx_install script.

Please go to the DEBS folder and install all the .deb files manually:

dpkg -i *.deb
then
apt --fix-broken install
 
This throws many errors/warnings!
Code:
root@pve01:/usr/local/src/mlnx-en-4.3-1.0.1.0-debian9.4-x86_64-ext/DEBS# dpkg -i *.deb
(Reading database ... 236351 files and directories currently installed.)
Preparing to unpack ibverbs-utils_41mlnx1-OFED.4.3.0.1.8.43101_amd64.deb ...
Unpacking ibverbs-utils (41mlnx1-OFED.4.3.0.1.8.43101) over (1.2.1-2) ...
Preparing to unpack libibverbs1_41mlnx1-OFED.4.3.0.1.8.43101_amd64.deb ...
Unpacking libibverbs1 (41mlnx1-OFED.4.3.0.1.8.43101) over (1.2.1-2) ...
Selecting previously unselected package libibverbs1-dbg.
Preparing to unpack libibverbs1-dbg_41mlnx1-OFED.4.3.0.1.8.43101_amd64.deb ...
Unpacking libibverbs1-dbg (41mlnx1-OFED.4.3.0.1.8.43101) ...
Selecting previously unselected package libibverbs-dev.
Preparing to unpack libibverbs-dev_41mlnx1-OFED.4.3.0.1.8.43101_amd64.deb ...
Unpacking libibverbs-dev (41mlnx1-OFED.4.3.0.1.8.43101) ...
Preparing to unpack libmlx4-1_41mlnx1-OFED.4.1.0.1.0.43101_amd64.deb ...
Unpacking libmlx4-1 (41mlnx1-OFED.4.1.0.1.0.43101) over (1.2.1-1) ...
Selecting previously unselected package libmlx4-1-dbg.
Preparing to unpack libmlx4-1-dbg_41mlnx1-OFED.4.1.0.1.0.43101_amd64.deb ...
Unpacking libmlx4-1-dbg (41mlnx1-OFED.4.1.0.1.0.43101) ...
Selecting previously unselected package libmlx5-1.
Preparing to unpack libmlx5-1_41mlnx1-OFED.4.3.0.2.1.43101_amd64.deb ...
Unpacking libmlx5-1 (41mlnx1-OFED.4.3.0.2.1.43101) ...
Selecting previously unselected package libmlx5-1-dbg.
Preparing to unpack libmlx5-1-dbg_41mlnx1-OFED.4.3.0.2.1.43101_amd64.deb ...
Unpacking libmlx5-1-dbg (41mlnx1-OFED.4.3.0.2.1.43101) ...
Selecting previously unselected package libmlx5-dev.
Preparing to unpack libmlx5-dev_41mlnx1-OFED.4.3.0.2.1.43101_amd64.deb ...
Unpacking libmlx5-dev (41mlnx1-OFED.4.3.0.2.1.43101) ...
Preparing to unpack librdmacm1_41mlnx1-OFED.4.2.0.1.3.43101_amd64.deb ...
Unpacking librdmacm1 (41mlnx1-OFED.4.2.0.1.3.43101) over (1.1.0-2) ...
Selecting previously unselected package librdmacm1-dbgsym.
Preparing to unpack librdmacm1-dbgsym_41mlnx1-OFED.4.2.0.1.3.43101_amd64.deb ...
Unpacking librdmacm1-dbgsym (41mlnx1-OFED.4.2.0.1.3.43101) ...
Selecting previously unselected package librdmacm-dev.
Preparing to unpack librdmacm-dev_41mlnx1-OFED.4.2.0.1.3.43101_amd64.deb ...
Unpacking librdmacm-dev (41mlnx1-OFED.4.2.0.1.3.43101) ...
Selecting previously unselected package librdmacm-utils.
Preparing to unpack librdmacm-utils_41mlnx1-OFED.4.2.0.1.3.43101_amd64.deb ...
Unpacking librdmacm-utils (41mlnx1-OFED.4.2.0.1.3.43101) ...
dpkg: error processing archive librdmacm-utils_41mlnx1-OFED.4.2.0.1.3.43101_amd64.deb (--install):
 trying to overwrite '/usr/bin/cmtime', which is also in package rdmacm-utils 1.1.0-2
Selecting previously unselected package librdmacm-utils-dbgsym.
Preparing to unpack librdmacm-utils-dbgsym_41mlnx1-OFED.4.2.0.1.3.43101_amd64.deb ...
Unpacking librdmacm-utils-dbgsym (41mlnx1-OFED.4.2.0.1.3.43101) ...
Selecting previously unselected package libvma.
Preparing to unpack libvma_8.5.7-1_amd64.deb ...
Unpacking libvma (8.5.7-1) ...
Selecting previously unselected package libvma-dbg.
Preparing to unpack libvma-dbg_8.5.7-1_amd64.deb ...
Unpacking libvma-dbg (8.5.7-1) ...
Selecting previously unselected package libvma-dbgsym.
Preparing to unpack libvma-dbgsym_8.5.7-1_amd64.deb ...
Unpacking libvma-dbgsym (8.5.7-1) ...
Selecting previously unselected package libvma-dev.
Preparing to unpack libvma-dev_8.5.7-1_amd64.deb ...
Unpacking libvma-dev (8.5.7-1) ...
Selecting previously unselected package libvma-utils.
Preparing to unpack libvma-utils_8.5.7-1_amd64.deb ...
Unpacking libvma-utils (8.5.7-1) ...
Selecting previously unselected package libvma-utils-dbgsym.
Preparing to unpack libvma-utils-dbgsym_8.5.7-1_amd64.deb ...
Unpacking libvma-utils-dbgsym (8.5.7-1) ...
Selecting previously unselected package mlnx-en-dkms.
Preparing to unpack mlnx-en-dkms_4.3-1.0.1.0.g8509e41_all.deb ...
Unpacking mlnx-en-dkms (4.3-1.0.1.0.g8509e41) ...
Selecting previously unselected package mlnx-en-dpdk-4.15.17-2-pve.
Preparing to unpack mlnx-en-dpdk-4.15.17-2-pve_4.3-1.0.1.0_all.deb ...
Unpacking mlnx-en-dpdk-4.15.17-2-pve (4.3-1.0.1.0) ...
Selecting previously unselected package mlnx-en-dpdk.
Preparing to unpack mlnx-en-dpdk_4.3-1.0.1.0_all.deb ...
Unpacking mlnx-en-dpdk (4.3-1.0.1.0) ...
Selecting previously unselected package mlnx-en-modules.
Preparing to unpack mlnx-en-modules_4.3-1.0.1.0.g8509e41.kver.4.15.17-2-pve_all.deb ...
Unpacking mlnx-en-modules (4.3-1.0.1.0.g8509e41.kver.4.15.17-2-pve) ...
dpkg: error processing archive mlnx-en-modules_4.3-1.0.1.0.g8509e41.kver.4.15.17-2-pve_all.deb (--install):
 trying to overwrite '/usr/src/mlnx-en-4.3/COPYING', which is also in package mlnx-en-dkms 4.3-1.0.1.0.g8509e41
dpkg-deb: error: subprocess paste was killed by signal (Broken pipe)
Selecting previously unselected package mlnx-en-utils.
Preparing to unpack mlnx-en-utils_4.3-1.0.1.0.g8509e41_amd64.deb ...
Unpacking mlnx-en-utils (4.3-1.0.1.0.g8509e41) ...
Preparing to unpack mlnx-en-utils_4.3-1.0.1.0.g8509e41.kver.4.15.17-2-pve_amd64.deb ...
Unpacking mlnx-en-utils (4.3-1.0.1.0.g8509e41.kver.4.15.17-2-pve) over (4.3-1.0.1.0.g8509e41) ...
Selecting previously unselected package mlnx-en-vma-4.15.17-2-pve.
Preparing to unpack mlnx-en-vma-4.15.17-2-pve_4.3-1.0.1.0_all.deb ...
Unpacking mlnx-en-vma-4.15.17-2-pve (4.3-1.0.1.0) ...
Selecting previously unselected package mlnx-en-vma.
Preparing to unpack mlnx-en-vma_4.3-1.0.1.0_all.deb ...
Unpacking mlnx-en-vma (4.3-1.0.1.0) ...
Selecting previously unselected package mlnx-fw-updater.
Preparing to unpack mlnx-fw-updater_4.3-1.0.1.0_amd64.deb ...
Unpacking mlnx-fw-updater (4.3-1.0.1.0) ...
Selecting previously unselected package mlnx-ofed-kernel-dkms.
Preparing to unpack mlnx-ofed-kernel-dkms_4.3-OFED.4.3.1.0.1.1.g8509e41_all.deb ...
Unpacking mlnx-ofed-kernel-dkms (4.3-OFED.4.3.1.0.1.1.g8509e41) ...
Selecting previously unselected package mlnx-ofed-kernel-modules.
Preparing to unpack mlnx-ofed-kernel-modules_4.3-OFED.4.3.1.0.1.1.g8509e41.kver.4.15.17-2-pve_all.deb ...
Unpacking mlnx-ofed-kernel-modules (4.3-OFED.4.3.1.0.1.1.g8509e41.kver.4.15.17-2-pve) ...
dpkg: error processing archive mlnx-ofed-kernel-modules_4.3-OFED.4.3.1.0.1.1.g8509e41.kver.4.15.17-2-pve_all.deb (--install):
 trying to overwrite '/usr/src/mlnx-ofed-kernel-4.3/COPYING', which is also in package mlnx-ofed-kernel-dkms 4.3-OFED.4.3.1.0.1.1.g8509e41
dpkg-deb: error: subprocess paste was killed by signal (Broken pipe)
Selecting previously unselected package mlnx-ofed-kernel-utils.
Preparing to unpack mlnx-ofed-kernel-utils_4.3-OFED.4.3.1.0.1.1.g8509e41_amd64.deb ...
Unpacking mlnx-ofed-kernel-utils (4.3-OFED.4.3.1.0.1.1.g8509e41) ...
dpkg: error processing archive mlnx-ofed-kernel-utils_4.3-OFED.4.3.1.0.1.1.g8509e41_amd64.deb (--install):
 trying to overwrite '/sbin/sysctl_perf_tuning', which is also in package mlnx-en-utils 4.3-1.0.1.0.g8509e41.kver.4.15.17-2-pve
Preparing to unpack mlnx-ofed-kernel-utils_4.3-OFED.4.3.1.0.1.1.g8509e41.kver.4.15.17-2-pve_amd64.deb ...
Unpacking mlnx-ofed-kernel-utils (4.3-OFED.4.3.1.0.1.1.g8509e41.kver.4.15.17-2-pve) ...
dpkg: error processing archive mlnx-ofed-kernel-utils_4.3-OFED.4.3.1.0.1.1.g8509e41.kver.4.15.17-2-pve_amd64.deb (--install):
 trying to overwrite '/sbin/sysctl_perf_tuning', which is also in package mlnx-en-utils 4.3-1.0.1.0.g8509e41.kver.4.15.17-2-pve
Preparing to unpack mstflint_4.9.0-1.2.gb839ec8.43101_amd64.deb ...
Unpacking mstflint (4.9.0-1.2.gb839ec8.43101) over (4.6.0-1) ...
Selecting previously unselected package mstflint-dbgsym.
Preparing to unpack mstflint-dbgsym_4.9.0-1.2.gb839ec8.43101_amd64.deb ...
Unpacking mstflint-dbgsym (4.9.0-1.2.gb839ec8.43101) ...
Selecting previously unselected package ofed-scripts.
Preparing to unpack ofed-scripts_4.3-OFED.4.3.1.0.1_amd64.deb ...
Unpacking ofed-scripts (4.3-OFED.4.3.1.0.1) ...
Selecting previously unselected package sockperf.
Preparing to unpack sockperf_3.3-32.git6bc436e7b4a0.43101_amd64.deb ...
Unpacking sockperf (3.3-32.git6bc436e7b4a0.43101) ...
Selecting previously unselected package sockperf-dbgsym.
Preparing to unpack sockperf-dbgsym_3.3-32.git6bc436e7b4a0.43101_amd64.deb ...
Unpacking sockperf-dbgsym (3.3-32.git6bc436e7b4a0.43101) ...
More than one copy of package mlnx-en-utils has been unpacked
 in this run !  Only configuring it once.
Setting up libibverbs1 (41mlnx1-OFED.4.3.0.1.8.43101) ...
Setting up libibverbs1-dbg (41mlnx1-OFED.4.3.0.1.8.43101) ...
Setting up libibverbs-dev (41mlnx1-OFED.4.3.0.1.8.43101) ...
Setting up libmlx4-1 (41mlnx1-OFED.4.1.0.1.0.43101) ...
Setting up libmlx4-1-dbg (41mlnx1-OFED.4.1.0.1.0.43101) ...
Setting up libmlx5-1 (41mlnx1-OFED.4.3.0.2.1.43101) ...
Setting up libmlx5-1-dbg (41mlnx1-OFED.4.3.0.2.1.43101) ...
Setting up libmlx5-dev (41mlnx1-OFED.4.3.0.2.1.43101) ...
Setting up librdmacm1 (41mlnx1-OFED.4.2.0.1.3.43101) ...
Setting up librdmacm1-dbgsym (41mlnx1-OFED.4.2.0.1.3.43101) ...
Setting up librdmacm-dev (41mlnx1-OFED.4.2.0.1.3.43101) ...
dpkg: dependency problems prevent configuration of librdmacm-utils-dbgsym:
 librdmacm-utils-dbgsym depends on librdmacm-utils (= 41mlnx1-OFED.4.2.0.1.3.43101); however:
  Package librdmacm-utils is not installed.

dpkg: error processing package librdmacm-utils-dbgsym (--install):
 dependency problems - leaving unconfigured
Setting up libvma (8.5.7-1) ...
[ ok ] Starting vma (via systemctl): vma.service.
Setting up libvma-dbg (8.5.7-1) ...
Setting up libvma-dbgsym (8.5.7-1) ...
Setting up libvma-dev (8.5.7-1) ...
Setting up libvma-utils (8.5.7-1) ...
Setting up libvma-utils-dbgsym (8.5.7-1) ...
dpkg: dependency problems prevent configuration of mlnx-en-dkms:
 mlnx-en-dkms depends on dkms; however:
  Package dkms is not installed.

dpkg: error processing package mlnx-en-dkms (--install):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of mlnx-en-dpdk-4.15.17-2-pve:
 mlnx-en-dpdk-4.15.17-2-pve depends on librdmacm-utils (>= 41mlnx1-OFED.4.2.0.1.3.43101); however:
  Package librdmacm-utils is not installed.
 mlnx-en-dpdk-4.15.17-2-pve depends on mlnx-ofed-kernel-utils (>= 4.3-OFED.4.3.1.0.1.1.g8509e41.kver.4.15.17-2-pve); however:
  Package mlnx-ofed-kernel-utils is not installed.
 mlnx-en-dpdk-4.15.17-2-pve depends on mlnx-ofed-kernel-modules (>= 4.3-OFED.4.3.1.0.1.1.g8509e41.kver.4.15.17-2-pve); however:
  Package mlnx-ofed-kernel-modules is not installed.

dpkg: error processing package mlnx-en-dpdk-4.15.17-2-pve (--install):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of mlnx-en-dpdk:
 mlnx-en-dpdk depends on librdmacm-utils (>= 41mlnx1-OFED.4.2.0.1.3.43101); however:
  Package librdmacm-utils is not installed.
 mlnx-en-dpdk depends on mlnx-ofed-kernel-utils (>= 4.3-OFED.4.3.1.0.1.1.g8509e41); however:
  Package mlnx-ofed-kernel-utils is not installed.

dpkg: error processing package mlnx-en-dpdk (--install):
 dependency problems - leaving unconfigured
Setting up mlnx-en-utils (4.3-1.0.1.0.g8509e41.kver.4.15.17-2-pve) ...
dpkg: dependency problems prevent configuration of mlnx-en-vma-4.15.17-2-pve:
 mlnx-en-vma-4.15.17-2-pve depends on mlnx-ofed-kernel-utils (>= 4.3-OFED.4.3.1.0.1.1.g8509e41.kver.4.15.17-2-pve); however:
  Package mlnx-ofed-kernel-utils is not installed.
 mlnx-en-vma-4.15.17-2-pve depends on librdmacm-utils (>= 41mlnx1-OFED.4.2.0.1.3.43101); however:
  Package librdmacm-utils is not installed.
 mlnx-en-vma-4.15.17-2-pve depends on mlnx-ofed-kernel-modules (>= 4.3-OFED.4.3.1.0.1.1.g8509e41.kver.4.15.17-2-pve); however:
  Package mlnx-ofed-kernel-modules is not installed.

dpkg: error processing package mlnx-en-vma-4.15.17-2-pve (--install):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of mlnx-en-vma:
 mlnx-en-vma depends on librdmacm-utils (>= 41mlnx1-OFED.4.2.0.1.3.43101); however:
  Package librdmacm-utils is not installed.
 mlnx-en-vma depends on mlnx-ofed-kernel-utils (>= 4.3-OFED.4.3.1.0.1.1.g8509e41); however:
  Package mlnx-ofed-kernel-utils is not installed.

dpkg: error processing package mlnx-en-vma (--install):
 dependency problems - leaving unconfigured
Setting up mlnx-fw-updater (4.3-1.0.1.0) ...
Attempting to perform Firmware update...
Querying Mellanox devices firmware ...

Device #1:
----------

  Device Type:      ConnectX3Pro
  Part Number:      MCX314A-BCC_Ax
  Description:      ConnectX-3 Pro EN network interface card; 40GigE; dual-port QSFP; PCIe3.0 x8 8GT/s; RoHS R6
  PSID:             MT_1090111023
  PCI Device Name:  81:00.0
  Port1 MAC:        248a07e26070
  Port2 MAC:        248a07e26071
  Versions:         Current        Available
     FW             2.42.5000      2.42.5000
     PXE            3.4.0752       3.4.0752

  Status:           Up to date


Log File: /tmp/mlnx_fw_update.log
dpkg: dependency problems prevent configuration of mlnx-ofed-kernel-dkms:
 mlnx-ofed-kernel-dkms depends on dkms; however:
  Package dkms is not installed.
 mlnx-ofed-kernel-dkms depends on mlnx-ofed-kernel-utils; however:
  Package mlnx-ofed-kernel-utils is not installed.

dpkg: error processing package mlnx-ofed-kernel-dkms (--install):
 dependency problems - leaving unconfigured
Setting up mstflint (4.9.0-1.2.gb839ec8.43101) ...
Setting up mstflint-dbgsym (4.9.0-1.2.gb839ec8.43101) ...
Setting up ofed-scripts (4.3-OFED.4.3.1.0.1) ...
Setting up sockperf (3.3-32.git6bc436e7b4a0.43101) ...
Setting up sockperf-dbgsym (3.3-32.git6bc436e7b4a0.43101) ...
Setting up ibverbs-utils (41mlnx1-OFED.4.3.0.1.8.43101) ...
Processing triggers for man-db (2.7.6.1-2) ...
Processing triggers for libc-bin (2.24-11+deb9u3) ...
Processing triggers for systemd (232-25+deb9u2) ...
Errors were encountered while processing:
 librdmacm-utils_41mlnx1-OFED.4.2.0.1.3.43101_amd64.deb
 mlnx-en-modules_4.3-1.0.1.0.g8509e41.kver.4.15.17-2-pve_all.deb
 mlnx-ofed-kernel-modules_4.3-OFED.4.3.1.0.1.1.g8509e41.kver.4.15.17-2-pve_all.deb
 mlnx-ofed-kernel-utils_4.3-OFED.4.3.1.0.1.1.g8509e41_amd64.deb
 mlnx-ofed-kernel-utils_4.3-OFED.4.3.1.0.1.1.g8509e41.kver.4.15.17-2-pve_amd64.deb
 librdmacm-utils-dbgsym
 mlnx-en-dkms
 mlnx-en-dpdk-4.15.17-2-pve
 mlnx-en-dpdk
 mlnx-en-vma-4.15.17-2-pve
 mlnx-en-vma
 mlnx-ofed-kernel-dkms

and
Code:
root@pve01:/usr/local/src/mlnx-en-4.3-1.0.1.0-debian9.4-x86_64-ext/DEBS# apt --fix-broken install
Reading package lists... Done
Building dependency tree
Reading state information... Done
Correcting dependencies... Done
The following package was automatically installed and is no longer required:
  libmuparser2v5
Use 'apt autoremove' to remove it.
The following additional packages will be installed:
  dkms sudo
Suggested packages:
  python3-apport menu
The following packages will be REMOVED:
  librdmacm-utils-dbgsym mlnx-en-dpdk mlnx-en-dpdk-4.15.17-2-pve mlnx-en-vma mlnx-en-vma-4.15.17-2-pve mlnx-ofed-kernel-dkms
The following NEW packages will be installed:
  dkms sudo
0 upgraded, 2 newly installed, 6 to remove and 2 not upgraded.
7 not fully installed or removed.
Need to get 1,130 kB of archives.
After this operation, 14.1 MB disk space will be freed.
Do you want to continue? [Y/n] y
Get:1 http://ftp.de.debian.org/debian stretch/main amd64 dkms all 2.3-2 [74.8 kB]
Get:2 http://ftp.de.debian.org/debian stretch/main amd64 sudo amd64 1.8.19p1-2.1 [1,055 kB]
Fetched 1,130 kB in 0s (1,585 kB/s)
(Reading database ... 239205 files and directories currently installed.)
Removing librdmacm-utils-dbgsym (41mlnx1-OFED.4.2.0.1.3.43101) ...
Removing mlnx-en-dpdk (4.3-1.0.1.0) ...
Removing mlnx-en-dpdk-4.15.17-2-pve (4.3-1.0.1.0) ...
Removing mlnx-en-vma (4.3-1.0.1.0) ...
Removing mlnx-en-vma-4.15.17-2-pve (4.3-1.0.1.0) ...
Removing mlnx-ofed-kernel-dkms (4.3-OFED.4.3.1.0.1.1.g8509e41) ...
Selecting previously unselected package dkms.
(Reading database ... 237746 files and directories currently installed.)
Preparing to unpack .../archives/dkms_2.3-2_all.deb ...
Unpacking dkms (2.3-2) ...
Selecting previously unselected package sudo.
Preparing to unpack .../sudo_1.8.19p1-2.1_amd64.deb ...
Unpacking sudo (1.8.19p1-2.1) ...
Setting up sudo (1.8.19p1-2.1) ...
Setting up dkms (2.3-2) ...
Processing triggers for systemd (232-25+deb9u2) ...
Processing triggers for man-db (2.7.6.1-2) ...
Setting up mlnx-en-dkms (4.3-1.0.1.0.g8509e41) ...

Creating symlink /var/lib/dkms/mlnx-en/4.3/source ->
                 /usr/src/mlnx-en-4.3

DKMS: add completed.

Kernel preparation unnecessary for this kernel.  Skipping...

Building module:
cleaning build area...(bad exit status: 2)
./scripts/mlnx_en_patch.sh --kernel 4.15.17-2-pve --kernel-sources /lib/modules/4.15.17-2-pve/build -j16 && make -j16............................
cleaning build area...

DKMS: build completed.
Forcing installation of mlnx-en

mlx_compat:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/4.15.17-2-pve/updates/dkms/

mlx4_core.ko:
Running module version sanity check.
 - Original module
 - Installation
   - Installing to /lib/modules/4.15.17-2-pve/updates/dkms/

mlx4_ib.ko:
Running module version sanity check.
 - Original module
 - Installation
   - Installing to /lib/modules/4.15.17-2-pve/updates/dkms/

mlx4_en.ko:
Running module version sanity check.
 - Original module
 - Installation
   - Installing to /lib/modules/4.15.17-2-pve/updates/dkms/

mlx5_core.ko:
Running module version sanity check.
 - Original module
 - Installation
   - Installing to /lib/modules/4.15.17-2-pve/updates/dkms/

mlx5_ib.ko:
Running module version sanity check.
 - Original module
 - Installation
   - Installing to /lib/modules/4.15.17-2-pve/updates/dkms/

mlxfw.ko:
Running module version sanity check.
 - Original module
 - Installation
   - Installing to /lib/modules/4.15.17-2-pve/updates/dkms/

depmod...

Backing up initrd.img-4.15.17-2-pve to /boot/initrd.img-4.15.17-2-pve.old-dkms
Making new initrd.img-4.15.17-2-pve
(If next boot fails, revert to initrd.img-4.15.17-2-pve.old-dkms image)
update-initramfs.....

DKMS: install completed.

and

Code:
 cat /tmp/mlnx_fw_update.log
CMD: mlxup -u --log-on-update --ssl-certificate /tmp/mlnx.fw.446117/mlxfwmanager_sriov_dis_x86_64-dir/ca-bundle.crt --current-dir /opt/mellanox/mlnx-fw-updater/  -L /tmp/mlnx_fw_update.log -y -d 81:00.0
Querying Mellanox devices firmware ...

Device #1:
----------

  Device Type:      ConnectX3Pro
  Part Number:      MCX314A-BCC_Ax
  Description:      ConnectX-3 Pro EN network interface card; 40GigE; dual-port QSFP; PCIe3.0 x8 8GT/s; RoHS R6
  PSID:             MT_1090111023
  PCI Device Name:  81:00.0
  Port1 MAC:        248a07e26070
  Port2 MAC:        248a07e26071
  Versions:         Current        Available
     FW             2.42.5000      2.42.5000
     PXE            3.4.0752       3.4.0752

  Status:           Up to date


EXIT_STATUS: 0

Now I'll reboot the 1st node. Let's cross our fingers.
 
You can actually use the mlnx_install script, BUT you have to remove the pve packages first. pve conflicts with the mlnx packages but NOT THE OTHER WAY AROUND, so you can reinstall proxmox-ve after OFED, like so:
  1. apt-get remove proxmox-ve pve*
  2. apt-get install pve-kernel-x.xx.x-x-pve pve-headers (the kernel will be uninstalled in step 1)
  3. Navigate to the OFED directory
  4. ./mlnxofedinstall --skip-distro-check --force (choose additional switches as relevant)
  5. /etc/init.d/openibd restart
  6. apt-get install proxmox-ve
 
I did not succeed!

Neither mon nor mgr started.


Code:
Jun  6 21:15:03 pve01 ceph-mgr[5485]:     -2> 2018-06-06 21:15:03.683201 7f8b502d66c0 -1 RDMAStack RDMAStack!!! WARNING !!! For RDMA to work properly user memlock (ulimit -l) must be big enough to allow large amount of registered memory. We recommend setting this parameter to infinity
Jun  6 21:15:04 pve01 ceph-mon[5502]: 2018-06-06 21:15:04.561285 7f44dfbc8f80 -1 RDMAStack RDMAStack!!! WARNING !!! For RDMA to work properly user memlock (ulimit -l) must be big enough to allow large amount of registered memory. We recommend setting this parameter to infinity

The limits are set as described in the Mellanox whitepaper:

Code:
 cat /etc/systemd/system/ceph-mon@.service

[Unit]
Description=Ceph cluster monitor daemon

# According to:
#   http://www.freedesktop.org/wiki/Software/systemd/NetworkTarget
# these can be removed once ceph-mon will dynamically change network
# configuration.
After=network-online.target local-fs.target time-sync.target
Wants=network-online.target local-fs.target time-sync.target

PartOf=ceph-mon.target

[Service]
LimitNOFILE=1048576
LimitNPROC=1048576
LimitMEMLOCK=infinity
EnvironmentFile=-/etc/default/ceph
Environment=CLUSTER=ceph
ExecStart=/usr/bin/ceph-mon -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph
ExecReload=/bin/kill -HUP $MAINPID
PrivateDevices=no
ProtectHome=true
ProtectSystem=full
PrivateTmp=true
TasksMax=infinity
Restart=on-failure
StartLimitInterval=30min
StartLimitBurst=5
RestartSec=10

[Install]
WantedBy=ceph-mon.target
Code:
cat /etc/systemd/system/ceph-mgr@.service
[Unit]
Description=Ceph cluster manager daemon
After=network-online.target local-fs.target time-sync.target
Wants=network-online.target local-fs.target time-sync.target
PartOf=ceph-mgr.target

[Service]
LimitNOFILE=1048576
LimitNPROC=1048576
LimitMEMLOCK=infinity
EnvironmentFile=-/etc/default/ceph
Environment=CLUSTER=ceph

ExecStart=/usr/bin/ceph-mgr -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
StartLimitInterval=30min
StartLimitBurst=3

[Install]
WantedBy=ceph-mgr.target
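
One way to see whether LimitMEMLOCK=infinity actually reached a running daemon is to read its /proc/<pid>/limits file. The sketch below is only that, a sketch: the function name memlock_limit is made up, and it needs a process that stays up long enough to be inspected. Also note that systemd only picks up edited unit files after systemctl daemon-reload.

```shell
#!/bin/sh
# Print the soft "Max locked memory" limit from a /proc/<pid>/limits file.
# $1 = path to the limits file
# Prints "unlimited" or a byte count.
memlock_limit() {
    awk '/^Max locked memory/ { print $4 }' "$1"
}

# Example for a running monitor:
# memlock_limit "/proc/$(pidof ceph-mon)/limits"
```

If this prints a byte count instead of "unlimited", the unit file change has not taken effect for that process.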
 
I reverted my ceph.conf to TCP mode and the cluster is running.

But with these new drivers I can't run rping or udaddy:
udaddy
failed to create event channel: No such device


rping -s -C 10 -v
rdma_create_event_channel: No such device


Something got broken, I suppose.

Any hints?
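
One plausible thing to check for a "No such device" error from rping or udaddy is whether the kernel exposes any RDMA device at all after the driver swap. The sketch below lists the entries under /sys/class/infiniband. The function name rdma_devices is made up, and an empty result is only a hint, not a full diagnosis.

```shell
#!/bin/sh
# List RDMA devices known to the kernel, one name per line.
# $1 = sysfs root, defaults to /sys (the parameter exists only to ease testing)
rdma_devices() {
    ls "${1:-/sys}/class/infiniband" 2>/dev/null
}

# Example:
# rdma_devices || echo "no RDMA devices found"
```

If a device such as mlx4_0 shows up here but the tools still fail, it is also worth checking that /dev/infiniband/rdma_cm exists.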
 
The apt-get remove step did not work:
Code:
 apt-get remove proxmox-ve pve*
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package pve
E: Unable to locate package pveam.log
E: Couldn't find any package by glob 'pveam.log'
E: Couldn't find any package by regex 'pveam.log'
E: Unable to locate package pveam.log.0
E: Couldn't find any package by glob 'pveam.log.0'
E: Couldn't find any package by regex 'pveam.log.0'
E: Unable to locate package pve-firewall.log
E: Couldn't find any package by glob 'pve-firewall.log'
E: Couldn't find any package by regex 'pve-firewall.log'
E: Unable to locate package pve-firewall.log.1
E: Couldn't find any package by glob 'pve-firewall.log.1'
E: Couldn't find any package by regex 'pve-firewall.log.1'
E: Unable to locate package pve-firewall.log.2.gz
E: Couldn't find any package by glob 'pve-firewall.log.2.gz'
E: Couldn't find any package by regex 'pve-firewall.log.2.gz'
E: Unable to locate package pve-firewall.log.3.gz
E: Couldn't find any package by glob 'pve-firewall.log.3.gz'
E: Couldn't find any package by regex 'pve-firewall.log.3.gz'
E: Unable to locate package pve-firewall.log.4.gz
E: Couldn't find any package by glob 'pve-firewall.log.4.gz'
E: Couldn't find any package by regex 'pve-firewall.log.4.gz'
E: Unable to locate package pve-firewall.log.5.gz
E: Couldn't find any package by glob 'pve-firewall.log.5.gz'
E: Couldn't find any package by regex 'pve-firewall.log.5.gz'
E: Unable to locate package pve-firewall.log.6.gz
E: Couldn't find any package by glob 'pve-firewall.log.6.gz'
E: Couldn't find any package by regex 'pve-firewall.log.6.gz'
E: Unable to locate package pve-firewall.log.7.gz
E: Couldn't find any package by glob 'pve-firewall.log.7.gz'
E: Couldn't find any package by regex 'pve-firewall.log.7.gz'
E: Unable to locate package pveproxy
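
The names in those errors (pveam.log, pve-firewall.log.1 and so on) look like files in the current directory, which suggests the shell expanded the unquoted pve* before apt-get ever saw it. The sketch below reproduces that effect in a scratch directory with made-up file names. It only explains the error messages and says nothing about whether removing every pve package is a good idea.

```shell
#!/bin/sh
# Show how an unquoted glob is expanded by the shell before the command runs.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch pveam.log pve-firewall.log

unquoted=$(echo pve*)     # the shell replaces the pattern with matching file names
quoted=$(echo 'pve*')     # the quoted pattern is passed through unchanged

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

Quoting the pattern, as in apt-get remove proxmox-ve 'pve*', or running the command from a directory with no matching files, lets apt-get do the matching against package names itself.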
 
