Full-mesh Ceph with Mellanox ConnectX-4 Lx

maddig

Jan 24, 2025
Hi there,

I'm currently trying to configure a 3-node full-mesh Ceph cluster following the official guide, using the routed setup with fallback:
https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server#Routed_Setup_(with_Fallback)

The nodes are connected with 25G DAC cables via Mellanox ConnectX-4 Lx NICs.

I configured the setup exactly as in the guide, but FRR (or rather OpenFabric) isn't seeing its neighbours and therefore isn't exchanging routing information.
I'm also unable to ping the loopback addresses of the other hosts, which are configured in the FRR config.

When I assign an IP address directly to the interfaces involved, I can ping between those interfaces.
So I think there is a problem with the loopback address.


Does anyone have any ideas on how I can further troubleshoot this?

My configs (identical on all hosts except for the hostname, IP and ID):

frr.conf
Code:
frr defaults traditional
hostname vm001
log syslog informational
ip forwarding
no ipv6 forwarding
service integrated-vtysh-config
!
interface lo
 ip address 10.254.120.50/32
 ip router openfabric 1
 openfabric passive
!
interface no1np0
 ip router openfabric 1
 openfabric csnp-interval 2
 openfabric hello-interval 1
 openfabric hello-multiplier 2
!
interface eno2np1
 ip router openfabric 1
 openfabric csnp-interval 2
 openfabric hello-interval 1
 openfabric hello-multiplier 2
!
line vty
!
router openfabric 1
 net 49.0001.1111.1111.1111.00
 lsp-gen-interval 1
 max-lsp-lifetime 600
 lsp-refresh-interval 180

interfaces
Code:
auto lo
iface lo inet loopback

iface ens4f0np0 inet manual

auto eno1np0
iface eno1np0 inet static
        # this is only for testing with ping
        address 10.1.1.1/24
        mtu 9000

auto eno2np1
iface eno2np1 inet manual
        mtu 9000

iface ens4f1np1 inet manual

iface ens8f0np0 inet manual

iface ens8f1np1 inet manual

auto vmbr0
iface vmbr0 inet manual
        bridge-ports ens4f0np0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
        post-up devlink dev eswitch set pci/0000:af:00.0 mode switchdev
        post-up devlink dev eswitch set pci/0000:af:00.1 mode switchdev

auto mgmt
iface mgmt inet static
        address 10.160.99.75/24
        gateway 10.160.99.1
        vlan-id 99
        vlan-raw-device vmbr0

post-up /usr/bin/systemctl restart frr.service
source /etc/network/interfaces.d/*

eno1np0 and eno2np1 are the two interfaces on each node that are directly connected to the other nodes.

Thank you for the help.
BG
 
Damn... right after posting the thread, I realized that I had copied a typo in the interface name to all three hosts: there is an "e" missing from the first interface name in the FRR config (no1np0 instead of eno1np0).
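
For anyone who runs into the same thing: below is a minimal sketch of a check that would have caught this kind of mismatch. It compares the interface names used in frr.conf with the ones declared in /etc/network/interfaces. The function names are made up for illustration, and it assumes every mesh interface has an `iface` line in the interfaces file (as in the configs above); it is not part of FRR or Proxmox.

```python
import re


def frr_interfaces(frr_conf: str) -> list[str]:
    """Return the name of every `interface <name>` stanza in an frr.conf text."""
    return re.findall(r"^interface\s+(\S+)", frr_conf, flags=re.MULTILINE)


def ifupdown_interfaces(interfaces_text: str) -> set[str]:
    """Return the name of every `iface <name> ...` line in an interfaces file."""
    return set(re.findall(r"^\s*iface\s+(\S+)", interfaces_text, flags=re.MULTILINE))


def unknown_frr_interfaces(frr_conf: str, interfaces_text: str) -> list[str]:
    """List frr.conf interface names that have no matching iface declaration."""
    known = ifupdown_interfaces(interfaces_text)
    return [name for name in frr_interfaces(frr_conf) if name not in known]


if __name__ == "__main__":
    # Shortened copies of the configs from this thread (assumption: on a real
    # host you would read /etc/frr/frr.conf and /etc/network/interfaces instead).
    frr_conf = """\
interface lo
 ip router openfabric 1
interface no1np0
 ip router openfabric 1
interface eno2np1
 ip router openfabric 1
"""
    interfaces_text = """\
auto lo
iface lo inet loopback
auto eno1np0
iface eno1np0 inet static
auto eno2np1
iface eno2np1 inet manual
"""
    # Any name printed here is used by FRR but not declared for ifupdown.
    print(unknown_frr_interfaces(frr_conf, interfaces_text))
```

With the shortened configs above, the only name reported is the misspelled no1np0. An empty list would mean every interface named in frr.conf is also declared in the interfaces file.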
 