Best practices for network config: BCM NIC and Cisco switch

Oct 8, 2024
Hello,
are there any best practices/hints available for an optimal network setup with Ceph (block storage)
for servers with the following cards?

1 x BCM 57508 (2-port 100 Gb), used for Ceph
2 x BCM 57504 (4-port 100 Gb)

ToR switches: 2 x Cisco Nexus 9364C-GX.

We want to use bonding with LACP (802.3ad), e.g. 2 x 100 Gb ports spread across the two switches (which, as we understand it, needs vPC on the Nexus pair).



Since there are a lot of parameters that can be tuned, some hints/best practices would help, as this is not an exotic hardware combination (Broadcom NICs, Cisco switches). Here is what we have configured so far:


=== 1 Set some values in the NIC BIOS ===

flow offload enabled, perf profile RoCE, Link FEC CL91, NIC RDMA mode enabled
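These settings can be cross-checked from Linux once the driver is loaded, e.g. (assuming ens3f0np0 as one of the ports, and rdma-core installed for the last command):

# show the FEC mode currently in effect (should reflect the CL91/RS setting)
ethtool --show-fec ens3f0np0
# show pause frame / flow-control state
ethtool -a ens3f0np0
# list RDMA devices if NIC RDMA mode is active
rdma link show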

=== 2 Applied the Broadcom Linux driver ===
Installed niccli and set:

setoption -name firmware_link_speed_d0 -value 6 -scope 0
setoption -name firmware_link_speed_d0 -value 6 -scope 1
setoption -name firmware_link_speed_d3 -value 6 -scope 0
setoption -name firmware_link_speed_d3 -value 6 -scope 1
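A quick sanity check that the forced 100 Gb link speed sticks after a reboot (port name assumed as above):

# negotiated speed and link state as seen by the kernel
ethtool ens3f0np0 | grep -Ei 'speed|duplex|link detected'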

=== 3 Interfaces config: set flow control and MTU size ===

auto ens3f0np0
iface ens3f0np0 inet manual
mtu 9216
post-up ethtool -G $IFACE rx 2047 tx 2047
post-up ethtool -A $IFACE rx on tx on

auto ens3f1np1
iface ens3f1np1 inet manual
mtu 9216
post-up ethtool -G $IFACE rx 2047 tx 2047
post-up ethtool -A $IFACE rx on tx on

auto bond0
iface bond0 inet static
address 172.16.10.10/24
bond-slaves ens3f0np0 ens3f1np1
bond-miimon 100
bond-mode 802.3ad
bond-xmit-hash-policy layer3+4
mtu 9216
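
Once the interfaces are up, the ring buffer / flow-control settings and the LACP negotiation can be verified roughly like this (interface names as above):

# ring buffers and pause settings actually applied on a slave
ethtool -g ens3f0np0
ethtool -a ens3f0np0
# 802.3ad state: aggregator ID, partner MAC and port state of both slaves
cat /proc/net/bonding/bond0
# the bond should report the aggregated speed (200000Mb/s with both links up)
ethtool bond0 | grep -i speed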


=== 4 Added sysctl settings in /etc/sysctl.d/40-bcm57508.conf ===

# allow TCP with buffers up to 2GB (max allowed in Linux is 2GB-1)
net.core.rmem_max=2147483647
net.core.wmem_max=2147483647
# increase TCP autotuning buffer limits
net.ipv4.tcp_rmem=4096 131072 1073741824
net.ipv4.tcp_wmem=4096 16384 1073741824
# recommended for hosts with jumbo frames enabled
net.ipv4.tcp_mtu_probing=1
# recommended to enable 'fair queueing'
net.core.default_qdisc=fq
# need to increase this to use MSG_ZEROCOPY
net.core.optmem_max=1048576
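
The drop-in is picked up at boot; to apply and spot-check it immediately:

# reload all sysctl drop-ins, including 40-bcm57508.conf
sysctl --system
# verify a few of the values
sysctl net.core.rmem_max net.ipv4.tcp_rmem net.core.default_qdisc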

The iperf runs between the interfaces (iperf -P 8 -t 60) still show some variation in the results:


LOG.1:[SUM] 0.0000-60.0001 sec 1.30 TBytes 191 Gbits/sec
LOG.10:[SUM] 0.0000-60.0021 sec 1.32 TBytes 193 Gbits/sec
LOG.2:[SUM] 0.0000-60.0001 sec 1.33 TBytes 195 Gbits/sec
LOG.3:[SUM] 0.0000-60.0045 sec 1.25 TBytes 183 Gbits/sec
LOG.4:[SUM] 0.0000-60.0062 sec 1.26 TBytes 185 Gbits/sec
LOG.5:[SUM] 0.0000-60.0023 sec 1.30 TBytes 191 Gbits/sec
LOG.6:[SUM] 0.0000-60.0026 sec 1.20 TBytes 176 Gbits/sec
LOG.7:[SUM] 0.0000-60.0030 sec 1.30 TBytes 191 Gbits/sec
LOG.8:[SUM] 0.0000-60.0001 sec 1.09 TBytes 159 Gbits/sec
LOG.9:[SUM] 0.0000-60.0000 sec 1.32 TBytes 193 Gbits/sec
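
Some of that variation is probably expected: with layer3+4 hashing every single TCP stream stays on one slave, so the spread over both links depends on how the flows hash. A rough way to load both links more evenly is to run several client processes against different ports (sketch with iperf3, the server address 172.16.10.11 is a placeholder):

# on the target host: start several iperf3 servers, one per port
for p in 5201 5202 5203 5204; do iperf3 -s -p $p -D; done
# on this host: drive them in parallel so the layer3+4 hash spreads flows over both slaves
for p in 5201 5202 5203 5204; do iperf3 -c 172.16.10.11 -p $p -P 4 -t 60 --logfile iperf_$p.log & done
wait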




Thanks and best regards
