Should I Enable Hardware Offloading on ConnectX-6 Lx NICs for a Ceph Cluster on Proxmox VE?

devaux

Active Member
Feb 3, 2024
Hi everyone,
I’m running a Ceph cluster on Proxmox VE and considering enabling hardware offloading on my Mellanox ConnectX-6 Lx NICs for both the Public and Cluster networks. My setup includes high-performance NVMe OSDs, and I’m using 25 Gbit/s links with MTU 9000 (Jumbo Frames) configured consistently across nodes and switches. The goal is to optimize network throughput and reduce CPU load for Ceph’s replication and heartbeat traffic.
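For context, the relevant bits of my setup look roughly like this (interface names, IPs, and subnets below are placeholders, not my exact values): the two 25 GbE ports carry the Ceph public and cluster networks with MTU 9000 set in /etc/network/interfaces, and ceph.conf points public_network and cluster_network at those subnets.

```
# /etc/network/interfaces (excerpt) - names and addresses are placeholders
auto enp65s0f0np0
iface enp65s0f0np0 inet static
        address 10.10.10.11/24
        mtu 9000
# Ceph public network, 25 GbE

auto enp65s0f0np1
iface enp65s0f0np1 inet static
        address 10.10.20.11/24
        mtu 9000
# Ceph cluster network, 25 GbE

# /etc/ceph/ceph.conf (excerpt)
[global]
    public_network  = 10.10.10.0/24
    cluster_network = 10.10.20.0/24
```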
My questions are:
  1. Does enabling hardware offloading (e.g., TSO, GSO, RX/TX checksum offload, or LRO) make sense for a Ceph cluster on Proxmox VE with ConnectX-6 Lx NICs? Are there specific offload features I should prioritize or avoid (e.g., LRO with LACP)? (See the ethtool sketch after this list for how I check these.)
  2. Have you experienced any compatibility issues or performance drawbacks with offloading enabled on ConnectX-6 Lx NICs in a similar setup?
  3. Are there specific driver settings or Ceph network configurations I should consider to maximize performance while ensuring stability?
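Regarding question 1, this is roughly how I check and would toggle the individual offloads with ethtool (the interface name is again a placeholder); nothing here is persistent yet:

```
# Show the current offload state on a Ceph-facing port (placeholder name)
ethtool -k enp65s0f0np0 | grep -E 'segmentation|receive|checksum'

# Example: keep TSO/GSO and checksum offloads on, force LRO off
ethtool -K enp65s0f0np0 tso on gso on rx on tx on lro off
```

If toggling these per interface turns out to be the right approach, I assume the usual way to persist it on Proxmox is a post-up line for the interface in /etc/network/interfaces, but I'd like to confirm that before relying on it.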
Any insights, experiences, or recommendations for tuning these NICs with Ceph and Proxmox would be greatly appreciated. Thanks!