Hi everyone,
I’m running a Ceph cluster on Proxmox VE and considering enabling hardware offloading on my Mellanox ConnectX-6 Lx NICs for both the Public and Cluster networks. My setup includes high-performance NVMe OSDs, and I’m using 25 Gbit/s links with MTU 9000 (Jumbo Frames) configured consistently across nodes and switches. The goal is to optimize network throughput and reduce CPU load for Ceph’s replication and heartbeat traffic.
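In case it helps, this is roughly how I'd verify the current MTU and offload state on each node before changing anything. It's only a small Python sketch under my assumptions (ethtool and iproute2 installed; the interface names are placeholders for my 25 GbE ports, not my actual device names):

```python
#!/usr/bin/env python3
"""Quick check of MTU and offload state on a Ceph node.
Assumes ethtool and iproute2 are installed; interface names are placeholders."""
import subprocess

# Placeholder names for the 25 GbE ports carrying Ceph public/cluster traffic.
INTERFACES = ["enp65s0f0np0", "enp65s0f1np1"]

# Offload features of interest, using ethtool -k naming.
FEATURES = [
    "tcp-segmentation-offload",      # TSO
    "generic-segmentation-offload",  # GSO
    "generic-receive-offload",       # GRO
    "large-receive-offload",         # LRO
    "rx-checksumming",
    "tx-checksumming",
]

def check_interface(iface: str) -> None:
    # MTU as reported by iproute2 ("... mtu 9000 qdisc ...").
    link = subprocess.run(
        ["ip", "-o", "link", "show", iface],
        capture_output=True, text=True, check=True,
    ).stdout
    mtu = link.split("mtu")[1].split()[0]
    print(f"{iface}: mtu={mtu}")

    # Offload state as reported by ethtool -k.
    out = subprocess.run(
        ["ethtool", "-k", iface],
        capture_output=True, text=True, check=True,
    ).stdout
    for line in out.splitlines():
        name = line.split(":")[0].strip()
        if name in FEATURES:
            print(f"  {line.strip()}")

if __name__ == "__main__":
    for iface in INTERFACES:
        check_interface(iface)
```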
My questions are:
- Does enabling hardware offloading (e.g., TSO, GSO, RX/TX Checksum Offload, or LRO) make sense for a Ceph cluster on Proxmox VE with ConnectX-6 Lx NICs? Are there specific offloading features I should prioritize or avoid (e.g., LRO with LACP)? The sketch after this list shows the kind of offload profile I'd test.
- Have you experienced any compatibility issues or performance drawbacks with offloading enabled on ConnectX-6 Lx NICs in a similar setup?
- Are there specific driver settings or Ceph network configurations I should consider to maximize performance while ensuring stability?
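For context on the first and last questions, this is the kind of per-node toggle I'd apply before re-running benchmarks. It's a sketch under my assumptions (ethtool -K as the knob, placeholder interface names, and a profile that keeps LRO off because that's the feature I'm least sure about with LACP), not a validated configuration:

```python
#!/usr/bin/env python3
"""Sketch: apply a candidate offload profile per node before benchmarking.
Interface names and the on/off choices are assumptions, not a validated config."""
import subprocess

INTERFACES = ["enp65s0f0np0", "enp65s0f1np1"]  # placeholder 25 GbE port names

# Candidate profile: segmentation and checksum offloads on, LRO off
# (LRO is the feature I'm most unsure about with bonding/LACP in the path).
OFFLOAD_PROFILE = {
    "tso": "on",   # TCP segmentation offload
    "gso": "on",   # generic segmentation offload
    "gro": "on",   # generic receive offload
    "rx":  "on",   # RX checksum offload
    "tx":  "on",   # TX checksum offload
    "lro": "off",  # large receive offload
}

def apply_profile(iface: str) -> None:
    # ethtool -K <iface> <feature> <on|off> ... applies settings non-persistently;
    # they would still need to be wired into /etc/network/interfaces to survive reboots.
    cmd = ["ethtool", "-K", iface]
    for feature, state in OFFLOAD_PROFILE.items():
        cmd += [feature, state]
    subprocess.run(cmd, check=True)
    print(f"applied offload profile to {iface}")

if __name__ == "__main__":
    for iface in INTERFACES:
        apply_profile(iface)
```

If anyone has a better-tested combination for ConnectX-6 Lx on 25 GbE with jumbo frames, I'd rather start from that than from my guesses above.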