Proxmox with ceph and discards on ports

drmartins

Hello,
I just need to confirm my assumption.
We have 8 Proxmox nodes with Ceph. It is a small Ceph environment with 16 NVMe disks. The servers are connected to an Arista DCS-7060CX-32S by 2x 25 Gb DAC cables in an LACP bond.
Proxmox is on the latest 6.x version (6.4-13) and Ceph is on the latest 15.x version (15.2.15).
I can see a lot of discards in Grafana. I am posting a screenshot. All ports look more or less the same.
My question is: can these discards be harmful to Ceph/Proxmox (I see retransmits in the corosync log) and/or to applications that need a stable network (etcd in Kubernetes, etc.)?

I am pretty sure that we have unsuitable switches, but I would like some confirmation.

Code:
Feb 16 12:58:22 srv1 corosync[5241]:   [TOTEM ] Retransmit List: b4447f
Feb 16 14:15:48 srv1 corosync[5241]:   [TOTEM ] Retransmit List: b5a7ab
Feb 16 14:17:47 srv1 corosync[5241]:   [TOTEM ] Retransmit List: b5b0c3
Feb 16 14:27:50 srv1 corosync[5241]:   [TOTEM ] Retransmit List: b5defd
Feb 16 14:30:19 srv1 corosync[5241]:   [TOTEM ] Retransmit List: b5ea62
[Attached screenshot: Snímek obrazovky 2022-02-16 v 14.30.56.png]
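
To put a number on how often the retransmits occur, the log lines above could be tallied per hour with something like the following. This is only a sketch: the regular expression assumes the syslog format shown in the excerpt, and the function name `count_retransmits_per_hour` is made up for illustration, not part of corosync or Proxmox.

```python
import re
from collections import Counter

# Matches syslog-style lines such as:
#   Feb 16 12:58:22 srv1 corosync[5241]:   [TOTEM ] Retransmit List: b4447f
# (format assumed from the excerpt in this thread)
RETRANSMIT_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<hour>\d{2}):\d{2}:\d{2}\s+"
    r"(?P<host>\S+)\s+corosync\[\d+\]:\s+\[TOTEM\s*\]\s+Retransmit List:"
)


def count_retransmits_per_hour(lines):
    """Return a Counter keyed by (host, 'Mon DD HH:00') for retransmit lines."""
    counts = Counter()
    for line in lines:
        m = RETRANSMIT_RE.match(line)
        if m:
            hour_bucket = f"{m['month']} {int(m['day']):02d} {m['hour']}:00"
            counts[(m["host"], hour_bucket)] += 1
    return counts


# The five log lines quoted in the post, used here as sample input.
SAMPLE = """\
Feb 16 12:58:22 srv1 corosync[5241]:   [TOTEM ] Retransmit List: b4447f
Feb 16 14:15:48 srv1 corosync[5241]:   [TOTEM ] Retransmit List: b5a7ab
Feb 16 14:17:47 srv1 corosync[5241]:   [TOTEM ] Retransmit List: b5b0c3
Feb 16 14:27:50 srv1 corosync[5241]:   [TOTEM ] Retransmit List: b5defd
Feb 16 14:30:19 srv1 corosync[5241]:   [TOTEM ] Retransmit List: b5ea62
""".splitlines()

if __name__ == "__main__":
    for (host, hour), n in sorted(count_retransmits_per_hour(SAMPLE).items()):
        print(f"{host} {hour} -> {n} retransmit event(s)")
```

Lining up the hourly counts with the discard graph from the switch would show whether the two actually happen at the same times.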
 
Hi,
It's really strange that you have discards with such low bandwidth. (I just looked at my Mellanox switch; I have 0 discards at 7 Gbit/s.)

I also have some Arista switches in production; they are quite good.

What is your NIC model?
 
We use Mellanox ConnectX-5 cards. There are no problems in the system interface stats.
We also know that the switches are not suitable for network storage like Ceph because of the small buffer (16 MB) for all ports.
All I need is confirmation from someone who understands networking that these discards are a problem (I am sure they are, but my boss wants some assurance from a network engineer).
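
For anyone who wants to double-check the host side, the kernel's per-interface counters under `/sys/class/net/<iface>/statistics/` can be sampled twice and compared. The following is a rough sketch, not a definitive tool: the interface name is whatever you pass on the command line (for example `bond0`, which is an assumption about the bond name here), and the helper names are hypothetical.

```python
import sys
import time
from pathlib import Path

# Standard Linux sysfs counter files; any that are missing are simply skipped.
COUNTERS = ("rx_dropped", "tx_dropped", "rx_errors", "tx_errors")


def read_drop_counters(iface, base="/sys/class/net"):
    """Read drop/error counters for one interface; returns {} if none exist."""
    stats_dir = Path(base) / iface / "statistics"
    result = {}
    for name in COUNTERS:
        path = stats_dir / name
        if path.is_file():
            result[name] = int(path.read_text().strip())
    return result


def delta(before, after):
    """Per-counter increase between two samples."""
    return {k: after[k] - before[k] for k in after if k in before}


# Usage (interface name is an assumption): python3 drops.py bond0
if __name__ == "__main__" and len(sys.argv) > 1:
    iface = sys.argv[1]
    first = read_drop_counters(iface)
    time.sleep(10)  # sampling interval in seconds, chosen arbitrarily
    second = read_drop_counters(iface)
    print(iface, delta(first, second))
```

If these deltas stay at zero on every node while the switch ports keep counting discards, that would suggest the packets are being dropped in the switch and not on the hosts.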
 
If you already have corosync retransmits, that's already bad. (Corosync is very sensitive to latency. Don't use HA if you already have retransmits.)

It's quite possible that you have Ceph latency spikes too.

Note that my Mellanox switches also have a 16 MB shared buffer, and I have 0 retransmits or discards.
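
As a back-of-the-envelope illustration of why discards can show up even when average bandwidth is low, the time a 16 MB buffer can absorb a short burst can be estimated as below. This is a sketch under loud assumptions: 16 MB is taken as 16,000,000 bytes, the whole shared buffer is assumed to be available to a single congested port (which a real switch may not allow), and the sender counts are hypothetical and not measured from this cluster.

```python
def buffer_fill_time_ms(buffer_bytes, ingress_gbps, egress_gbps):
    """Milliseconds until the buffer is full when ingress exceeds egress."""
    excess_bps = (ingress_gbps - egress_gbps) * 1e9
    if excess_bps <= 0:
        return float("inf")  # the port drains at least as fast as it fills
    return buffer_bytes * 8 / excess_bps * 1e3


# Assumption: decimal megabytes, entire shared buffer usable by one port.
BUFFER_BYTES = 16 * 1000 * 1000

if __name__ == "__main__":
    # Hypothetical bursts: N senders at 25 Gb/s each towards one 25 Gb/s port.
    for senders in (2, 4, 7):
        t = buffer_fill_time_ms(BUFFER_BYTES, senders * 25, 25)
        print(f"{senders} senders -> buffer full after about {t:.2f} ms")
```

With two senders bursting at line rate towards one 25 Gb/s port, the estimate comes out at roughly 5 ms, and less with more senders. Bursts that short may not be visible in a Grafana graph that averages over many seconds.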
 
