[SOLVED] RX discards on bond with mtu 9000

Thomas Hukkelberg

Hi, we are seeing some weird RX discards on a few of our Proxmox nodes after we recently switched from single to bonded interfaces for the VM bridges, and we can't figure out why. Since we use CEPH, we also need access to the CEPH cluster on the same bond, both for VMs/CTs and for the hypervisor itself -- hence we have configured access to the CEPH public network on a separate VLAN and vmbr. The NICs are Mellanox ConnectX-3 and the switch is an Arista 7050QX. Packets are discarded at a constant rate of ~110 pps on the vmbr on all nodes, and from time to time, when certain VMs are heavily accessed, the discards/errors peak at around 25k pps.

Can anyone help to pinpoint the problem?
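
For reference, the rates above come from the standard interface counters on the nodes; one way to watch them over time (using the interface names from the config below) is something like:

Code:
# Watch the RX "errors" and "dropped" columns on the bridge and the bond
# (interface names are from our config below; adjust to yours)
watch -d -n 10 "ip -s link show dev vmbr0; ip -s link show dev bond0"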


Proxmox node network configuration
Code:
auto enp1s0
iface enp1s0 inet static
    address 10.40.24.107/22
    gateway 10.40.24.1
# 1GbE proxmox cluster/corosync


auto enp4s0
iface enp4s0 inet manual
# 40GbE lag member A


auto enp4s0d1
iface enp4s0d1 inet manual
# 40GbE lag member B


auto bond0
iface bond0 inet manual
    bond-slaves enp4s0 enp4s0d1
    bond-miimon 100
    bond-mode 802.3ad
    bond-xmit-hash-policy layer3+4
    mtu 9000
#80GbE bond for VMs/CTs and CEPH


auto bond0.4028
iface bond0.4028 inet manual
    mtu 8244
    vlan-id 4028
#vlan for CEPH public net


auto vmbr0
iface vmbr0 inet static
    address 10.40.20.107/22
    bridge-ports bond0
    bridge-stp off
    bridge-fd 0
    mtu 9000
#bridge for VMs/CTs


auto vmbr1
iface vmbr1 inet static
    address 10.40.28.107/22
    bridge-ports bond0.4028
    bridge-stp off
    bridge-fd 0
    mtu 8244
#bridge for CEPH (both for HV and VMs/CTs)
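
For completeness: jumbo frames on the bond and on the CEPH VLAN can be sanity-checked end to end with a don't-fragment ping. Just a sketch -- 10.40.20.108 and 10.40.28.108 stand in for a neighbouring node, and the payload size is the MTU minus 28 bytes of IP/ICMP headers:

Code:
# vmbr0 / bond0 at mtu 9000 -> 9000 - 28 = 8972 byte ICMP payload
ping -M do -s 8972 -c 3 10.40.20.108
# vmbr1 / bond0.4028 at mtu 8244 -> 8244 - 28 = 8216 byte ICMP payload
ping -M do -s 8216 -c 3 10.40.28.108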


Arista 7050QX switch config:
Code:
interface Ethernet3/1
   description hk-proxnode-07-LAG-member
   mtu 9214
   flowcontrol send on
   flowcontrol receive on
   speed forced 40gfull
   channel-group 15 mode active
!
interface Ethernet15/1
   description hk-proxnode-07-LAG-member
   mtu 9214
   flowcontrol send on
   flowcontrol receive on
   speed forced 40gfull
   channel-group 15 mode active
!
interface Port-Channel15
   description hk-proxnode-07-bond
   mtu 9214
   switchport trunk native vlan 4020
   switchport trunk allowed vlan 2-4080
   switchport mode trunk
!
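
On the switch side, the per-member counters and LACP state can be checked with something along these lines (EOS commands from memory; the exact form and output may differ per release):

Code:
show interfaces Ethernet3/1 counters errors
show interfaces Ethernet15/1 counters errors
show port-channel summary
show lacp neighbor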

See attached screenshot of discards/errors on ports.

Attachments

  • proxmox-bond-discards+errors.png (292.8 KB)
In my experience, RX errors indicate a physical-layer problem, e.g. a bad cable, DAC, or fiber jumper between the NIC and the switch. RX discards, on the other hand, usually indicate a misconfiguration, e.g. VLAN tags sent to a port that isn't expecting them.

In the attachment, that appears to be Observium showing errors on enp4s0d1, which is part of bond0. If you click on enp4s0d1, it should open the graphs for that NIC, and at the bottom you'll see an "Errors" graph with both errors and discards. If those rates are equal, then they are really just RX errors, so look at the physical layer.
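
On the Linux side, the two counters can be compared directly, e.g. (just a sketch -- the ethtool statistic names depend on the NIC driver, mlx4 in this case):

Code:
# extended RX error breakdown (errors vs. dropped/missed/overrun)
ip -s -s link show dev enp4s0d1
# per-driver counters; look for rx_*_errors vs. rx_*_dropped/discard
ethtool -S enp4s0d1 | grep -iE 'err|drop|disc'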

Hope that helps.
 
Thanks for pointing out the most obvious thing this could be -- and guess what, it was a flaky NIC! After replacing it with a spare NIC there were no more errors or discards :)
 
