Error I40E_AQ_RC_ENOSPC, forcing overflow promiscuous on PF

Hi!

Fresh install on Supermicro X11DPi-N(T) using Proxmox v6.1.

These messages are looping through the log and saturating one core:
[ 911.977646] i40e 0000:60:00.1: Error I40E_AQ_RC_ENOSPC, forcing overflow promiscuous on PF
[ 955.642636] i40e 0000:60:00.1: Error I40E_AQ_RC_ENOSPC adding RX filters on PF, promiscuous mode forced on

The interesting part is that the PCI address belongs to one of the two links in a bond, and that port is not even connected (NO-CARRIER / link down).

After about 20 minutes the logging stopped and the load went back to normal.

This bug report relates the problem to VLANs:
https://sourceforge.net/p/e1000/bugs/575/

Indeed, I am using VLANs, actually QinQ.

I was unable to find any settings to adjust. Any ideas?

Kind regards
Kevin
 
I had this error as well. Based on the bug report, I went ahead and restricted the VLANs to only the ones I needed, and the error then went away. I have the Supermicro X11SPM-TF motherboard, so I would agree with the bug report that it has to do with the X722's hardware VLAN filter capacity.

Code:
lspci | egrep -i --color 'network|ethernet'
b5:00.0 Ethernet controller: Intel Corporation Ethernet Connection X722 for 10GBASE-T (rev 08)
b5:00.1 Ethernet controller: Intel Corporation Ethernet Connection X722 for 10GBASE-T (rev 08)

Code:
ethtool -k eno1
Features for eno1:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: on
tx-checksum-ip-generic: off [fixed]
tx-checksum-ipv6: on
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: on
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: on
tx-tcp-mangleid-segmentation: off
tx-tcp6-segmentation: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: on
receive-hashing: on
highdma: on
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: on
tx-gre-csum-segmentation: on
tx-ipxip4-segmentation: on
tx-ipxip6-segmentation: on
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: on
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off
hw-tc-offload: on
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: on
tls-hw-tx-offload: off [fixed]
tls-hw-rx-offload: off [fixed]
rx-gro-hw: off [fixed]
tls-hw-record: off [fixed]


Note that I installed ifupdown2. The full config:
Code:
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3
        bond-lacp-rate 1

auto vmbr0
iface vmbr0 inet manual
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 10 20

auto vmbr0.10
iface vmbr0.10 inet static
        address 192.168.10.10/24
        gateway 192.168.10.1
 
What I did notice as odd was that the version of the i40e driver in use wasn't listed on the Intel site.

Code:
2.10.19.82 (latest)
2.10.19.30
2.9.21
2.8.43
2.7.29
2.7.26
2.7.12
2.7.11


Code:
lspci -nk
b5:00.0 0200: 8086:37d2 (rev 08)
        Subsystem: 15d9:37d2
        Kernel driver in use: i40e
        Kernel modules: i40e
b5:00.1 0200: 8086:37d2 (rev 08)
        Subsystem: 15d9:37d2
        Kernel driver in use: i40e
        Kernel modules: i40e

Code:
modinfo i40e | grep -i version
version:        2.8.20-k
srcversion:     61EE7A7018ED9599D5BADBB
vermagic:       5.3.18-3-pve SMP mod_unload modversions
 
I see this as well.

I have changed the network config to only specify the VLANs I am interested in via the bridge-vids option in /etc/network/interfaces, instead of all of them, since that Proxmox bug mentioned that the X710 has limited VLAN filter resources. I haven't seen the issue in a few hours at least; time will tell, I guess.

Code:
auto vmbr0
iface vmbr0 inet static
   address 192.168.1.3/24
   gateway 192.168.1.1
   bridge-ports enp1s0f0
   bridge-stp off
   bridge-fd 0
   bridge-vlan-aware yes
   bridge-vids 42 69

My card also came with the newest firmware already on it, while Proxmox with kernel 5.4 doesn't ship the newest driver. I haven't tried, nor do I want to deal with, installing the latest driver from Intel's website (v3.3.x).

Code:
[ 1.342348] i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 2.8.20-k
[ 1.347833] i40e: Copyright (c) 2013 - 2019 Intel Corporation.
[ 1.418697] i40e 0000:01:00.0: fw 8.13.63341 api 1.12 nvm 8.15 0x80009507 1.1853.0 [8086:1572] [8086:0000]
[ 1.423476] i40e 0000:01:00.0: The driver for the device detected a newer version of the NVM image v1.12 than expected v1.9. Please install the most recent version of the network driver.
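For reference, ethtool -i shows the driver version and firmware/NVM version a port is actually running, which can be compared against the versions listed on Intel's download page (the interface name below is just an example):

Code:
# report driver, driver version and firmware-version for a given port
ethtool -i enp1s0f0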
 
Hi

I have a similar issue, but networking seems to work fine; I just don't like the logs:

Code:
# modinfo i40e | grep -i version
version: 2.8.20-k
srcversion: B1FBC8992487C269EA33AA4
vermagic: 5.4.78-2-pve SMP mod_unload modversions


Code:
Jan 18 13:35:05 jupiter kernel: [ 11.664705] i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 2.8.20-k
Jan 18 13:35:05 jupiter kernel: [ 11.665165] xhci_hcd 0000:00:14.0: xHCI Host Controller
Jan 18 13:35:05 jupiter kernel: [ 11.665176] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 1
Jan 18 13:35:05 jupiter kernel: [ 11.667396] xhci_hcd 0000:00:14.0: hcc params 0x200077c1 hci version 0x100 quirks 0x0000000000009810
Jan 18 13:35:05 jupiter kernel: [ 11.667458] xhci_hcd 0000:00:14.0: cache line size of 32 is not supported


Code:
Jan 18 14:44:27 jupiter kernel: [ 145.060365] i40e 0000:3d:00.0: Error I40E_AQ_RC_ENOSPC, forcing overflow promiscuous on PF
Jan 18 14:44:27 jupiter kernel: [ 145.060584] i40e 0000:3d:00.0: Error I40E_AQ_RC_ENOSPC, forcing overflow promiscuous on PF
Jan 18 14:44:27 jupiter kernel: [ 145.060819] i40e 0000:3d:00.0: Error I40E_AQ_RC_ENOSPC, forcing overflow promiscuous on PF
Jan 18 14:44:27 jupiter kernel: [ 145.061119] i40e 0000:3d:00.0: Error I40E_AQ_RC_ENOSPC adding RX filters on PF, promiscuous mode forced on
Jan 18 14:44:27 jupiter kernel: [ 145.063400] i40e 0000:3d:00.1: Error I40E_AQ_RC_ENOSPC adding RX filters on PF, promiscuous mode forced on

Any thoughts?
 
Just set "offload-rx-vlan-filter off" in your bond config to solve the problem. Hardware is not able to handle it ;)
 
Still see this in dmesg:

Code:
[ 26.648130] Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
[ 26.651855] bonding: bond0 is being created...
[ 26.732023] IPv6: ADDRCONF(NETDEV_CHANGE): eno2: link becomes ready
[ 26.732935] i40e 0000:3d:00.0 eno1: already using mac address a4:bf:01:86:15:5b
[ 26.767216] irq 61: Affinity broken due to vector space exhaustion.
[ 26.768149] irq 62: Affinity broken due to vector space exhaustion.
[ 26.768949] irq 63: Affinity broken due to vector space exhaustion.
[ 26.769599] irq 64: Affinity broken due to vector space exhaustion.
[ 26.770234] irq 65: Affinity broken due to vector space exhaustion.
[ 26.770887] irq 66: Affinity broken due to vector space exhaustion.
[ 26.771563] irq 67: Affinity broken due to vector space exhaustion.
[ 26.772218] irq 68: Affinity broken due to vector space exhaustion.
[ 26.772825] irq 69: Affinity broken due to vector space exhaustion.
[ 26.773419] irq 70: Affinity broken due to vector space exhaustion.
[ 26.774015] irq 71: Affinity broken due to vector space exhaustion.
[ 26.774600] irq 72: Affinity broken due to vector space exhaustion.
[ 26.775228] irq 73: Affinity broken due to vector space exhaustion.
[ 26.775822] irq 74: Affinity broken due to vector space exhaustion.
[ 26.776411] irq 75: Affinity broken due to vector space exhaustion.
[ 26.776985] irq 76: Affinity broken due to vector space exhaustion.
[ 26.777678] irq 93: Affinity broken due to vector space exhaustion.
[ 26.778230] irq 94: Affinity broken due to vector space exhaustion.
[ 26.778758] irq 95: Affinity broken due to vector space exhaustion.
[ 26.779329] irq 96: Affinity broken due to vector space exhaustion.
[ 26.779854] irq 97: Affinity broken due to vector space exhaustion.
[ 26.780366] irq 98: Affinity broken due to vector space exhaustion.
[ 26.780889] irq 99: Affinity broken due to vector space exhaustion.
[ 26.781379] irq 100: Affinity broken due to vector space exhaustion.
[ 26.781867] irq 101: Affinity broken due to vector space exhaustion.
[ 26.782351] irq 102: Affinity broken due to vector space exhaustion.
[ 26.782813] irq 103: Affinity broken due to vector space exhaustion.
[ 26.783329] irq 112: Affinity broken due to vector space exhaustion.
[ 26.783773] irq 113: Affinity broken due to vector space exhaustion.
[ 26.784211] irq 114: Affinity broken due to vector space exhaustion.
[ 26.784664] irq 115: Affinity broken due to vector space exhaustion.
[ 26.785084] irq 116: Affinity broken due to vector space exhaustion.
[ 26.786441] bond0: (slave eno1): Enslaving as a backup interface with an up link
[ 26.851399] i40e 0000:3d:00.1 eno2: set new mac address a4:bf:01:86:15:5b
[ 26.889448] irq 169: Affinity broken due to vector space exhaustion.
[ 26.890979] irq 170: Affinity broken due to vector space exhaustion.
[ 26.891549] irq 171: Affinity broken due to vector space exhaustion.
[ 26.892079] irq 172: Affinity broken due to vector space exhaustion.
[ 26.892568] irq 173: Affinity broken due to vector space exhaustion.
[ 26.893017] irq 174: Affinity broken due to vector space exhaustion.
[ 26.893448] irq 175: Affinity broken due to vector space exhaustion.
[ 26.893871] irq 176: Affinity broken due to vector space exhaustion.
[ 26.894282] irq 177: Affinity broken due to vector space exhaustion.
[ 26.894680] irq 178: Affinity broken due to vector space exhaustion.
[ 26.895099] irq 179: Affinity broken due to vector space exhaustion.
[ 26.895482] irq 180: Affinity broken due to vector space exhaustion.
[ 26.895861] irq 181: Affinity broken due to vector space exhaustion.
[ 26.896235] irq 182: Affinity broken due to vector space exhaustion.
[ 26.896625] irq 183: Affinity broken due to vector space exhaustion.
[ 26.896970] irq 184: Affinity broken due to vector space exhaustion.
[ 26.897439] irq 201: Affinity broken due to vector space exhaustion.
[ 26.897771] irq 202: Affinity broken due to vector space exhaustion.
[ 26.898095] irq 203: Affinity broken due to vector space exhaustion.
[ 26.898411] irq 204: Affinity broken due to vector space exhaustion.
[ 26.898724] irq 205: Affinity broken due to vector space exhaustion.
[ 26.899076] irq 206: Affinity broken due to vector space exhaustion.
[ 26.899393] irq 207: Affinity broken due to vector space exhaustion.
[ 26.899706] irq 208: Affinity broken due to vector space exhaustion.
[ 26.900011] irq 209: Affinity broken due to vector space exhaustion.
[ 26.900335] irq 210: Affinity broken due to vector space exhaustion.
[ 26.900624] irq 211: Affinity broken due to vector space exhaustion.
[ 26.900906] irq 212: Affinity broken due to vector space exhaustion.
[ 26.901181] irq 213: Affinity broken due to vector space exhaustion.
[ 26.901455] irq 214: Affinity broken due to vector space exhaustion.
[ 26.901725] irq 215: Affinity broken due to vector space exhaustion.
[ 26.901988] irq 216: Affinity broken due to vector space exhaustion.
[ 26.903033] bond0: (slave eno2): Enslaving as a backup interface with an up link
[ 26.907560] IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
[ 26.933296] vmbr0: port 1(bond0) entered blocking state
[ 26.933576] vmbr0: port 1(bond0) entered disabled state
[ 26.933940] device bond0 entered promiscuous mode
[ 26.934194] device eno1 entered promiscuous mode
[ 26.934463] device eno2 entered promiscuous mode
[ 26.937361] vmbr0: port 1(bond0) entered blocking state
[ 26.937616] vmbr0: port 1(bond0) entered forwarding state
[ 27.049355] device bond0 left promiscuous mode
[ 27.049594] device eno1 left promiscuous mode
[ 27.049844] device eno2 left promiscuous mode
[ 27.146285] bpfilter: Loaded bpfilter_umh pid 4506
[ 27.146439] Started bpfilter
[ 27.895286] i40e 0000:3d:00.1: eno2 is entering allmulti mode.
[ 27.896363] i40e 0000:3d:00.0: eno1 is entering allmulti mode.


*************

auto lo
iface lo inet loopback

auto eno1
iface eno1 inet manual

auto eno2
iface eno2 inet manual

auto ens803f0
iface ens803f0 inet manual

auto ens803f1
iface ens803f1 inet manual

auto bond0
iface bond0 inet manual
slaves eno1 eno2
bond-miimon 100
bond-mode 802.3ad
bond-xmit-hash-policy layer2+3
offload-rx-vlan-filter off

auto vmbr0
iface vmbr0 inet static
address 192.168.183.10
netmask 255.255.255.0
gateway 192.168.183.1
bridge-ports bond0
bridge-vlan-aware yes
bridge-stp off
bridge-fd 0


************
 
Note that the offload-rx-vlan-filter option in /etc/network/interfaces requires the ethtool package (which is not installed by default).
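A minimal sketch of installing it and then checking that the setting actually took effect on a bond member (the interface name is just an example):

Code:
apt install ethtool
ifreload -a                              # re-apply /etc/network/interfaces with ifupdown2
ethtool -k eno1 | grep rx-vlan-filter    # should now report: rx-vlan-filter: off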
 
Here is my "success" story with this bug.


Getting rid of the logging was good but not the solution. I silenced the logs with:

Code:
auto vmbr0
iface vmbr0 inet manual
#iface vmbr0 inet static
        bridge-ports bond1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 100

Note the bridge-vids. I currently only have one VLAN. The log was clean, but the system was still hanging as the OP reported.

offload-rx-vlan-filter alone did not fix the issue.

I did a firmware upgrade of the NICs and had to turn several offloading features off for the NICs and bonds. No idea if setting them on the bond alone would be sufficient, but since ethtool -k reports the flags differently for the physical NICs and the bond, I guess they have to be set at least on the physical interfaces.

iface enp33s0f1 inet manual
offload-rxvlan off
offload-txvlan off
offload-tso off
offload-rx-vlan-filter off

iface enp33s0f2 inet manual
offload-rxvlan off
offload-txvlan off
offload-tso off
offload-rx-vlan-filter off


auto bond0
iface bond0 inet static
address 172.16.0.11
netmask 255.255.255.0
bond-slaves enp33s0f1 enp33s0f2
bond-miimon 100
bond-mode broadcast
offload-rxvlan off
offload-txvlan off
offload-tso off
offload-rx-vlan-filter off


Now the problem is gone here.
 
Looks like this did not solve my problem after all. Does anyone else have an idea how to fix this permanently?
 
Here are some observations I've made. Maybe others can relate:

After rebooting host1, host3 also loses all of its links according to KNET. These are independent bonds in my case. The links themselves did not go down; I still had pings running over them. This must be a problem at some higher or lower layer.

Code:
Feb  6 22:55:14 pve4 corosync[3599]:   [KNET  ] link: host: 1 link: 0 is down
Feb  6 22:55:14 pve4 corosync[3599]:   [KNET  ] link: host: 1 link: 1 is down
Feb  6 22:55:14 pve4 corosync[3599]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
Feb  6 22:55:14 pve4 corosync[3599]:   [KNET  ] host: host: 1 has no active links
Feb  6 22:55:14 pve4 corosync[3599]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
Feb  6 22:55:14 pve4 corosync[3599]:   [KNET  ] host: host: 1 has no active links
Feb  6 22:55:21 pve4 corosync[3599]:   [KNET  ] link: host: 3 link: 0 is down
Feb  6 22:55:21 pve4 corosync[3599]:   [KNET  ] link: host: 3 link: 1 is down
Feb  6 22:55:21 pve4 corosync[3599]:   [KNET  ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb  6 22:55:21 pve4 corosync[3599]:   [KNET  ] host: host: 3 has no active links
Feb  6 22:55:21 pve4 corosync[3599]:   [KNET  ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb  6 22:55:21 pve4 corosync[3599]:   [KNET  ] host: host: 3 has no active links
Feb  6 22:55:22 pve4 corosync[3599]:   [KNET  ] rx: host: 3 link: 0 is up
Feb  6 22:55:22 pve4 corosync[3599]:   [KNET  ] rx: host: 3 link: 1 is up

Even though pings keep going back and forth, Ceph complains:

Code:
Feb  6 23:05:46 pve4 ceph-osd[362861]: 2022-02-06T23:05:46.906+0100 7fac59de1700 -1 osd.9 12160 heartbeat_check: no reply from 172.16.0.13:6812 osd.12 since back 2022-02-06T22:55:17.509263+0100 front 2022-02-06T22:55:17.509285+0100 (oldest deadline 2022-02-06T22:55:38.609106+0100)
 
This is actually keeping us from configuring a new PVE host successfully. The motherboard is a Supermicro board with a 2nd-gen Xeon Silver and two 10G SFP+ NICs, X710 based.

It fails with VLANs configured.
 
Hi,
Any news on this topic?
I am also working with an Intel Xeon Silver CPU and two 10G SFP+ NICs based on the Intel X710.
 
