Toggling NICs with X710 / X722 and i40e

ewuewu
Renowned Member
Sep 14, 2010
Hamburg
Hello Forum,

I've upgraded an older Proxmox/Ceph cluster from 5 to 6 and finally to 7.2-11. Since then I have been experiencing trouble with the NICs on every node: the links go up and down at varying intervals. The behaviour affects all NICs.

Is there a known solution for this behaviour?

All three nodes consist of identical hardware:
Supermicro X11DPi-NT with two onboard X722 NICs (firmware 3.33)
one X710/X557 quad-port NIC (firmware 9.00)

All nodes are connected via three Netgear XS716T switches.

Extract from the syslog of a server:
Oct 4 10:04:56 pmx2 pmxcfs[2782]: [status] notice: received log
Oct 4 10:06:30 pmx2 kernel: [68959.374679] i40e 0000:18:00.1 enp24s0f1: NIC Link is Down
Oct 4 10:06:35 pmx2 kernel: [68963.812105] i40e 0000:18:00.1 enp24s0f1: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Oct 4 10:06:35 pmx2 pvedaemon[1420393]: <root@pam> successful auth for user 'root@pam'
Oct 4 10:06:39 pmx2 pveproxy[2110817]: Clearing outdated entries from certificate cache
Oct 4 10:08:18 pmx2 systemd[1]: Started Session 35 of user root.
Oct 4 10:08:23 pmx2 kernel: [69071.829129] i40e 0000:18:00.1 enp24s0f1: NIC Link is Down
Oct 4 10:08:28 pmx2 kernel: [69076.475286] i40e 0000:18:00.1 enp24s0f1: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Oct 4 10:09:19 pmx2 kernel: [69128.005873] i40e 0000:18:00.1 enp24s0f1: NIC Link is Down
Oct 4 10:09:24 pmx2 pveproxy[2069365]: Clearing outdated entries from certificate cache
Oct 4 10:09:50 pmx2 kernel: [69159.233385] i40e 0000:18:00.1 enp24s0f1: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
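
To watch the flapping live (for example to correlate it with load or switch events), the kernel log can be followed; a minimal sketch using standard tools already present on a PVE node:

journalctl -kf | grep --line-buffered 'NIC Link is'   # follow link messages as they arrive
ip monitor link                                       # alternative: raw netlink link-state events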

Extract from dmesg of a server:
[Tue Oct 4 08:23:05 2022] i40e 0000:18:00.0 enp24s0f0: NIC Link is Down
[Tue Oct 4 08:23:05 2022] vmbr0: port 1(enp24s0f0) entered disabled state
[Tue Oct 4 08:23:09 2022] i40e 0000:18:00.0 enp24s0f0: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
[Tue Oct 4 08:23:09 2022] vmbr0: port 1(enp24s0f0) entered blocking state
[Tue Oct 4 08:23:09 2022] vmbr0: port 1(enp24s0f0) entered forwarding state
[Tue Oct 4 08:24:26 2022] i40e 0000:18:00.0 enp24s0f0: NIC Link is Down
[Tue Oct 4 08:24:26 2022] vmbr0: port 1(enp24s0f0) entered disabled state
[Tue Oct 4 08:24:30 2022] i40e 0000:18:00.0 enp24s0f0: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
[Tue Oct 4 08:24:30 2022] vmbr0: port 1(enp24s0f0) entered blocking state
[Tue Oct 4 08:24:30 2022] vmbr0: port 1(enp24s0f0) entered forwarding state
[Tue Oct 4 08:25:47 2022] i40e 0000:18:00.0 enp24s0f0: NIC Link is Down
[Tue Oct 4 08:25:47 2022] vmbr0: port 1(enp24s0f0) entered disabled state
[Tue Oct 4 08:25:51 2022] i40e 0000:18:00.0 enp24s0f0: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
[Tue Oct 4 08:25:51 2022] vmbr0: port 1(enp24s0f0) entered blocking state
[Tue Oct 4 08:25:51 2022] vmbr0: port 1(enp24s0f0) entered forwarding state
[Tue Oct 4 08:34:02 2022] i40e 0000:18:00.1 enp24s0f1: NIC Link is Down
[Tue Oct 4 08:34:06 2022] i40e 0000:18:00.1 enp24s0f1: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
[Tue Oct 4 08:48:54 2022] i40e 0000:18:00.1 enp24s0f1: NIC Link is Down
[Tue Oct 4 08:48:59 2022] i40e 0000:18:00.1 enp24s0f1: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
[Tue Oct 4 08:49:50 2022] i40e 0000:18:00.1 enp24s0f1: NIC Link is Down
[Tue Oct 4 08:49:55 2022] i40e 0000:18:00.1 enp24s0f1: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
[Tue Oct 4 08:51:39 2022] i40e 0000:18:00.1 enp24s0f1: NIC Link is Down
[Tue Oct 4 08:52:10 2022] i40e 0000:18:00.1 enp24s0f1: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
[Tue Oct 4 09:55:25 2022] i40e 0000:18:00.1 enp24s0f1: NIC Link is Down
[Tue Oct 4 09:55:29 2022] i40e 0000:18:00.1 enp24s0f1: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
[Tue Oct 4 10:06:31 2022] i40e 0000:18:00.1 enp24s0f1: NIC Link is Down
[Tue Oct 4 10:06:35 2022] i40e 0000:18:00.1 enp24s0f1: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
[Tue Oct 4 10:08:23 2022] i40e 0000:18:00.1 enp24s0f1: NIC Link is Down
[Tue Oct 4 10:08:28 2022] i40e 0000:18:00.1 enp24s0f1: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
[Tue Oct 4 10:09:20 2022] i40e 0000:18:00.1 enp24s0f1: NIC Link is Down
[Tue Oct 4 10:09:51 2022] i40e 0000:18:00.1 enp24s0f1: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
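
To summarise how often each port flaps, the Up/Down messages above can be counted per interface; a quick sketch over the dmesg output:

# extract "interface: NIC Link is Up/Down" and count occurrences per port
dmesg | grep -oE 'en[a-z0-9]+: NIC Link is (Up|Down)' | sort | uniq -c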

lspci -v -s 18:00.0
18:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710/X557-AT 10GBASE-T (rev 01)
Subsystem: Intel Corporation Ethernet Converged Network Adapter X710-T4
Flags: bus master, fast devsel, latency 0, IRQ 34, NUMA node 0
Memory at 384000000000 (64-bit, prefetchable) [size=16M]
Memory at 384004800000 (64-bit, prefetchable) [size=32K]
Expansion ROM at aae80000 [disabled] [size=512K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
Capabilities: [70] MSI-X: Enable+ Count=129 Masked-
Capabilities: [a0] Express Endpoint, MSI 00
Capabilities: [e0] Vital Product Data
Capabilities: [100] Advanced Error Reporting
Capabilities: [140] Device Serial Number 48-54-7a-ff-ff-fe-fd-3c
Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
Capabilities: [1a0] Transaction Processing Hints
Capabilities: [1b0] Access Control Services
Capabilities: [1d0] Secondary PCI Express
Kernel driver in use: i40e
Kernel modules: i40e

lspci -v -s 60:00.0
60:00.0 Ethernet controller: Intel Corporation Ethernet Connection X722 for 10GBASE-T (rev 09)
DeviceName: Intel Ethernet X722 #1
Subsystem: Super Micro Computer Inc Ethernet Connection X722 for 10GBASE-T
Flags: bus master, fast devsel, latency 0, IRQ 31, NUMA node 0
Memory at 38c000000000 (64-bit, prefetchable) [size=16M]
Memory at 38c002800000 (64-bit, prefetchable) [size=32K]
Expansion ROM at c5d00000 [disabled] [size=512K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
Capabilities: [70] MSI-X: Enable+ Count=129 Masked-
Capabilities: [a0] Express Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting
Capabilities: [140] Device Serial Number 80-59-ba-ff-ff-6b-1f-ac
Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
Capabilities: [1a0] Transaction Processing Hints
Capabilities: [1b0] Access Control Services
Kernel driver in use: i40e
Kernel modules: i40e

ethtool -i enp24s0f1
driver: i40e
version: 5.15.53-1-pve
firmware-version: 9.00 0x8000cec0 1.3179.0
expansion-rom-version:
bus-info: 0000:18:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
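
Since the driver reports supports-statistics: yes, the port counters and the i40e private flags may also be worth checking around a flap; a sketch for the same interface (the EEE query only works if the firmware/PHY exposes it):

ethtool -S enp24s0f1 | grep -iE 'error|crc'   # cumulative error counters on the flapping port
ethtool --show-priv-flags enp24s0f1           # i40e flags such as disable-fw-lldp
ethtool --show-eee enp24s0f1                  # EEE state, if supported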

System (pveversion -v):
proxmox-ve: 7.2-1 (running kernel: 5.15.53-1-pve)
pve-manager: 7.2-11 (running version: 7.2-11/b76d3178)
pve-kernel-helper: 7.2-12
pve-kernel-5.15: 7.2-10
pve-kernel-5.4: 6.4-19
pve-kernel-5.15.53-1-pve: 5.15.53-1
pve-kernel-5.4.195-1-pve: 5.4.195-1
pve-kernel-4.15: 5.4-19
pve-kernel-4.15.18-30-pve: 4.15.18-58
pve-kernel-4.15.18-12-pve: 4.15.18-36
ceph: 15.2.17-pve1
ceph-fuse: 15.2.17-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: not correctly installed
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-2
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-3
libpve-storage-perl: 7.2-8
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.6-1
proxmox-backup-file-restore: 2.2.6-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-2
pve-container: 4.2-2
pve-docs: 7.2-2
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-6
pve-firmware: 3.5-2
pve-ha-manager: 3.4.0
pve-i18n: 2.7-2
pve-qemu-kvm: 7.0.0-3
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.5-pve1

/etc/network/interfaces
auto lo
iface lo inet loopback

auto enp24s0f0
iface enp24s0f0 inet manual

auto enp24s0f1
iface enp24s0f1 inet manual

auto enp24s0f2
iface enp24s0f2 inet manual

auto enp24s0f3
iface enp24s0f3 inet manual
mtu 1500

auto eno1
iface eno1 inet manual

auto eno2
iface eno2 inet manual

auto vmbr0
iface vmbr0 inet static
address 192.168.100.242/24
gateway 192.168.100.254
bridge-ports enp24s0f0
bridge-stp off
bridge-fd 0


auto vmbr1
iface vmbr1 inet static
address 192.168.200.242/24
bridge-ports eno2
bridge-stp off
bridge-fd 0
mtu 1500
#ceph

auto vmbr2
iface vmbr2 inet static
address 192.168.50.242/24
bridge-ports eno1
bridge-stp off
bridge-fd 0


auto vmbr3
iface vmbr3 inet static
address 192.168.201.2/24
bridge-ports enp24s0f3
bridge-stp off
bridge-fd 0
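
One known trigger for periodic link flaps on 10GBASE-T is EEE (Energy-Efficient Ethernet) negotiation between NIC and switch, so disabling it on one end is a common test. A hedged sketch for the NIC side; the i40e/X557 combination may not accept this via ethtool, in which case EEE would have to be turned off per port in the switch UI:

# disable Energy-Efficient Ethernet on one port as a test (if the driver supports it)
ethtool --set-eee enp24s0f0 eee off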
 