Anyone with an Intel X710-T network card seeing unexpected reboots on a Dell R540 server?

andrelfm

A few days ago our Dell R540 server rebooted unexpectedly, and nothing was recorded in the syslog. When I contacted Dell support, they said the reboot was caused by an event related to the Intel X710-T network interface, possibly a driver error. Has anyone had a similar situation? Thanks.

SYSLOG
Sep 15 13:18:00 pve-test systemd[1]: Starting Proxmox VE replication runner...
Sep 15 13:18:00 pve-test systemd[1]: pvesr.service: Succeeded.
Sep 15 13:18:00 pve-test systemd[1]: Finished Proxmox VE replication runner.
Sep 15 13:19:00 pve-test systemd[1]: Starting Proxmox VE replication runner...
Sep 15 13:19:00 pve-test systemd[1]: pvesr.service: Succeeded.
Sep 15 13:19:00 pve-test systemd[1]: Finished Proxmox VE replication runner.
Sep 15 13:20:00 pve-test systemd[1]: Starting Proxmox VE replication runner...
Sep 15 13:20:00 pve-test systemd[1]: pvesr.service: Succeeded.
Sep 15 13:20:00 pve-test systemd[1]: Finished Proxmox VE replication runner.
Sep 15 13:21:00 pve-test systemd[1]: Starting Proxmox VE replication runner...
Sep 15 13:21:00 pve-test systemd[1]: pvesr.service: Succeeded.
Sep 15 13:21:00 pve-test systemd[1]: Finished Proxmox VE replication runner.
Sep 15 13:22:00 pve-test systemd[1]: Starting Proxmox VE replication runner...
Sep 15 13:22:00 pve-test systemd[1]: pvesr.service: Succeeded.
Sep 15 13:22:00 pve-test systemd[1]: Finished Proxmox VE replication runner.
Sep 15 13:23:00 pve-test systemd[1]: Starting Proxmox VE replication runner...
Sep 15 13:23:00 pve-test systemd[1]: pvesr.service: Succeeded.
Sep 15 13:23:00 pve-test systemd[1]: Finished Proxmox VE replication runner.
-- Reboot --
Sep 15 13:25:39 pve-test kernel: Linux version 5.11.22-4-pve (build@proxmox) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP PVE 5.11.22-8 (Fri, 27 Aug 2021 11:51:34 +0200) ()
Sep 15 13:25:39 pve-test kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-5.11.22-4-pve root=/dev/mapper/pve-root ro quiet
Sep 15 13:25:39 pve-test kernel: KERNEL supported cpus:
Sep 15 13:25:39 pve-test kernel: Intel GenuineIntel
Sep 15 13:25:39 pve-test kernel: AMD AuthenticAMD
Sep 15 13:25:39 pve-test kernel: Hygon HygonGenuine
Sep 15 13:25:39 pve-test kernel: Centaur CentaurHauls
Sep 15 13:25:39 pve-test kernel: zhaoxin Shanghai
Sep 15 13:25:39 pve-test kernel: x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
Sep 15 13:25:39 pve-test kernel: x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
Sep 15 13:25:39 pve-test kernel: x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
Sep 15 13:25:39 pve-test kernel: x86/fpu: Supporting XSAVE feature 0x008: 'MPX bounds registers'
Sep 15 13:25:39 pve-test kernel: x86/fpu: Supporting XSAVE feature 0x010: 'MPX CSR'
Sep 15 13:25:39 pve-test kernel: x86/fpu: Supporting XSAVE feature 0x020: 'AVX-512 opmask'
Sep 15 13:25:39 pve-test kernel: x86/fpu: Supporting XSAVE feature 0x040: 'AVX-512 Hi256'
Sep 15 13:25:39 pve-test kernel: x86/fpu: Supporting XSAVE feature 0x080: 'AVX-512 ZMM_Hi256'
Sep 15 13:25:39 pve-test kernel: x86/fpu: Supporting XSAVE feature 0x200: 'Protection Keys User registers'
Sep 15 13:25:39 pve-test kernel: x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
Sep 15 13:25:39 pve-test kernel: x86/fpu: xstate_offset[3]: 832, xstate_sizes[3]: 64
Sep 15 13:25:39 pve-test kernel: x86/fpu: xstate_offset[4]: 896, xstate_sizes[4]: 64
Sep 15 13:25:39 pve-test kernel: x86/fpu: xstate_offset[5]: 960, xstate_sizes[5]: 64

PVEVERSION
root@pve-test:~# pveversion --verbose
proxmox-ve: 7.0-2 (running kernel: 5.11.22-4-pve)
pve-manager: 7.0-11 (running version: 7.0-11/63d82f4e)
pve-kernel-5.11: 7.0-7
pve-kernel-helper: 7.0-7
pve-kernel-5.11.22-4-pve: 5.11.22-8
pve-kernel-5.11.22-3-pve: 5.11.22-7
pve-kernel-5.11.22-1-pve: 5.11.22-2
ceph-fuse: 15.2.13-pve1
corosync: 3.1.5-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve1
libproxmox-acme-perl: 1.3.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.0-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-6
libpve-guest-common-perl: 4.0-2
libpve-http-server-perl: 4.0-2
libpve-storage-perl: 7.0-11
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.9-4
lxcfs: 4.0.8-pve2
novnc-pve: 1.2.0-3
proxmox-backup-client: 2.0.9-2
proxmox-backup-file-restore: 2.0.9-2
proxmox-mini-journalreader: 1.2-1
proxmox-widget-toolkit: 3.3-6
pve-cluster: 7.0-3
pve-container: 4.0-9
pve-docs: 7.0-5
pve-edk2-firmware: 3.20200531-1
pve-firewall: 4.2-3
pve-firmware: 3.3-1
pve-ha-manager: 3.3-1
pve-i18n: 2.5-1
pve-qemu-kvm: 6.0.0-4
pve-xtermjs: 4.12.0-1
qemu-server: 7.0-13
smartmontools: 7.2-1
spiceterm: 3.2-2
vncterm: 1.7-1
zfsutils-linux: 2.0.5-pve1

LSPCI
root@pve-test:~# lspci
...
65:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
65:00.1 Ethernet controller: Intel Corporation Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
65:00.2 Ethernet controller: Intel Corporation Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
65:00.3 Ethernet controller: Intel Corporation Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
...

MODINFO
root@pve-test:~# modinfo i40e
filename: /lib/modules/5.11.22-4-pve/kernel/drivers/net/ethernet/intel/i40e/i40e.ko
license: GPL v2
description: Intel(R) Ethernet Connection XL710 Network Driver
author: Intel Corporation, <e1000-devel@lists.sourceforge.net>
srcversion: 3882B75411DC3CC31437953
alias: pci:v00008086d0000158Bsv*sd*bc*sc*i*
alias: pci:v00008086d0000158Asv*sd*bc*sc*i*
alias: pci:v00008086d00000D58sv*sd*bc*sc*i*
alias: pci:v00008086d00000CF8sv*sd*bc*sc*i*
alias: pci:v00008086d00001588sv*sd*bc*sc*i*
alias: pci:v00008086d00001587sv*sd*bc*sc*i*
alias: pci:v00008086d000037D3sv*sd*bc*sc*i*
alias: pci:v00008086d000037D2sv*sd*bc*sc*i*
alias: pci:v00008086d000037D1sv*sd*bc*sc*i*
alias: pci:v00008086d000037D0sv*sd*bc*sc*i*
alias: pci:v00008086d000037CFsv*sd*bc*sc*i*
alias: pci:v00008086d000037CEsv*sd*bc*sc*i*
alias: pci:v00008086d0000104Fsv*sd*bc*sc*i*
alias: pci:v00008086d0000104Esv*sd*bc*sc*i*
alias: pci:v00008086d000015FFsv*sd*bc*sc*i*
alias: pci:v00008086d00001589sv*sd*bc*sc*i*
alias: pci:v00008086d00001586sv*sd*bc*sc*i*
alias: pci:v00008086d00001585sv*sd*bc*sc*i*
alias: pci:v00008086d00001584sv*sd*bc*sc*i*
alias: pci:v00008086d00001583sv*sd*bc*sc*i*
alias: pci:v00008086d00001581sv*sd*bc*sc*i*
alias: pci:v00008086d00001580sv*sd*bc*sc*i*
alias: pci:v00008086d00001574sv*sd*bc*sc*i*
alias: pci:v00008086d00001572sv*sd*bc*sc*i*
depends:
retpoline: Y
intree: Y
name: i40e
vermagic: 5.11.22-4-pve SMP mod_unload modversions
parm: debug:Debug level (0=none,...,16=all), Debug mask (0x8XXXXXXX) (uint)

ETHTOOL
root@pve-test:~# ethtool -i enp101s0f0
driver: i40e
version: 5.11.22-4-pve
firmware-version: 8.15 0x800096c6 20.0.17
expansion-rom-version:
bus-info: 0000:65:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
 
DMESG
root@pve-test:~# dmesg | grep -i i40e
[ 2.467072] i40e: Intel(R) Ethernet Connection XL710 Network Driver
[ 2.467074] i40e: Copyright (c) 2013 - 2019 Intel Corporation.
[ 2.483709] i40e 0000:65:00.0: fw 8.815.63341 api 1.12 nvm 8.15 0x800096c6 20.0.17 [8086:1589] [8086:0003]
[ 2.483715] i40e 0000:65:00.0: The driver for the device detected a newer version of the NVM image v1.12 than expected v1.9. Please install the most recent version of the network driver.
[ 2.747021] i40e 0000:65:00.0: MAC address: 3c:fd:fe:86:75:50
[ 2.747163] i40e 0000:65:00.0: FW LLDP is enabled
[ 2.753569] i40e 0000:65:00.0: PCI-Express: Speed 8.0GT/s Width x8
[ 2.754688] i40e 0000:65:00.0: Features: PF-id[0] VFs: 32 VSIs: 34 QP: 16 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[ 2.775546] i40e 0000:65:00.1: fw 8.815.63341 api 1.12 nvm 8.15 0x800096c6 20.0.17 [8086:1589] [8086:0000]
[ 2.775550] i40e 0000:65:00.1: The driver for the device detected a newer version of the NVM image v1.12 than expected v1.9. Please install the most recent version of the network driver.
[ 3.014528] i40e 0000:65:00.1: MAC address: 3c:fd:fe:86:75:51
[ 3.014765] i40e 0000:65:00.1: FW LLDP is enabled
[ 3.027646] i40e 0000:65:00.1: PCI-Express: Speed 8.0GT/s Width x8
[ 3.029213] i40e 0000:65:00.1: Features: PF-id[1] VFs: 32 VSIs: 34 QP: 16 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[ 3.041389] i40e 0000:65:00.2: fw 8.815.63341 api 1.12 nvm 8.15 0x800096c6 20.0.17 [8086:1589] [8086:0000]
[ 3.041394] i40e 0000:65:00.2: The driver for the device detected a newer version of the NVM image v1.12 than expected v1.9. Please install the most recent version of the network driver.
[ 3.278412] i40e 0000:65:00.2: MAC address: 3c:fd:fe:86:75:52
[ 3.278553] i40e 0000:65:00.2: FW LLDP is enabled
[ 3.358480] i40e 0000:65:00.2: PCI-Express: Speed 8.0GT/s Width x8
[ 3.359602] i40e 0000:65:00.2: Features: PF-id[2] VFs: 32 VSIs: 34 QP: 16 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[ 3.371904] i40e 0000:65:00.3: fw 8.815.63341 api 1.12 nvm 8.15 0x800096c6 20.0.17 [8086:1589] [8086:0000]
[ 3.371908] i40e 0000:65:00.3: The driver for the device detected a newer version of the NVM image v1.12 than expected v1.9. Please install the most recent version of the network driver.
[ 3.607106] i40e 0000:65:00.3: MAC address: 3c:fd:fe:86:75:53
[ 3.607246] i40e 0000:65:00.3: FW LLDP is enabled
[ 3.612037] i40e 0000:65:00.3 eth3: NIC Link is Up, 1000 Mbps Full Duplex, Flow Control: None
[ 3.612781] i40e 0000:65:00.3: PCI-Express: Speed 8.0GT/s Width x8
[ 3.613902] i40e 0000:65:00.3: Features: PF-id[3] VFs: 32 VSIs: 34 QP: 16 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[ 3.615109] i40e 0000:65:00.0 enp101s0f0: renamed from eth1
[ 3.647066] i40e 0000:65:00.2 enp101s0f2: renamed from eth0
[ 3.691202] i40e 0000:65:00.1 enp101s0f1: renamed from eth2
[ 3.723052] i40e 0000:65:00.3 enp101s0f3: renamed from eth3
[ 8.864937] i40e 0000:65:00.2 enp101s0f2: set new mac address b6:49:b3:c7:c3:43
[ 8.872912] i40e 0000:65:00.3 enp101s0f3: set new mac address b6:49:b3:c7:c3:43
[ 8.901358] i40e 0000:65:00.2: entering allmulti mode.
[ 8.904007] i40e 0000:65:00.3: entering allmulti mode.
[ 8.937667] i40e 0000:65:00.0 enp101s0f0: set new mac address 02:75:10:a4:9d:99
[ 8.945268] i40e 0000:65:00.1 enp101s0f1: set new mac address 02:75:10:a4:9d:99

Hi guys, while analyzing the dmesg output I noticed the message asking for a driver update. Could this pending update be causing the unexpected reboot? Could anyone explain how to perform the update? Or would it be better to wait for an update from Proxmox? Thanks.
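To see at a glance which ports report the mismatch, the warning lines can be reduced to the PCI address plus the detected and expected versions. This is only a rough sketch: the function name nvm_mismatch is made up for illustration, and the pattern assumes the warning text is worded exactly as in the dmesg output above.

```shell
# Hypothetical helper: read dmesg lines on stdin and print
# "<pci-address> <detected> <expected>" for each i40e NVM version warning.
# Assumes the message wording shown in the dmesg output above.
nvm_mismatch() {
    sed -n 's/.*i40e \([0-9a-f:.]*\): .*NVM image v\([0-9.]*\) than expected v\([0-9.]*\)\..*/\1 \2 \3/p'
}

# Usage (as root):
#   dmesg | nvm_mismatch
```

For the first port above this would print "0000:65:00.0 1.12 1.9". It only summarizes the warning; it says nothing about whether the mismatch is related to the reboot.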
 
You can compile the latest Intel (i40e) drivers easily under Proxmox. I did it without problems for an X710-DA2, and I got the same message before the driver update. The same goes for the iavf drivers. Note that you may also need to upgrade to the latest NVM firmware (8.40 currently).
 
Is there any way to get the actual Intel driver version Proxmox uses?

For all of my cards I only get the Proxmox version...

Code:
# ethtool -i enp2s0f0
driver: ixgbe
version: 5.11.22-4-pve
firmware-version: 0x800006da, 1.1824.0
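One way to check this, as far as I understand it: the in-tree modules on this kernel seem to carry no version field of their own (the modinfo i40e output earlier in the thread has none), in which case ethtool shows the kernel release instead. A small sketch that looks for the field in modinfo output; the helper name module_version is an assumption for illustration only.

```shell
# Hypothetical helper: read `modinfo <module>` output on stdin and print the
# module's own "version" field, or a note when the module has none.
module_version() {
    awk -F': *' '
        $1 == "version" { print $2; found = 1 }
        END { if (!found) print "no version field" }
    '
}

# Usage:
#   modinfo ixgbe | module_version
#   modinfo i40e  | module_version
```

A module built from Intel's source package should report its own number here (for example 2.16.11 for the i40e package mentioned in this thread), while an in-tree module without the field gives "no version field".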
 
Could you be so kind as to point us to the instructions/package/etc. for the recompile and the NVM firmware?
 

Required packages:
apt install pve-headers build-essential make

Update Drivers:
wget https://downloadmirror.intel.com/24411/eng/i40e-2.16.11.tar.gz
tar -zxvf i40e-2.16.11.tar.gz
cd ./i40e-2.16.11/src/
make
make install

wget https://downloadmirror.intel.com/24693/eng/iavf-4.2.7.tar.gz
tar -zxvf iavf-4.2.7.tar.gz
cd ./iavf-4.2.7/src/
make
make install

Checking update:
reboot
dmesg | grep -i i40e
modinfo i40e
modinfo iavf

Firmware update: done directly from the Dell iDRAC.

Friend, I'm a layman on the subject and I'm still waiting for more news about it, so I can't guarantee that this procedure will solve the problem, but that's what I've achieved so far. If anyone can validate the above information, I'll be thankful.

Attention: I noticed that the driver has to be compiled again after every kernel update.

Reverting the update (in the source folder of each driver):
make uninstall
reboot
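To notice when a kernel update has put the stock driver back, one option is to check whether the module resolved for the running kernel is marked as in-tree. This is a sketch under assumptions: the helper name i40e_rebuild_needed is made up, and it relies on the self-compiled module not carrying the "intree: Y" line that the stock module shows in the modinfo output earlier in the thread.

```shell
# Hypothetical helper: read `modinfo i40e` output on stdin (taken under the
# currently booted kernel). Prints "rebuild" when the module is the in-tree
# one, i.e. the self-compiled driver is not installed for this kernel.
# Assumption: an out-of-tree build has no "intree: Y" line.
i40e_rebuild_needed() {
    if grep -q '^intree: *Y'; then
        echo "rebuild"
    else
        echo "ok"
    fi
}

# Usage after booting into a new kernel:
#   modinfo i40e | i40e_rebuild_needed
```

If it prints "rebuild", repeating the make and make install steps against the new kernel headers should bring the compiled driver back.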
 
Is it necessary to reboot the machine if I compile only the i40e driver?

Would removing the module and loading it again with modprobe work?
 
It depends on whether you are able to remove the module, i.e. whether it's not in use.
But for anything networking-related, you do want to reboot.
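A rough way to look at the "in use" part is the third column of lsmod, which is the use count the kernel checks before it lets a module go. The helper name module_use_count below is made up for illustration; treat it as a sketch, not a safety check.

```shell
# Hypothetical helper: read `lsmod` output on stdin and print the "Used by"
# count for the module named in $1. Returns non-zero if it is not loaded.
module_use_count() {
    awk -v mod="$1" '
        $1 == mod { print $3; found = 1 }
        END { if (!found) exit 1 }
    '
}

# Usage:
#   lsmod | module_use_count i40e
#
# If you do reload instead of rebooting, run both steps as one command from a
# local console or the iDRAC, because every i40e port goes down in between:
#   modprobe -r i40e && modprobe i40e
```

A count of 0 only means the kernel will allow the removal. The interfaces on those ports (and any bridges or bonds on top of them) still drop while the module is unloaded.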