Intel i226-V with MTU 9014 crashes Proxmox on boot

RudyBzh

Hi,

Following up on my previous post, here, where I described the crash and a way to recover:
I managed to isolate the issue and can now reproduce it at will.

It seems to be related to the network driver for the Intel I226-V: igc (I don't know who needs to be notified: Proxmox / Debian / Intel / ...?!)

If I just set "mtu 9014" on my I226-V network interface, Proxmox will not boot anymore and crashes every time, requiring a hard reset and a recovery using the USB installer (as explained in my previous post).
The interface does not even have to be used by Proxmox (it is not part of vmbr0).

The issue is present in both 6.2.16-19-pve and 6.5.11-3-pve.

Output of pveversion -v:

proxmox-ve: 8.0.2 (running kernel: 6.5.11-3-pve)
pve-manager: 8.0.9 (running version: 8.0.9/fd1a0ae1b385cdcd)
pve-kernel-6.2: 8.0.5
proxmox-kernel-helper: 8.0.5
proxmox-kernel-6.5: 6.5.11-3
proxmox-kernel-6.5.11-3-pve: 6.5.11-3
proxmox-kernel-6.2.16-19-pve: 6.2.16-19
proxmox-kernel-6.2: 6.2.16-19
pve-kernel-6.2.16-3-pve: 6.2.16-3
ceph-fuse: 17.2.6-pve1+3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx7
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.0
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.7
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.10
libpve-guest-common-perl: 5.0.5
libpve-http-server-perl: 5.0.5
libpve-rs-perl: 0.8.7
libpve-storage-perl: 8.0.4
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.0.4-1
proxmox-backup-file-restore: 3.0.4-1
proxmox-kernel-helper: 8.0.5
proxmox-mail-forward: 0.2.1
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.1.1
pve-cluster: 8.0.5
pve-container: 5.0.6
pve-docs: 8.0.5
pve-edk2-firmware: 4.2023.08-1
pve-firewall: 5.0.3
pve-firmware: 3.9-1
pve-ha-manager: 4.0.3
pve-i18n: 3.0.7
pve-qemu-kvm: 8.1.2-3
pve-xtermjs: 5.3.0-2
qemu-server: 8.0.8
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.0-pve3

Ethernet controllers (lspci):

04:00.0 Ethernet controller: Aquantia Corp. AQC113CS NBase-T/IEEE 802.3bz Ethernet Controller [AQtion] (rev 03)
05:00.0 Ethernet controller: Intel Corporation Ethernet Controller I226-V (rev 06)

ethtool -i for the I226-V:

driver: igc
version: 6.5.11-3-pve
firmware-version: 2017:888d
expansion-rom-version:
bus-info: 0000:05:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

Working /etc/network/interfaces:

auto lo
iface lo inet loopback

auto eno1
iface eno1 inet manual
mtu 9014
#Interface 10G

auto eno2
iface eno2 inet manual
#Interface 2.5G

auto bond0
iface bond0 inet manual
bond-slaves eno1
bond-miimon 100
bond-mode active-backup
bond-primary eno1
mtu 9014

auto vmbr0
iface vmbr0 inet static
address 192.168.1.4/24
gateway 192.168.1.1
bridge-ports bond0
bridge-stp off
bridge-fd 0
mtu 9014

The only difference from the working configuration above is setting "mtu 9014" on the unused eno2 and rebooting (without touching anything else, such as the MTU on eno1 or bond0). With that single change, the node crashes on boot:

auto lo
iface lo inet loopback

auto eno1
iface eno1 inet manual
mtu 9014
#Interface 10G

auto eno2
iface eno2 inet manual
mtu 9014
#Interface 2.5G

auto bond0
iface bond0 inet manual
bond-slaves eno1
bond-miimon 100
bond-mode active-backup
bond-primary eno1
mtu 9014

auto vmbr0
iface vmbr0 inet static
address 192.168.1.4/24
gateway 192.168.1.1
bridge-ports bond0
bridge-stp off
bridge-fd 0
mtu 9014

I "just" have to remove "mtu 9014" from eno2 and Proxmox boots normally again.
But it takes a while, because I have to rescue via the USB installer, import the zpool (the OS pool), modify the interfaces file, and export the zpool.
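
For reference, the rescue procedure is roughly the following (a sketch; the pool name rpool and the installer's debug shell are assumptions, adjust to your own setup):

# boot the Proxmox VE USB installer and drop to a debug shell
zpool import -f -R /mnt rpool        # import the OS pool under /mnt
nano /mnt/etc/network/interfaces     # remove the offending "mtu 9014" line
zpool export rpool                   # export cleanly so the normal boot can import it again
reboot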

I did not find this specific case on the forum (though there are several threads about this I226 network card... and its driver).

I can live without it, but I think it should be addressed.
Unless, of course, there is something I am doing wrong?
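
By the way, a non-persistent way to test this (assuming the runtime change triggers the same crash as the boot-time one) is to apply the MTU with iproute2 instead of /etc/network/interfaces; nothing is written to disk, so after the hard reset the node comes back with its old configuration:

ip link set dev eno2 mtu 9014     # runtime only, not persisted
ip link show eno2                 # check the MTU actually changed (if the box is still alive)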

Thanks.
 
Hi !

I'm having a very similar issue with i225-V (igc driver).

In fact, I can use jumbo frames, but only if I turn off autonegotiation on the card with "ethtool -s enpxs0 speed 1000 duplex full autoneg off".

With kernel 6.5 even this no longer works, and I had to roll back to 6.2.16-19.
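
To make that ethtool workaround survive reboots, something like a pre-up hook in /etc/network/interfaces should work (a sketch I have not verified; enp1s0 is just an example name, and pre-up is used so the port is forced before the interface and its MTU are brought up):

auto enp1s0
iface enp1s0 inet manual
pre-up ethtool -s enp1s0 speed 1000 duplex full autoneg off
mtu 9014
#Interface 2.5G forced to 1G, autoneg off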
 
Hello all,

Since my first thread, I have moved my network to 2.5 Gb/s and tried to get back to kernel 6.5. Jumbo frames just won't work in that scenario.
I tried kernel 6.2.16-20 with no luck either.

Rolling back to 6.2.16-19 and manually forcing the port with ethtool restores normal behavior.

Some bug was definitely introduced in 6.2.16-20 and above affecting the I226-V chipset.
My kernel bug report is still open and not assigned.
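
For anyone else staying on the older kernel in the meantime, proxmox-boot-tool can pin the known-good version so an upgrade does not switch the default back (adjust the version to whatever you have installed):

proxmox-boot-tool kernel list                  # show installed kernels
proxmox-boot-tool kernel pin 6.2.16-19-pve     # always boot this kernel by default
reboot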
 
Hi all,

just giving an update. I still cannot get jumbo frames running on any kernel above 6.2.16-19. I tried 6.8.4-3 today, with no luck.
 
Hi all,
I faced this issue too. The OS just freezes when a high MTU is set on the network interface. The server's BMC logs "CPU_CATERR" (catastrophic error) when it happens.
The server:
  • CPU - Intel i5-14600k
  • MB - ASUS PRO W680M SE
  • RAM - 196GB DDR5 Kingston Fury
  • PSU - Corsair RM850x
  • NVMe - 2 * WD RED S700 512GiB
  • OS - Proxmox 8.2 (Linux kernel 6.8.4)
I thought this was a problem with my motherboard and wrote to ASUS support; I'm waiting for a response. In the meantime I switched to MTU 1500 for the data subnet with the 2.5 Gbps devices, and that works just fine. But I haven't benchmarked high MTU against MTU 1500, sorry.
 
