ixgbe initialization fails on Proxmox 6.2

devopstales
I have two 10G ports that use the ixgbe driver, but one of them fails to load.

Bash:
# dmesg |grep ixgbe
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.4.41-1-pve root=/dev/mapper/pve-root ro ixgbe.allow_unsupported_sfp=1 ixgbe.allow_unsupported_sfp=1,1 quiet
[    0.922451] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.4.41-1-pve root=/dev/mapper/pve-root ro ixgbe.allow_unsupported_sfp=1 ixgbe.allow_unsupported_sfp=1,1 quiet
[    3.653907] ixgbe: loading out-of-tree module taints kernel.
[    3.653909] ixgbe: loading out-of-tree module taints kernel.
[    3.694915] ixgbe: 0000:04:00.0: ixgbe_check_options: FCoE Offload feature enabled
[    3.694917] ixgbe: allow_unsupported_sfp Enabled
[    3.704986] ixgbe 0000:04:00.0: failed to load because an unsupported SFP+ or QSFP module type was detected.
[    3.705041] ixgbe 0000:04:00.0: Reload the driver after installing a supported module.
[    3.705479] ixgbe: 0000:04:00.1: ixgbe_check_options: FCoE Offload feature enabled
[    3.705479] ixgbe: allow_unsupported_sfp Enabled
[    3.886806] ixgbe 0000:04:00.1: Multiqueue Enabled: Rx Queue count = 24, Tx Queue count = 24 XDP Queue count = 0
[    3.888704] ixgbe 0000:04:00.1: 32.000 Gb/s available PCIe bandwidth (5 GT/s x8 link)
[    3.888788] ixgbe 0000:04:00.1 eth0: MAC: 2, PHY: 14, SFP+: 4, PBA No: 400900-000
[    3.888790] ixgbe 0000:04:00.1: 0c:c4:7a:bb:c1:a9
[    3.888793] ixgbe 0000:04:00.1 eth0: Enabled Features: RxQ: 24 TxQ: 24 FdirHash
[    3.888845] ixgbe 0000:04:00.1 eth0: Intel(R) 10 Gigabit Network Connection
[    3.890010] ixgbe 0000:04:00.1 enp4s0f1: renamed from eth0
[  101.076943] ixgbe 0000:04:00.1: registered PHC device on enp4s0f1
[  101.258225] ixgbe 0000:04:00.1 enp4s0f1: detected SFP+: 4
[  101.398268] ixgbe 0000:04:00.1 enp4s0f1: NIC Link is Up 10 Gbps, Flow Control: RX/TX

As you can see, I tried the allow_unsupported_sfp option. These are my network cards:

Bash:
# lspci |grep Eth
04:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
04:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
07:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
07:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)

I tried upgrading to the latest ixgbe driver from Intel's website, but the problem is the same:
https://downloadcenter.intel.com/do...bit-Ethernet-Network-Connections-Under-Linux-
 
Stoiko Ivanov

Try removing ixgbe.allow_unsupported_sfp=1,1 from the kernel command line - the option is currently passed twice, with two different values.
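
A minimal sketch of how to do that, assuming a GRUB-booted PVE install (systems booting via systemd-boot keep the command line in /etc/kernel/cmdline instead):

Bash:
# drop the second copy of the option from the GRUB defaults
sed -i 's/ ixgbe.allow_unsupported_sfp=1,1//' /etc/default/grub
# verify only a single instance is left
grep GRUB_CMDLINE_LINUX_DEFAULT /etc/default/grub
update-grub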

AFAIR the Intel modules changed a few times how they want this enabled - there was also a difference between the in-tree module (which PVE ships) and the out-of-tree one.
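
If in doubt, modinfo shows which parameters the currently installed module actually accepts - a quick check along these lines:

Bash:
# list the SFP-related parameter the installed module understands
modinfo ixgbe | grep -i allow_unsupported_sfp
# show which module file modprobe would load (in-tree vs. out-of-tree path)
modinfo -n ixgbe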

Try reproducing the issue with the in-tree module (i.e. remove the manually compiled one - see the sketch below).
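
A rough sketch of how to fall back to the in-tree module (the updates/ path below is where the Intel installer usually puts its build - verify with the find first):

Bash:
# locate every copy of the module for the running kernel
find /lib/modules/$(uname -r) -name 'ixgbe.ko*'
# remove the out-of-tree copy (usual Intel install target, confirm the path above)
rm /lib/modules/$(uname -r)/updates/drivers/net/ethernet/intel/ixgbe/ixgbe.ko
depmod -a
# confirm modprobe now resolves to the in-tree module under kernel/
modinfo -n ixgbe
reboot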

I hope this helps!
 
What was the problem with the in-tree driver that went away with the out-of-tree one?
What's the output of `ip link`?
Do the NICs show up correctly if you start the system without SFP+ modules installed (just to verify that it's an SFP issue)?
 
@Stoiko Ivanov The problem is the same with the in-tree driver and the out-of-tree driver. One of the two interfaces does not work.

Bash:
[    3.694915] ixgbe: 0000:04:00.0: ixgbe_check_options: FCoE Offload feature enabled
[    3.694917] ixgbe: allow_unsupported_sfp Enabled
[    3.704986] ixgbe 0000:04:00.0: failed to load because an unsupported SFP+ or QSFP module type was detected.
[    3.705041] ixgbe 0000:04:00.0: Reload the driver after installing a supported module.

This is the output of `ip link`:
Bash:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: rename2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:25:90:da:3a:6c brd ff:ff:ff:ff:ff:ff
3: eno1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:25:90:da:3a:6d brd ff:ff:ff:ff:ff:ff
4: enp4s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmbr0 state UP mode DEFAULT group default qlen 1000
    link/ether 0c:c4:7a:bb:c1:a9 brd ff:ff:ff:ff:ff:ff
5: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 0c:c4:7a:bb:c1:a9 brd ff:ff:ff:ff:ff:ff
6: vmbr1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether aa:9c:e2:e1:b4:23 brd ff:ff:ff:ff:ff:ff

I didn't test it without the Intel SFP+ module, because the machine is in a data center.
 
Seems there's also a problem with the igb NICs:
rename2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:25:90:da:3a:6c brd ff:ff:ff:ff:ff:ff

The leftover rename* names happen sometimes - try upgrading the BIOS.
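
If it helps, udev can tell you more about the interface that is stuck with its temporary name; the eno2 target below is only a hypothetical example:

Bash:
# show what udev recorded for the interface that kept its temporary name
udevadm info /sys/class/net/rename2
# one-shot workaround: rename the link by hand while it is down
# (eno2 is just an example - pick whatever fits your naming scheme)
ip link set rename2 name eno2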

I would probably try the following (see the sketch after this list):
* removing the kernel module (rmmod ixgbe)
* inserting the module on the CLI and providing 'allow_unsupported_sfp=1' or 'allow_unsupported_sfp=1,1' (once each)
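
A rough sketch of how that could look - note that unloading ixgbe also takes down the currently working enp4s0f1 link, so run this from a console that doesn't depend on it:

Bash:
# unload the driver (this also drops the working 10G port!)
rmmod ixgbe
# first attempt: a single global value
modprobe ixgbe allow_unsupported_sfp=1
dmesg | grep ixgbe | tail
# second attempt: one value per port
rmmod ixgbe
modprobe ixgbe allow_unsupported_sfp=1,1
dmesg | grep ixgbe | tail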

Finally - if possible, try to get some supported SFPs. I've had the module option stop working from one kernel to the next (while a box was in production) - it was not worth the trouble for the savings.
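
And if the modprobe experiment above does turn up a working value, it can be made persistent - a minimal sketch, assuming the =1,1 form is the one that works:

Bash:
# persist the module option (use whichever value worked in the test above)
echo 'options ixgbe allow_unsupported_sfp=1,1' > /etc/modprobe.d/ixgbe.conf
# make sure the option also lands in the initramfs, where the module is loaded
update-initramfs -u -k all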