[SOLVED] AER: Multiple Uncorrectable (Non-Fatal) error message received AOC-STGN-i2S/X520

Etrai

Oct 5, 2025
Hello PVE community!

New to the forum, but I've been running PVE in my homelab for a while now, since the beginning of the year I think.

Until today I had a much older (circa 2018) Windows Server 2016 machine running Hyper-V and a bunch of VMs, which have now (finally) been migrated to the PVE instance, currently running 8.4.1. The former Windows host (i7-8700K, Asus Prime H370M-Plus) has been moved into a new case and outfitted with better OS/VM storage and, of course, PVE. The issue I'm having is with my dual 10 Gb SFP+ card. I have one in each server, and it has worked flawlessly under both Windows and PVE 8. But under PVE 9 I get the following messages:
Code:
[ 1233.988693] pcieport 0000:00:1c.4: AER: device recovery successful
[ 1233.990010] pcieport 0000:00:1c.4: AER: Multiple Uncorrectable (Non-Fatal) error message received from 0000:04:00.0
[ 1233.991326] ixgbe 0000:04:00.0: PCIe Bus Error: severity=Uncorrectable (Non-Fatal), type=Transaction Layer, (Completer ID)
[ 1233.992641] ixgbe 0000:04:00.0:   device [8086:10fb] error status/mask=00008000/00000000
[ 1233.993935] ixgbe 0000:04:00.0:    [15] CmpltAbrt              (First)
[ 1233.995239] ixgbe 0000:04:00.0: AER:   TLP Header: 0x60000010 0x000000ff 0x00000040 0x10305440
[ 1234.106687] pcieport 0000:00:1c.4: AER: device recovery successful
[ 1234.107993] pcieport 0000:00:1c.4: AER: Multiple Uncorrectable (Non-Fatal) error message received from 0000:04:00.0
[ 1234.109325] ixgbe 0000:04:00.1: PCIe Bus Error: severity=Uncorrectable (Non-Fatal), type=Transaction Layer, (Completer ID)
[ 1234.110605] ixgbe 0000:04:00.1:   device [8086:10fb] error status/mask=00008000/00000000
[ 1234.111883] ixgbe 0000:04:00.1:    [15] CmpltAbrt              (First)
[ 1234.113176] ixgbe 0000:04:00.1: AER:   TLP Header: 0x60000010 0x000000ff 0x00000040 0x10385440
04:00.0 and 04:00.1 are the respective SFP+ ports on the card. The cards are Supermicro AOC-STGN-i2S (Intel 82599ES, same chip as X520 I believe).
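
For anyone landing here with a similar log, the `error status/mask` value can be decoded by hand: each bit corresponds to one error type in the PCIe AER Uncorrectable Error Status register, and `00008000` is bit 15, which matches the `[15] CmpltAbrt` line the kernel prints. Below is a minimal Python sketch of that decoding. The function names and the regular expression are my own (hypothetical) helpers written against the log format shown above, and the table only applies to lines reported with `severity=Uncorrectable`.

```python
import re

# Bit positions in the PCIe AER Uncorrectable Error Status register.
UNCORRECTABLE_BITS = {
    4: "Data Link Protocol Error",
    5: "Surprise Down",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request",
    21: "ACS Violation",
}

# Matches kernel lines such as:
#   ixgbe 0000:04:00.0:   device [8086:10fb] error status/mask=00008000/00000000
STATUS_RE = re.compile(
    r"(?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]):\s+"
    r"device \[(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})\]\s+"
    r"error status/mask=(?P<status>[0-9a-f]{8})/(?P<mask>[0-9a-f]{8})"
)


def decode_status(status):
    """Return the names of all known bits set in an uncorrectable status value."""
    return [name for bit, name in sorted(UNCORRECTABLE_BITS.items())
            if status & (1 << bit)]


def parse_aer_lines(lines):
    """Extract one record per 'error status/mask' line; other lines are ignored."""
    events = []
    for line in lines:
        match = STATUS_RE.search(line)
        if match is None:
            continue
        status = int(match.group("status"), 16)
        events.append({
            "bdf": match.group("bdf"),
            "pci_id": f'{match.group("vendor")}:{match.group("device")}',
            "mask": int(match.group("mask"), 16),
            "errors": decode_status(status),
        })
    return events
```

Feeding it the output of `dmesg` split into lines gives one record per reporting function, which makes it easy to see whether both ports always report the same error type.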

I've searched around and the closest I've come to a possible solution is to add pcie_aspm=off to GRUB_CMDLINE_LINUX_DEFAULT. It did nothing, which is not very surprising considering ethtool -i just spits out Cannot get driver information: No such device. This makes me suspect that it's an issue with the ixgbe driver, but I have been known to jump to conclusions at times.
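
One thing worth ruling out before concluding that `pcie_aspm=off` "did nothing" is whether the parameter reached the running kernel at all. Depending on the setup, an edit to `GRUB_CMDLINE_LINUX_DEFAULT` may not take effect (for example if the GRUB config was not regenerated, or the host does not boot through GRUB). A small sketch of that check follows; the helper names are hypothetical, and the parsing is a plain whitespace split that does not handle quoted values containing spaces.

```python
from pathlib import Path


def kernel_params(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Flags without '=' (e.g. 'quiet') map to None.
    Assumption: no quoted values containing spaces.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def aspm_disabled(cmdline):
    """True if the given command line contains pcie_aspm=off."""
    return kernel_params(cmdline).get("pcie_aspm") == "off"


def running_cmdline(path="/proc/cmdline"):
    """Read the command line the current kernel was actually booted with."""
    return Path(path).read_text()


# Usage on the host itself:
#   print(aspm_disabled(running_cmdline()))
```

If this reports `False` after a reboot, the parameter never made it to the kernel and the test of the ASPM theory was inconclusive rather than negative.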

I just installed PVE a few hours ago and I've done a post install update to PVE 9.0.10, kernel 6.14.11-3-pve.

Additional info:
lspci -v output:
Code:
04:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
        Subsystem: Super Micro Computer Inc AOC-STGN-i2S
        Flags: bus master, fast devsel, latency 0, IRQ 16, IOMMU group 12
        Memory at 4010300000 (64-bit, prefetchable) [size=512K]
        I/O ports at 3020 [disabled] [size=32]
        Memory at 4010400000 (64-bit, prefetchable) [size=16K]
        Expansion ROM at 91380000 [disabled] [size=512K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
        Capabilities: [a0] Express Endpoint, IntMsgNum 0
        Capabilities: [e0] Vital Product Data
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number ac-1f-6b-ff-ff-2d-f2-78
        Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
        Kernel driver in use: ixgbe
        Kernel modules: ixgbe

04:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
        Subsystem: Super Micro Computer Inc AOC-STGN-i2S
        Flags: fast devsel, IRQ 17, IOMMU group 13
        Memory at 4010380000 (64-bit, prefetchable) [size=512K]
        I/O ports at 3000 [disabled] [size=32]
        Memory at 4010604000 (64-bit, prefetchable) [size=16K]
        Expansion ROM at 91300000 [disabled] [size=512K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
        Capabilities: [a0] Express Endpoint, IntMsgNum 0
        Capabilities: [e0] Vital Product Data
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number ac-1f-6b-ff-ff-2d-f2-78
        Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
        Kernel driver in use: ixgbe
        Kernel modules: ixgbe

modinfo ixgbe output:
Code:
filename:       /lib/modules/6.14.11-3-pve/kernel/drivers/net/ethernet/intel/ixgbe/ixgbe.ko
license:        GPL v2
description:    Intel(R) 10 Gigabit PCI Express Network Driver
srcversion:     C43D25E5E1645EE61D1E81D
alias:          pci:v00008086d000057B2sv*sd*bc*sc*i*
[snipped a bunch of aliases, the one below matching the AOC-STGN-i2S according to lspci -nn]
alias:          pci:v00008086d000010FBsv*sd*bc*sc*i*
[snipped a bunch of aliases]
alias:          pci:v00008086d000010B6sv*sd*bc*sc*i*
depends:        dca,xfrm_algo,mdio
intree:         Y
name:           ixgbe
retpoline:      Y
vermagic:       6.14.11-3-pve SMP preempt mod_unload modversions
sig_id:         PKCS#7
signer:         Build time autogenerated kernel key
sig_key:        25:BC:5E:B0:4C:FE:47:05:96:77:AF:B9:29:5A:8A:94:8D:0B:C8:6A
sig_hashalgo:   sha512
signature:      06:74:D0:76:1C:F4:53:C9:17:07:F9:3B:9F:53:AD:70:22:FF:C3:FA:
                07:A1:CA:28:A2:E3:A3:AC:96:9F:BD:5B:46:D5:31:4D:81:C4:AA:E0:
                ED:CB:31:8C:62:E6:97:15:A4:E1:2F:FA:C6:4F:D7:55:18:C0:5C:12:
                79:05:0B:6E:02:AC:EC:12:7D:2F:DD:2A:7D:0D:28:C5:5A:50:21:91:
                89:7F:55:4D:F3:A5:14:A6:9C:57:F0:5E:4A:DE:2B:E1:33:9C:64:96:
                2D:25:2E:31:E7:47:8E:60:6C:EA:9E:96:96:6C:90:E4:AB:20:B3:39:
                53:39:D5:73:BA:9D:8C:0B:C9:0D:A2:1A:7F:8A:78:C7:39:E6:0B:FF:
                F5:A6:A2:B4:E1:21:24:C6:EC:9A:EC:A6:B0:29:CE:8D:2F:FF:23:41:
                3A:D8:F0:57:D4:4E:7E:8C:D1:CD:5F:46:74:ED:6E:D0:B2:93:DF:FA:
                F3:8E:D3:2E:8A:6D:45:72:21:2E:13:90:D6:00:68:4E:92:5E:1A:C7:
                BD:0B:D3:DC:C8:2B:DA:DE:F4:71:E6:4B:26:E6:88:A7:66:4D:5E:DD:
                33:1C:C8:22:85:3D:36:79:5B:B6:A8:0B:FE:EA:94:53:35:14:89:B9:
                43:B5:7A:4F:A8:66:CE:57:2E:DE:C9:3D:60:83:29:06:D5:1B:54:DF:
                52:1A:94:3E:6C:0C:6D:53:12:84:93:C1:1B:F6:5C:8B:1B:F0:AB:9E:
                EB:2D:A8:86:D0:24:43:BF:05:BC:B2:A1:0F:41:46:01:56:8F:AC:C4:
                7C:F4:F0:46:21:43:62:16:AC:C4:67:F2:CD:00:14:AA:BF:46:89:C0:
                41:B4:51:AC:83:CB:18:DA:A5:D7:9D:1A:E5:48:CE:F7:B5:84:37:6B:
                3C:41:CB:63:ED:BF:10:F1:37:E4:CD:0E:62:A9:11:39:89:08:07:4E:
                62:2C:36:0A:EF:65:97:91:18:81:3C:FC:84:39:CA:76:13:4C:EC:1C:
                1E:54:58:49:72:76:B7:38:EB:E8:89:CB:F3:0B:1D:9F:67:BC:92:48:
                A9:B4:63:3D:A4:D5:07:C0:D8:02:94:42:DB:24:38:E7:F3:FE:AD:BA:
                47:08:42:EA:BA:81:12:FC:8A:21:42:D3:4D:2C:45:2C:87:54:EC:B5:
                43:F9:8C:6D:75:39:76:99:98:A7:C6:FE:12:88:A7:4D:22:BF:3A:87:
                00:AF:03:A5:7A:46:6C:81:24:19:65:84:46:09:5C:31:39:21:68:B6:
                00:2B:9B:71:05:26:2B:0A:62:A7:26:26:1A:5D:20:45:07:18:4F:1E:
                A3:A2:BF:86:8D:C7:4F:6F:66:EF:AD:86
parm:           max_vfs:Maximum number of virtual functions to allocate per physical function - default is zero and maximum value is 63. (Deprecated) (uint)
parm:           allow_unsupported_sfp:Allow unsupported and untested SFP+ modules on 82599-based adapters (bool)
parm:           debug:Debug level (0=none,...,16=all) (int)
 
So, it turns out that the NIC just didn't work in the slot I had put it in. I had to move it to the first PCIe slot for it to stop spewing errors all over. So I did the ole switcheroo and put the HBA in the other x16 (@x4) slot, which is a bit unfortunate because it's right up against the side of the chassis. Luckily there is a grille there and juuust enough space to jam a 40x10 mm fan in. That is now friction-fit into place, in the hope that the HBA will be able to breathe well enough not to overheat.
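
Since the fix came down to which slot the card sits in, it can be handy to compare what link each slot actually negotiates. Linux exposes this per device in sysfs through the `current_link_width`, `max_link_width`, `current_link_speed` and `max_link_speed` attributes. The sketch below just reads those files; the function names are hypothetical, and a narrower link is only a hint about the slot, not an explanation for the Completer Abort errors.

```python
from pathlib import Path

LINK_ATTRS = (
    "current_link_speed",
    "max_link_speed",
    "current_link_width",
    "max_link_width",
)


def link_info(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Read the PCIe link attributes for one device, e.g. '0000:04:00.0'.

    Attributes that are not present are returned as None.
    """
    device_dir = Path(sysfs_root) / bdf
    info = {}
    for attr in LINK_ATTRS:
        attr_file = device_dir / attr
        info[attr] = attr_file.read_text().strip() if attr_file.exists() else None
    return info


def is_width_limited(info):
    """True if the negotiated width is below what the device supports.

    Returns None when either value is unavailable.
    """
    current, maximum = info["current_link_width"], info["max_link_width"]
    if current is None or maximum is None:
        return None
    return int(current) < int(maximum)


# Usage on the host itself:
#   info = link_info("0000:04:00.0")
#   print(info, is_width_limited(info))
```

Running it once per slot before and after moving the card shows whether the electrically x4 slot trains the NIC at a reduced width compared with the first slot.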

Cheers!