s1x

New Member
Sep 1, 2022
Looking for some assistance with this error.

I just installed Proxmox for the first time to give it a spin, and I'm running into AER errors with my ASUS Hyper M.2 X16 PCIe 4.0 X4 Expansion Card, which is populated with 4 NVMe drives. The device reporting the errors is device_id: 0000:81:00.0. This is an Epyc system on a TYAN S8030GM4NE-2T board. As far as I can tell, there are no options in the BIOS to enable or disable AER.
I have tested adding pci=nommconf and pcie_aspm=off to the GRUB kernel command line, with no success:
nano /etc/default/grub
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet pcie_aspm=off pci=nommconf"
GRUB_CMDLINE_LINUX=""
update-grub
reboot
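
One thing worth ruling out after the reboot is whether the parameters actually reached the running kernel. The sketch below checks the live command line in /proc/cmdline; has_kparam is a hypothetical helper name used only for illustration, not part of any tool.

```shell
#!/bin/sh
# Report whether a parameter is present on a kernel command line.
# $1: parameter to look for
# $2: command line to search (assumption: defaults to the running kernel's)
has_kparam() {
    cmdline=${2-$(cat /proc/cmdline)}
    # Pad with spaces so only whole, space-separated parameters match.
    case " $cmdline " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the two parameters added in /etc/default/grub above.
for p in pcie_aspm=off pci=nommconf; do
    if has_kparam "$p"; then
        echo "$p: active"
    else
        echo "$p: NOT on the running command line"
    fi
done
```

If either parameter is reported as missing, the GRUB change did not take effect and the test above says nothing about whether the parameters help.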

What does work is specifying GEN3 for the slot that the expansion card is plugged into.
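
To confirm what speed the link actually negotiated before and after forcing GEN3, the kernel's PCI sysfs attributes can be read directly. This is a rough sketch; show_link is a hypothetical helper, and the second argument exists only so the sysfs base directory can be overridden.

```shell
#!/bin/sh
# Print the negotiated and maximum link speed/width for a PCI device.
# $1: PCI address, e.g. 0000:81:00.0
# $2: sysfs base directory (assumption: defaults to /sys/bus/pci/devices)
show_link() {
    dev="${2:-/sys/bus/pci/devices}/$1"
    [ -r "$dev/current_link_speed" ] || {
        echo "$1: no link attributes found" >&2
        return 1
    }
    echo "$1: current $(cat "$dev/current_link_speed") x$(cat "$dev/current_link_width")," \
         "max $(cat "$dev/max_link_speed") x$(cat "$dev/max_link_width")"
}

# The device from the error log in this post.
show_link 0000:81:00.0 || true
```

A current speed of 8.0 GT/s corresponds to Gen3 and 16.0 GT/s to Gen4, so the output shows whether the slot setting was applied.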

Code:
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:  Error 12, type: corrected
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:   section_type: PCIe error
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:   port_type: 0, PCIe end point
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:   version: 0.2
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:   command: 0x0406, status: 0x0010
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:   device_id: 0000:81:00.0
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:   slot: 0
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:   secondary_bus: 0x00
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:   vendor_id: 0x1987, device_id: 0x5016
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:   class_code: 010802
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0000
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:  Error 13, type: corrected
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:   section_type: PCIe error
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:   port_type: 0, PCIe end point
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:   version: 0.2
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:   command: 0x0406, status: 0x0010
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:   device_id: 0000:81:00.0
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:   slot: 0
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:   secondary_bus: 0x00
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:   vendor_id: 0x1987, device_id: 0x5016
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:   class_code: 010802
Mar 01 21:18:14 test_server kernel: {15}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0000
Mar 01 21:18:14 test_server kernel: nvme 0000:81:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
Mar 01 21:18:14 test_server kernel: nvme 0000:81:00.0:    [ 0] RxErr                  (First)
Mar 01 21:18:14 test_server kernel: nvme 0000:81:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
Mar 01 21:18:14 test_server kernel: nvme 0000:81:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
Mar 01 21:18:14 test_server kernel: nvme 0000:81:00.0:    [ 0] RxErr                  (First)
Mar 01 21:18:14 test_server kernel: nvme 0000:81:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
Mar 01 21:18:14 test_server kernel: nvme 0000:81:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
Mar 01 21:18:14 test_server kernel: nvme 0000:81:00.0:    [ 0] RxErr                  (First)
Mar 01 21:18:14 test_server kernel: nvme 0000:81:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
Mar 01 21:18:14 test_server kernel: nvme 0000:81:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
Mar 01 21:18:14 test_server kernel: nvme 0000:81:00.0:    [ 0] RxErr                  (First)
Mar 01 21:18:14 test_server kernel: nvme 0000:81:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
Mar 01 21:18:14 test_server kernel: nvme 0000:81:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
Mar 01 21:18:14 test_server kernel: nvme 0000:81:00.0:    [ 0] RxErr                  (First)
Mar 01 21:18:14 test_server kernel: nvme 0000:81:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
Mar 01 21:18:14 test_server kernel: nvme 0000:81:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
Mar 01 21:18:14 test_server kernel: nvme 0000:81:00.0:    [ 0] RxErr                  (First)
Mar 01 21:18:14 test_server kernel: nvme 0000:81:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
Mar 01 21:18:14 test_server kernel: nvme 0000:81:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
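
Because the log floods quickly, a per-device tally can show whether all four drives on the card are affected or only one. A minimal sketch, assuming the log lines look like the ones above; count_aer is a hypothetical helper that reads kernel log text on stdin.

```shell
#!/bin/sh
# Count "AER: aer_status" reports per PCI address in kernel log text on stdin.
# Output: one "<address> <count>" line per device (order not guaranteed).
count_aer() {
    awk '/AER: aer_status:/ {
             # Find the first domain:bus:device.function address on the line.
             if (match($0, /[0-9a-f][0-9a-f][0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-7]/))
                 n[substr($0, RSTART, RLENGTH)]++
         }
         END { for (d in n) print d, n[d] }'
}

# Example usage against the current boot's kernel log:
#   journalctl -k -b | count_aer
```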