Proxmox VE shows errors and constantly reboots with 4 x WD_BLACK SN850X PCIe 4.0 NVMe 4TB drives connected via AORUS Gen4 AIC Adaptor

rigel.local

Member
Mar 24, 2023
Motherboard: Tyan S8030GM4NE-2T
CPU: AMD EPYC 7502P
Proxmox VE OS Boot : 2 x Intel 400GB S3710 MLC in mirror
PCIe 4.0 x16 to 4 M.2: AORUS Gen4 AIC Adaptor
NVMe M.2 drives: 4 x WD_BLACK SN850X PCIe 4.0 NVMe 4TB
PSU: 1200W

Video of errors: Proxmox VE command line errors

Screenshots of errors:
jpeg-optimizer_screenshot-2023-05-18-22-00-18@2x.png
jpeg-optimizer_screenshot-2023-05-18-22-00-00@2x.png


When I log in to Proxmox VE and go to "Disks", I can see that these WD_BLACK SN850X PCIe 4.0 NVMe 4TB drives are detected, but the command line shows a constant stream of errors. After some time Proxmox VE reboots. I tested the same adapter with 4 different M.2 NVMe drives (Samsung) and there were no problems.

screenshot-2023-05-18-21-45-45@2x.png

I can even check SMART info on these WD_BLACK SN850X PCIe 4.0 NVMe 4TB drives:
screenshot-2023-05-18-21-46-13@2x.png

When I tried to erase one of these problem drives, Proxmox VE rebooted instantly.

Does anyone know what is going on? Is it a problem with Proxmox VE, or is it a problem with the motherboard and drives?

I wanted to pass through all 4 of these drives to a Windows VM.

BIOS settings:
screenshot-2023-05-19-10-40-35@2x.png
 
Hey, in one of your screenshots I only see two of your four drives throwing errors.

Check your IOMMU groups and see which SSDs are connected at 0000:80:03.1 and 0000:80:03.2.
If you can look at the output in more detail, check whether it also shows errors for 0000:80:03.0 and 0000:80:03.3 (or 0000:80:03.4 rather than .0; I don't know how the IOMMU group numbering is initialised).
If the other ports do not appear there, then only two of your four SSDs are being reported as problematic.

Sincerely
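
For anyone wanting to do the IOMMU group check from a script, here is a minimal sketch. It assumes a Linux host that exposes groups under `/sys/kernel/iommu_groups` (the usual sysfs location when the IOMMU is enabled); the function name `list_iommu_groups` is just made up for this example.

```python
from pathlib import Path


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: [PCI addresses]} read from sysfs.

    Assumes the standard layout <root>/<group>/devices/<domain:bus:dev.fn>.
    """
    groups = {}
    base = Path(root)
    if not base.is_dir():
        # IOMMU disabled in BIOS/kernel, or not a Linux host
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    return groups


if __name__ == "__main__":
    for number, devices in sorted(list_iommu_groups().items()):
        print(f"group {number}: {', '.join(devices)}")
```

Each NVMe drive you want to pass through should ideally sit in its own group, so comparing the printed addresses with the ones in the error messages shows which physical drive is which.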
 
Thank you for your reply! However, all 4 of these M.2 SSDs work great in another machine. On this machine the BIOS detects all 4 drives (as part of the PCIe 4x4x4x4 bifurcation) and the system POSTs without problems. Even in Proxmox VE they are detected, and you can perform operations such as reading SMART info, wiping the disk, or creating ZFS. What do these Proxmox errors actually mean?
 
Bash:
May 19 10:29:49 pve kernel: [  849.254349] AER: buffer overflow in recovery for 0000:40:03.3
May 19 10:29:49 pve kernel: [  849.254899] AER: buffer overflow in recovery for 0000:40:03.2
May 19 10:29:49 pve kernel: [  849.255448] AER: buffer overflow in recovery for 0000:40:03.3
May 19 10:29:49 pve kernel: [  849.255987] AER: buffer overflow in recovery for 0000:40:03.3
May 19 10:29:49 pve kernel: [  849.270358] nvme 0000:41:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:49 pve kernel: [  849.270962] nvme 0000:41:00.0:    [ 0] RxErr                  (First)
May 19 10:29:49 pve kernel: [  849.271530] nvme 0000:41:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:49 pve kernel: [  849.272095] nvme 0000:44:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:49 pve kernel: [  849.272662] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:49 pve kernel: [  849.273233] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:49 pve kernel: [  849.273810] nvme 0000:44:00.0: AER: aer_status: 0x00001001, aer_mask: 0x00000000
May 19 10:29:49 pve kernel: [  849.274386] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:49 pve kernel: [  849.274955] nvme 0000:44:00.0:    [12] Timeout               
May 19 10:29:49 pve kernel: [  849.275518] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Transmitter ID
May 19 10:29:49 pve kernel: [  849.276095] nvme 0000:41:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:49 pve kernel: [  849.276677] nvme 0000:41:00.0:    [ 0] RxErr                  (First)
May 19 10:29:49 pve kernel: [  849.277261] nvme 0000:41:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:49 pve kernel: [  849.277852] nvme 0000:43:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:49 pve kernel: [  849.278443] nvme 0000:43:00.0:    [ 0] RxErr                  (First)
May 19 10:29:49 pve kernel: [  849.279039] nvme 0000:43:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:49 pve kernel: [  849.279657] nvme 0000:44:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:49 pve kernel: [  849.280268] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:49 pve kernel: [  849.280877] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:49 pve kernel: [  849.281494] nvme 0000:41:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:49 pve kernel: [  849.282112] nvme 0000:41:00.0:    [ 0] RxErr                  (First)
May 19 10:29:49 pve kernel: [  849.282733] nvme 0000:41:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:49 pve kernel: [  849.283426] nvme 0000:44:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:49 pve kernel: [  849.284061] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:49 pve kernel: [  849.284697] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:49 pve kernel: [  849.285347] nvme 0000:41:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:49 pve kernel: [  849.285999] nvme 0000:41:00.0:    [ 0] RxErr                  (First)
May 19 10:29:49 pve kernel: [  849.286657] nvme 0000:41:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:49 pve kernel: [  849.287320] nvme 0000:44:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:49 pve kernel: [  849.287981] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:49 pve kernel: [  849.288636] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:49 pve kernel: [  849.289315] nvme 0000:41:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:49 pve kernel: [  849.289986] nvme 0000:41:00.0:    [ 0] RxErr                  (First)
May 19 10:29:49 pve kernel: [  849.290660] nvme 0000:41:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:49 pve kernel: [  849.291347] nvme 0000:44:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:49 pve kernel: [  849.292037] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:49 pve kernel: [  849.292726] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:49 pve kernel: [  849.293436] nvme 0000:41:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:49 pve kernel: [  849.294138] nvme 0000:41:00.0:    [ 0] RxErr                  (First)
May 19 10:29:49 pve kernel: [  849.294840] nvme 0000:41:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:49 pve kernel: [  849.295555] nvme 0000:41:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:49 pve kernel: [  849.296282] nvme 0000:41:00.0:    [ 0] RxErr                  (First)
May 19 10:29:49 pve kernel: [  849.297006] nvme 0000:41:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:49 pve kernel: [  849.297733] pcieport 0000:40:03.2: AER: aer_status: 0x00000080, aer_mask: 0x00000000
May 19 10:29:49 pve kernel: [  849.298463] pcieport 0000:40:03.2:    [ 7] BadDLLP               
May 19 10:29:49 pve kernel: [  849.299193] pcieport 0000:40:03.2: AER: aer_layer=Data Link Layer, aer_agent=Receiver ID
May 19 10:29:49 pve kernel: [  849.299941] pcieport 0000:40:03.2: AER: aer_status: 0x00000080, aer_mask: 0x00000000
May 19 10:29:49 pve kernel: [  849.300686] pcieport 0000:40:03.2:    [ 7] BadDLLP               
May 19 10:29:49 pve kernel: [  849.301434] pcieport 0000:40:03.2: AER: aer_layer=Data Link Layer, aer_agent=Receiver ID
May 19 10:29:54 pve kernel: [  853.967304] nvme 0000:41:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:54 pve kernel: [  853.968613] nvme 0000:41:00.0:    [ 0] RxErr                  (First)
May 19 10:29:54 pve kernel: [  853.969389] nvme 0000:41:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:54 pve kernel: [  853.970152] nvme 0000:44:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:54 pve kernel: [  853.970919] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:54 pve kernel: [  853.971702] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:54 pve kernel: [  853.972499] nvme 0000:41:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:54 pve kernel: [  853.973286] nvme 0000:41:00.0:    [ 0] RxErr                  (First)
May 19 10:29:54 pve kernel: [  853.974069] nvme 0000:41:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:54 pve kernel: [  853.974860] nvme 0000:44:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:54 pve kernel: [  853.975650] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:54 pve kernel: [  853.976433] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:54 pve kernel: [  853.977226] nvme 0000:43:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:54 pve kernel: [  853.978018] nvme 0000:43:00.0:    [ 0] RxErr                  (First)
May 19 10:29:54 pve kernel: [  853.978807] nvme 0000:43:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:54 pve kernel: [  853.979617] nvme 0000:44:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:54 pve kernel: [  853.980414] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:54 pve kernel: [  853.981210] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:54 pve kernel: [  853.982012] nvme 0000:44:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:54 pve kernel: [  853.982808] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:54 pve kernel: [  853.983606] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:54 pve kernel: [  853.984406] nvme 0000:44:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:54 pve kernel: [  853.985201] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:54 pve kernel: [  853.985988] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:54 pve kernel: [  853.986783] nvme 0000:41:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:54 pve kernel: [  853.987576] nvme 0000:41:00.0:    [ 0] RxErr                  (First)
May 19 10:29:54 pve kernel: [  853.988369] nvme 0000:41:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:54 pve kernel: [  853.989164] nvme 0000:41:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:54 pve kernel: [  853.989954] nvme 0000:41:00.0:    [ 0] RxErr                  (First)
May 19 10:29:54 pve kernel: [  853.990740] nvme 0000:41:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:54 pve kernel: [  853.991544] nvme 0000:44:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:54 pve kernel: [  853.992360] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:54 pve kernel: [  853.993150] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:54 pve kernel: [  853.993951] nvme 0000:41:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:54 pve kernel: [  853.994745] nvme 0000:41:00.0:    [ 0] RxErr                  (First)
May 19 10:29:54 pve kernel: [  853.995542] nvme 0000:41:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:54 pve kernel: [  853.996343] nvme 0000:44:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:54 pve kernel: [  853.997137] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:54 pve kernel: [  853.997930] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:54 pve kernel: [  853.998735] nvme 0000:44:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:54 pve kernel: [  853.999532] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:54 pve kernel: [  854.000362] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:55 pve kernel: [  854.991228] ghes_print_estatus: 1 callbacks suppressed
May 19 10:29:59 pve kernel: [  859.451587] nvme 0000:44:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:59 pve kernel: [  859.452209] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:59 pve kernel: [  859.452791] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:59 pve kernel: [  859.453374] nvme 0000:44:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:59 pve kernel: [  859.453955] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:59 pve kernel: [  859.454538] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:59 pve kernel: [  859.455138] nvme 0000:41:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:59 pve kernel: [  859.455739] nvme 0000:41:00.0:    [ 0] RxErr                  (First)
May 19 10:29:59 pve kernel: [  859.456340] nvme 0000:41:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:59 pve kernel: [  859.456952] nvme 0000:44:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:59 pve kernel: [  859.457560] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:59 pve kernel: [  859.458166] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:59 pve kernel: [  859.458779] nvme 0000:44:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:59 pve kernel: [  859.459396] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:59 pve kernel: [  859.460013] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:59 pve kernel: [  859.460643] nvme 0000:41:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:59 pve kernel: [  859.461272] nvme 0000:41:00.0:    [ 0] RxErr                  (First)
May 19 10:29:59 pve kernel: [  859.461907] nvme 0000:41:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:59 pve kernel: [  859.462551] nvme 0000:44:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:59 pve kernel: [  859.463201] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:59 pve kernel: [  859.463848] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:59 pve kernel: [  859.464506] nvme 0000:44:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:59 pve kernel: [  859.465157] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:59 pve kernel: [  859.465806] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:59 pve kernel: [  859.466468] nvme 0000:43:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:59 pve kernel: [  859.467135] nvme 0000:43:00.0:    [ 0] RxErr                  (First)
May 19 10:29:59 pve kernel: [  859.467802] nvme 0000:43:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:59 pve kernel: [  859.468480] nvme 0000:44:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:59 pve kernel: [  859.469157] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:59 pve kernel: [  859.469837] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:59 pve kernel: [  859.470533] nvme 0000:41:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:59 pve kernel: [  859.471230] nvme 0000:41:00.0:    [ 0] RxErr                  (First)
May 19 10:29:59 pve kernel: [  859.471925] nvme 0000:41:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:59 pve kernel: [  859.472627] nvme 0000:41:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:59 pve kernel: [  859.473322] nvme 0000:41:00.0:    [ 0] RxErr                  (First)
May 19 10:29:59 pve kernel: [  859.474020] nvme 0000:41:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:59 pve kernel: [  859.474734] nvme 0000:44:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:59 pve kernel: [  859.475450] nvme 0000:44:00.0:    [ 0] RxErr                  (First)
May 19 10:29:59 pve kernel: [  859.476165] nvme 0000:44:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
May 19 10:29:59 pve kernel: [  859.476892] nvme 0000:41:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
May 19 10:29:59 pve kernel: [  859.477619] nvme 0000:41:00.0:    [ 0] RxErr                  (First)
May 19 10:29:59 pve kernel: [  859.478351] nvme 0000:41:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
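
The `aer_status` values in the log above are bitmasks, and the bracketed numbers (`[ 0]`, `[ 7]`, `[12]`) are the bit positions that are set. A rough sketch of decoding them follows. It assumes these are correctable-error status values (which matches the names printed next to them) and uses the short labels the Linux kernel prints for the PCIe AER Correctable Error Status bits; `decode_correctable_status` is a made-up helper name.

```python
# PCIe AER Correctable Error Status bits, using the short names
# that the Linux kernel prints in its AER messages.
CORRECTABLE_BITS = {
    0: "RxErr",         # Receiver Error (physical layer)
    6: "BadTLP",        # Bad TLP (data link layer)
    7: "BadDLLP",       # Bad DLLP (data link layer)
    8: "Rollover",      # REPLAY_NUM Rollover
    12: "Timeout",      # Replay Timer Timeout
    13: "NonFatalErr",  # Advisory Non-Fatal Error
    14: "CorrIntErr",   # Corrected Internal Error
    15: "HeaderOF",     # Header Log Overflow
}


def decode_correctable_status(status):
    """Return the names of all bits set in a correctable aer_status value."""
    names = []
    for bit in range(32):
        if status & (1 << bit):
            # Bits not in the table are reported by position only
            names.append(CORRECTABLE_BITS.get(bit, f"bit{bit}"))
    return names


if __name__ == "__main__":
    for value in (0x00000001, 0x00001001, 0x00000080):
        print(f"{value:#010x}: {', '.join(decode_correctable_status(value))}")
```

So `0x00001001` on 0000:44:00.0 is a receiver error plus a replay timer timeout in the same report, and `0x00000080` on the root port 0000:40:03.2 is a bad DLLP.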
 
Bash:
May 19 10:30:00 pve kernel: [  860.366885] {306}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 512
May 19 10:30:00 pve kernel: [  860.368308] {306}[Hardware Error]: It has been corrected by h/w and requires no further action
May 19 10:30:00 pve kernel: [  860.369071] {306}[Hardware Error]: event severity: corrected
May 19 10:30:00 pve kernel: [  860.369816] {306}[Hardware Error]:  Error 0, type: corrected
May 19 10:30:00 pve kernel: [  860.370563] {306}[Hardware Error]:   section_type: PCIe error
May 19 10:30:00 pve kernel: [  860.371305] {306}[Hardware Error]:   port_type: 4, root port
May 19 10:30:00 pve kernel: [  860.372046] {306}[Hardware Error]:   version: 0.2
May 19 10:30:00 pve kernel: [  860.372782] {306}[Hardware Error]:   command: 0x0407, status: 0x0010
May 19 10:30:00 pve kernel: [  860.373527] {306}[Hardware Error]:   device_id: 0000:40:03.3
May 19 10:30:00 pve kernel: [  860.374268] {306}[Hardware Error]:   slot: 39
May 19 10:30:00 pve kernel: [  860.375006] {306}[Hardware Error]:   secondary_bus: 0x43
May 19 10:30:00 pve kernel: [  860.375743] {306}[Hardware Error]:   vendor_id: 0x1022, device_id: 0x1483
May 19 10:30:00 pve kernel: [  860.376484] {306}[Hardware Error]:   class_code: 060400
May 19 10:30:00 pve kernel: [  860.377218] {306}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0012
May 19 10:30:00 pve kernel: [  860.377953] {306}[Hardware Error]:  Error 1, type: corrected
May 19 10:30:00 pve kernel: [  860.378682] {306}[Hardware Error]:   section_type: PCIe error
May 19 10:30:00 pve kernel: [  860.379407] {306}[Hardware Error]:   port_type: 4, root port
May 19 10:30:00 pve kernel: [  860.380124] {306}[Hardware Error]:   version: 0.2
May 19 10:30:00 pve kernel: [  860.380832] {306}[Hardware Error]:   command: 0x0407, status: 0x0010
May 19 10:30:00 pve kernel: [  860.381539] {306}[Hardware Error]:   device_id: 0000:40:03.3
May 19 10:30:00 pve kernel: [  860.382246] {306}[Hardware Error]:   slot: 39
May 19 10:30:00 pve kernel: [  860.382939] {306}[Hardware Error]:   secondary_bus: 0x43
May 19 10:30:00 pve kernel: [  860.383626] {306}[Hardware Error]:   vendor_id: 0x1022, device_id: 0x1483
May 19 10:30:00 pve kernel: [  860.384315] {306}[Hardware Error]:   class_code: 060400
May 19 10:30:00 pve kernel: [  860.384997] {306}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0012
May 19 10:30:00 pve kernel: [  860.385686] {306}[Hardware Error]:  Error 2, type: corrected
May 19 10:30:00 pve kernel: [  860.386375] {306}[Hardware Error]:   section_type: PCIe error
May 19 10:30:00 pve kernel: [  860.387053] {306}[Hardware Error]:   port_type: 4, root port
May 19 10:30:00 pve kernel: [  860.387723] {306}[Hardware Error]:   version: 0.2
May 19 10:30:00 pve kernel: [  860.388385] {306}[Hardware Error]:   command: 0x0407, status: 0x0010
May 19 10:30:00 pve kernel: [  860.389048] {306}[Hardware Error]:   device_id: 0000:40:03.3
May 19 10:30:00 pve kernel: [  860.389704] {306}[Hardware Error]:   slot: 39
May 19 10:30:00 pve kernel: [  860.404696] {306}[Hardware Error]:   secondary_bus: 0x43
May 19 10:30:00 pve kernel: [  860.405340] {306}[Hardware Error]:   vendor_id: 0x1022, device_id: 0x1483
May 19 10:30:00 pve kernel: [  860.405983] {306}[Hardware Error]:   class_code: 060400
May 19 10:30:00 pve kernel: [  860.406670] {306}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0012
May 19 10:30:00 pve kernel: [  860.407357] {306}[Hardware Error]:  Error 3, type: corrected
May 19 10:30:00 pve kernel: [  860.407994] {306}[Hardware Error]:   section_type: PCIe error
May 19 10:30:00 pve kernel: [  860.408631] {306}[Hardware Error]:   port_type: 4, root port
May 19 10:30:00 pve kernel: [  860.409299] {306}[Hardware Error]:   version: 0.2
May 19 10:30:00 pve kernel: [  860.409925] {306}[Hardware Error]:   command: 0x0407, status: 0x0010
May 19 10:30:00 pve kernel: [  860.410589] {306}[Hardware Error]:   device_id: 0000:40:03.3
May 19 10:30:00 pve kernel: [  860.411200] {306}[Hardware Error]:   slot: 39
May 19 10:30:00 pve kernel: [  860.411801] {306}[Hardware Error]:   secondary_bus: 0x43
May 19 10:30:00 pve kernel: [  860.412442] {306}[Hardware Error]:   vendor_id: 0x1022, device_id: 0x1483
May 19 10:30:00 pve kernel: [  860.413040] {306}[Hardware Error]:   class_code: 060400
May 19 10:30:00 pve kernel: [  860.413634] {306}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0012
May 19 10:30:00 pve kernel: [  860.414233] {306}[Hardware Error]:  Error 4, type: corrected
May 19 10:30:00 pve kernel: [  860.414825] {306}[Hardware Error]:   section_type: PCIe error
May 19 10:30:00 pve kernel: [  860.415455] {306}[Hardware Error]:   port_type: 4, root port
May 19 10:30:00 pve kernel: [  860.416035] {306}[Hardware Error]:   version: 0.2
May 19 10:30:00 pve kernel: [  860.416610] {306}[Hardware Error]:   command: 0x0407, status: 0x0010
May 19 10:30:00 pve kernel: [  860.417180] {306}[Hardware Error]:   device_id: 0000:40:03.2
May 19 10:30:00 pve kernel: [  860.417744] {306}[Hardware Error]:   slot: 39
May 19 10:30:00 pve kernel: [  860.418303] {306}[Hardware Error]:   secondary_bus: 0x42
May 19 10:30:00 pve kernel: [  860.418852] {306}[Hardware Error]:   vendor_id: 0x1022, device_id: 0x1483
May 19 10:30:00 pve kernel: [  860.419398] {306}[Hardware Error]:   class_code: 060400
May 19 10:30:00 pve kernel: [  860.419929] {306}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0012
May 19 10:30:00 pve kernel: [  860.420465] {306}[Hardware Error]:  Error 5, type: corrected
May 19 10:30:00 pve kernel: [  860.421003] {306}[Hardware Error]:   section_type: PCIe error
May 19 10:30:00 pve kernel: [  860.421584] {306}[Hardware Error]:   port_type: 4, root port
May 19 10:30:00 pve kernel: [  860.422121] {306}[Hardware Error]:   version: 0.2
May 19 10:30:00 pve kernel: [  860.422662] {306}[Hardware Error]:   command: 0x0407, status: 0x0010
May 19 10:30:00 pve kernel: [  860.423202] {306}[Hardware Error]:   device_id: 0000:40:03.3
May 19 10:30:00 pve kernel: [  860.423740] {306}[Hardware Error]:   slot: 39
May 19 10:30:00 pve kernel: [  860.424314] {306}[Hardware Error]:   secondary_bus: 0x43
May 19 10:30:00 pve kernel: [  860.424869] {306}[Hardware Error]:   vendor_id: 0x1022, device_id: 0x1483
May 19 10:30:00 pve kernel: [  860.425422] {306}[Hardware Error]:   class_code: 060400
May 19 10:30:00 pve kernel: [  860.425964] {306}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0012
 
If you read the errors, somewhere they say:
"Hardware error from APEI Generic Hardware Error Source: 512
It has been corrected by h/w and requires no further action
event severity: corrected"

So are these just warnings, or are they serious errors that can break things? If they are just warnings, is there a way to turn them off, since they spam the logs? And if there is a serious issue, how can I find out exactly what it is?
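
As a first step toward narrowing this down, the log spam can be counted per device and error type to see which links are actually affected. This is only a sketch: the regular expression is written against the line format shown in the logs above and may need adjusting for other kernel versions, and `tally_aer_errors` is a made-up name.

```python
import re
import sys
from collections import Counter

# Matches lines such as:
#   "... nvme 0000:41:00.0:    [ 0] RxErr                  (First)"
#   "... pcieport 0000:40:03.2:    [ 7] BadDLLP"
ERROR_LINE = re.compile(
    r"(?:nvme|pcieport) "
    r"(?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]):"
    r"\s+\[\s*\d+\]\s+(?P<name>\w+)"
)


def tally_aer_errors(lines):
    """Count (PCI address, error name) pairs found in kernel log lines."""
    counts = Counter()
    for line in lines:
        match = ERROR_LINE.search(line)
        if match:
            counts[(match.group("dev"), match.group("name"))] += 1
    return counts


if __name__ == "__main__":
    # Example: journalctl -k | python3 this_script.py
    for (dev, name), count in tally_aer_errors(sys.stdin).most_common():
        print(f"{dev}  {name:<12} {count}")
```

In the excerpt above this would show the errors concentrated on 0000:41:00.0 and 0000:44:00.0, with fewer on 0000:43:00.0, which is the kind of pattern worth comparing against the physical slots on the adapter.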
 
