[SOLVED] Guest internal error when passing through PCIe

mozartlovecats

New Member
Nov 6, 2021
Hello,

I have PVE 7.0 running on a Gigabyte H310N motherboard with an Intel i5-8400. The IOMMU setup was done following the wiki.

The motherboard has an M.2 slot intended for a Wi-Fi card, and I have a Coral TPU installed in it.

The IOMMU groups look good:

Code:
IOMMU Group 0:
    00:00.0 Host bridge [0600]: Intel Corporation 8th Gen Core Processor Host Bridge/DRAM Registers [8086:3ec2] (rev 07)
IOMMU Group 1:
    00:01.0 PCI bridge [0604]: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 07)
    01:00.0 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
    01:00.1 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
IOMMU Group 2:
    00:02.0 VGA compatible controller [0300]: Intel Corporation CometLake-S GT2 [UHD Graphics 630] [8086:3e92]
IOMMU Group 3:
    00:08.0 System peripheral [0880]: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model [8086:1911]
IOMMU Group 4:
    00:14.0 USB controller [0c03]: Intel Corporation 200 Series/Z370 Chipset Family USB 3.0 xHCI Controller [8086:a2af]
IOMMU Group 5:
    00:16.0 Communication controller [0780]: Intel Corporation 200 Series PCH CSME HECI #1 [8086:a2ba]
IOMMU Group 6:
    00:17.0 SATA controller [0106]: Intel Corporation 200 Series PCH SATA controller [AHCI mode] [8086:a282]
IOMMU Group 7:
    00:1c.0 PCI bridge [0604]: Intel Corporation 200 Series PCH PCI Express Root Port #5 [8086:a294] (rev f0)
IOMMU Group 8:
    00:1d.0 PCI bridge [0604]: Intel Corporation 200 Series PCH PCI Express Root Port #11 [8086:a29a] (rev f0)
IOMMU Group 9:
    00:1d.3 PCI bridge [0604]: Intel Corporation 200 Series PCH PCI Express Root Port #12 [8086:a29b] (rev f0)
IOMMU Group 10:
    00:1f.0 ISA bridge [0601]: Intel Corporation Device [8086:a2ca]
    00:1f.2 Memory controller [0580]: Intel Corporation 200 Series/Z370 Chipset Family Power Management Controller [8086:a2a1]
    00:1f.3 Audio device [0403]: Intel Corporation 200 Series PCH HD Audio [8086:a2f0]
    00:1f.4 SMBus [0c05]: Intel Corporation 200 Series/Z370 Chipset Family SMBus Controller [8086:a2a3]
IOMMU Group 11:
    02:00.0 Non-Volatile memory controller [0108]: Silicon Motion, Inc. SM2263EN/SM2263XT SSD Controller [126f:2263] (rev 03)
IOMMU Group 12:
    03:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 16)
IOMMU Group 13:
    04:00.0 System peripheral [0880]: Global Unichip Corp. Coral Edge TPU [1ac1:089a]

The Coral TPU sits alone in Group 13, so I pass through PCI device 04:00.0 in the Proxmox GUI and then start the guest, which runs Ubuntu 20.04 LTS.

But I get an "internal error" on the guest, and it won't boot.

Syslog shows:

Code:
Nov 06 22:12:22 Proxmox kernel: pcieport 0000:00:1d.3: AER: Uncorrected (Non-Fatal) error received: 0000:00:1d.3
Nov 06 22:12:22 Proxmox kernel: pcieport 0000:00:1d.3: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
Nov 06 22:12:22 Proxmox kernel: pcieport 0000:00:1d.3:   device [8086:a29b] error status/mask=00100000/00010000
Nov 06 22:12:22 Proxmox kernel: pcieport 0000:00:1d.3:    [20] UnsupReq               (First)
Nov 06 22:12:22 Proxmox kernel: pcieport 0000:00:1d.3: AER:   TLP Header: 34000000 04000010 00000000 00000000
Nov 06 22:12:22 Proxmox kernel: pcieport 0000:00:1d.3: AER: device recovery successful
Nov 06 22:12:22 Proxmox kernel: vfio-pci 0000:04:00.0: vfio_ecap_init: hiding ecap 0x1e@0x110
Nov 06 22:12:23 Proxmox kernel: pcieport 0000:00:1d.3: AER: Uncorrected (Non-Fatal) error received: 0000:00:1d.3
Nov 06 22:12:23 Proxmox kernel: pcieport 0000:00:1d.3: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
Nov 06 22:12:23 Proxmox kernel: pcieport 0000:00:1d.3:   device [8086:a29b] error status/mask=00100000/00010000
Nov 06 22:12:23 Proxmox kernel: pcieport 0000:00:1d.3:    [20] UnsupReq               (First)
Nov 06 22:12:23 Proxmox kernel: pcieport 0000:00:1d.3: AER:   TLP Header: 34000000 04000010 00000000 00000000
Nov 06 22:12:23 Proxmox kernel: pcieport 0000:00:1d.3: AER: device recovery successful
Nov 06 22:12:23 Proxmox QEMU[7423]: kvm: vfio_err_notifier_handler(0000:04:00.0) Unrecoverable error detected. Please collect any data possible and then kill the guest

I suppose that 00:1d.3 is the PCIe root port behind this M.2 slot, but I don't understand why it failed.

The guest runs fine without PCI passthrough; it only hits the "internal error" when PCI passthrough is configured.

I also have two VMs running with USB device passthrough. I'm not sure whether this is related, but I guess it's not.
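
For reference, the passthrough entry in the VM config (/etc/pve/qemu-server/<vmid>.conf) looks roughly like this; the pcie=1 flag only applies when the VM uses the q35 machine type:

Code:
hostpci0: 0000:04:00.0,pcie=1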
 
Did you blacklist the driver for the Coral TPU or bind it to the vfio-pci driver early to prevent the host from touching it?
Have you tried adding pcie_aspm=off or pci=noaer to the kernel parameters to work around this?
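A minimal sketch of the early-binding approach, using the 1ac1:089a ID from your IOMMU listing (the Coral host drivers, if they are even installed, are usually called gasket and apex, but please check on your system):

Code:
# /etc/modprobe.d/vfio.conf
options vfio-pci ids=1ac1:089a
# make sure vfio-pci claims the device before the Coral host drivers, if present:
softdep gasket pre: vfio-pci
softdep apex pre: vfio-pci

followed by update-initramfs -u -k all and a reboot.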
 
I didn't blacklist the driver, but adding pcie_aspm=off worked! The Coral TPU shows up in the guest. Thank you!

ASPM seems to be about power management? I would never have thought of that. Why does it matter?
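
In case it helps anyone else, I added it roughly like this (my host boots with GRUB; systemd-boot setups use /etc/kernel/cmdline instead):

Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt pcie_aspm=off"

and then ran update-grub and rebooted.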
 
I just found on the internet that pcie_aspm is a possible (system-wide) work-around for such errors. I would not advise it as a good solution, but wondered whether it would (temporarily) help here.
Maybe the Coral TPU (whatever that is?) has problems with PCIe Active State Power Management? Maybe you need to prevent the Proxmox host from initializing the device because it does not fully reset? I suggest looking into blacklisting and early binding to vfio-pci.
I find it weird that 00:1d.3 is not in the same group as 04:00.0, but I have no experience with your motherboard/platform. Did you use pcie_acs_override (which invalidates the IOMMU grouping information)?
You did not share information about the VM, but sometimes reinstalling Windows inside the VM can resolve PCIe errors (which is also weird).
 
I also found some reset-related issues on the internet with similar errors; most of them are about AMD GPUs, but the symptoms look similar. I'll look into blacklisting the driver and early binding to vfio-pci, which would probably be a more robust approach.

I didn't use pcie_acs_override. The only thing I did was add
Code:
intel_iommu=on iommu=pt
to the kernel parameters, and add
Code:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
to the modules file (/etc/modules); that's it. I think the IOMMU grouping on this motherboard looks good: every PCIe device is isolated.

The setup is very simple. It works if I put an Nvidia GPU in the x16 slot and pass it through.

The VM has Ubuntu 20.04 LTS installed, but that probably doesn't matter because the VM doesn't even manage to POST.

The Coral TPU is an AI compute module made by Google; I'm going to use it for object detection in the VM.
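
(For completeness, the grouping listing at the top of the thread was generated with the usual sysfs loop, something along these lines:)

Code:
#!/bin/bash
shopt -s nullglob
# walk every IOMMU group and print the devices it contains
for g in /sys/kernel/iommu_groups/*; do
    echo "IOMMU Group ${g##*/}:"
    for d in "$g"/devices/*; do
        echo -e "\t$(lspci -nns "${d##*/}")"
    done
done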
 
If you use Linux inside, you might want to consider using an (unprivileged) container instead of a VM. It does not pin all memory into RAM, and you can pass through any device via its /dev node. But it does need drivers installed on the Proxmox (Debian) host, which need to keep up with the Proxmox (Ubuntu-based) kernel.
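A rough sketch of what that could look like for the Coral, assuming the host driver exposes it as /dev/apex_0 (check with ls -l /dev/apex_0 and adjust the major:minor numbers accordingly), added to /etc/pve/lxc/<vmid>.conf:

Code:
# allow the character device (replace 120:0 with the numbers shown by ls -l /dev/apex_0)
lxc.cgroup2.devices.allow: c 120:0 rwm
# bind-mount the device node into the container
lxc.mount.entry: /dev/apex_0 dev/apex_0 none bind,optional,create=file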
 
I was having the same issue even though I had blacklisted the driver and tried early binding. The only thing that worked was turning off PCIe power management with pcie_aspm=off. Did you end up just leaving this turned off, or did you find a better solution?

Cheers
 
I did end up leaving it like this. In my case it was a PCIe card drawing less than 5 W, so I think it's not going to do any harm.
 
I think pcie_aspm=off applies to all PCIe devices on the system, so it might have more impact. Then again, specific drivers for devices might do most of the power-management, regardless of this setting for the PCIe bus.
 
I think you are right; this setting applies to all PCIe devices.
I don't know how power management works in the OS, but it feels like, if the PCIe device is passed through into a VM, the VM should be able to take control, otherwise there will still be a layer of virtualization or translation. I could be totally wrong though.
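
Out of curiosity, you can check what ASPM state each link actually ends up in on the host with something like this (run as root to get the full capability dump):

Code:
lspci -vv | grep -E '^[0-9a-f]{2}:|ASPM'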
 
Thanks for the reply. Yeah, my Coral seems to be functioning well, passed through to HA and Frigate. Glad to hear that it should work well in the longer term. Cheers
 
I appear to be getting the exact same errors in syslog, even though my situation is a little different. I have a PCIe card (an H310 HBA) that disappears (after some time) from the guest it is passed through to. Once it disappears, I see the same errors (copied from above):
Code:
Nov 06 22:12:23 Proxmox kernel: pcieport 0000:00:1d.3: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
Nov 06 22:12:23 Proxmox kernel: pcieport 0000:00:1d.3:   device [8086:a29b] error status/mask=00100000/00010000
Nov 06 22:12:23 Proxmox kernel: pcieport 0000:00:1d.3:    [20] UnsupReq               (First)
I have tried two different H310 cards, thinking the issue was that this rather old HBA was finally nearing its end, but the problem quickly occurred with the second one as well. I am going to add the pcie_aspm=off setting to /etc/default/grub and see what happens.
The strange thing is that this HBA worked flawlessly for years, but I'm thinking this problem may have arisen after running an update in recent months... I'm really not sure. I suppose it could be an issue with the backplane the card is connected to... yikes. I'll report back.
 
