[SOLVED] HBA Passthrough not working after motherboard swap

Rawb

New Member
Jun 19, 2024
Hello,

I have a Proxmox machine with a TrueNAS CORE VM, to which I was passing through (as a raw PCI device) an HBA card with all of my hard drives attached. Everything was working perfectly fine: the VM saw my drives and ZFS pools.

I just swapped my motherboard from an Asus H510M-A Prime to a Gigabyte B560M DS3H V3, and I can't get the device passthrough to work again (Proxmox itself and all of my LXC containers are working fine).
  • VT-d is enabled in the new BIOS
  • I changed the passthrough device ID to match the new ID assigned by the new motherboard/PCIe slot
When I try to start the VM with passthrough enabled, it hangs indefinitely; without passthrough it starts just fine.

dmesg output
Code:
[ 3961.638350] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrectable (Non-Fatal), type=Transaction Layer, (Receiver ID)
[ 3961.639195] pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00200000/00010000
[ 3961.640028] pcieport 0000:00:01.0:    [21] ACSViol                (First)
[ 3962.964161] pcieport 0000:00:01.0: broken device, retraining non-functional downstream link at 2.5GT/s
[ 3963.966168] pcieport 0000:00:01.0: retraining failed
[ 3965.141182] pcieport 0000:00:01.0: broken device, retraining non-functional downstream link at 2.5GT/s
[ 3966.143179] pcieport 0000:00:01.0: retraining failed
[ 3966.144005] vfio-pci 0000:02:00.0: not ready 1023ms after resume; waiting
[ 3967.189192] vfio-pci 0000:02:00.0: not ready 2047ms after resume; waiting
[ 3969.301191] vfio-pci 0000:02:00.0: not ready 4095ms after resume; waiting
[ 3973.909218] vfio-pci 0000:02:00.0: not ready 8191ms after resume; waiting
[ 3982.613247] vfio-pci 0000:02:00.0: not ready 16383ms after resume; waiting
[ 3999.509319] vfio-pci 0000:02:00.0: not ready 32767ms after resume; waiting
[ 4035.349495] vfio-pci 0000:02:00.0: not ready 65535ms after resume; giving up
[ 4035.350340] vfio-pci 0000:02:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ 4037.653485] vfio-pci 0000:02:00.0: not ready 1023ms after DPC; waiting
[ 4038.741363] vfio-pci 0000:02:00.0: not ready 2047ms after DPC; waiting
[ 4040.853375] vfio-pci 0000:02:00.0: not ready 4095ms after DPC; waiting
[ 4045.077517] vfio-pci 0000:02:00.0: not ready 8191ms after DPC; waiting
[ 4053.781494] vfio-pci 0000:02:00.0: not ready 16383ms after DPC; waiting
[ 4070.677628] vfio-pci 0000:02:00.0: not ready 32767ms after DPC; waiting
[ 4104.981770] vfio-pci 0000:02:00.0: not ready 65535ms after DPC; giving up
[ 4104.982576] pcieport 0000:00:01.0: AER: subordinate device reset failed
[ 4104.983401] pcieport 0000:00:01.0: AER: device recovery failed
[ 4104.984287] vfio-pci 0000:02:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ 4104.997337] pcieport 0000:00:01.0: DPC: containment event, status:0x1f11 source:0x0000
[ 4104.998180] pcieport 0000:00:01.0: DPC: unmasked uncorrectable error detected
[ 4104.999015] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrectable (Non-Fatal), type=Transaction Layer, (Receiver ID)
[ 4104.999890] pcieport 0000:00:01.0:   device [8086:4c01] error status/mask=00200000/00010000
[ 4105.000744] pcieport 0000:00:01.0:    [21] ACSViol                (First)
[ 4106.324652] pcieport 0000:00:01.0: broken device, retraining non-functional downstream link at 2.5GT/s
[ 4107.326775] pcieport 0000:00:01.0: retraining failed
[ 4108.501787] pcieport 0000:00:01.0: broken device, retraining non-functional downstream link at 2.5GT/s
[ 4109.504791] pcieport 0000:00:01.0: retraining failed
[ 4109.505618] vfio-pci 0000:02:00.0: not ready 1023ms after resume; waiting
[ 4110.549799] vfio-pci 0000:02:00.0: not ready 2047ms after resume; waiting
[ 4112.661803] vfio-pci 0000:02:00.0: not ready 4095ms after resume; waiting
[ 4117.269708] vfio-pci 0000:02:00.0: not ready 8191ms after resume; waiting
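
The log above repeats the same few failures in a loop (ACS violation on the root port, failed link retraining, the HBA never leaving D3cold). For anyone comparing their own output, here is a minimal Python sketch that counts those messages per PCI address. The function name `summarize_pcie_errors` and the category labels are my own, not part of any tool, and the matching is based only on the message strings seen in this log, so other kernels may word them differently.

```python
import re
from collections import Counter

# Matches lines such as:
# [ 3961.640028] pcieport 0000:00:01.0:    [21] ACSViol    (First)
LINE_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<driver>\S+)\s+"
    r"(?P<addr>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]):\s+(?P<msg>.*)$"
)

# Substring -> label. Labels are illustrative, chosen for this post only.
PATTERNS = [
    ("ACSViol", "ACS violation"),
    ("retraining failed", "link retraining failed"),
    ("giving up", "device not ready (gave up)"),
    ("D3cold to D0", "stuck in D3cold"),
]


def summarize_pcie_errors(lines):
    """Count known failure messages per (PCI address, label)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a "[timestamp] driver address: message" line
        for needle, label in PATTERNS:
            if needle in match.group("msg"):
                counts[(match.group("addr"), label)] += 1
                break
    return counts


if __name__ == "__main__":
    import sys

    # Example: dmesg | python3 this_script.py
    for (addr, label), n in sorted(summarize_pcie_errors(sys.stdin).items()):
        print(f"{addr}  {label}: {n}")
```

In my log this would show the root port 0000:00:01.0 reporting the ACS violations and retraining failures, and the HBA at 0000:02:00.0 timing out behind it.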

IOMMU device list
Code:
Group 0:        [8086:4c8b] [R] 00:02.0  VGA compatible controller                RocketLake-S GT1 [UHD Graphics 730]
Group 1:        [8086:4c53]     00:00.0  PCI bridge                               Device 4c53
Group 2:        [8086:4c01] [R] 00:01.0  PCI bridge                               Device 4c01
Group 3:        [8086:06f9]     00:12.0  Signal processing controller             Comet Lake PCH Thermal Controller
Group 4:        [8086:06ed]     00:14.0  USB controller                           Comet Lake USB 3.1 xHCI Host Controller
USB:            [048d:5702]              Bus 001 Device 004                       Integrated Technology Express, Inc. RGB LED Controller
USB:            [1d6b:0002]              Bus 001 Device 001                       Linux Foundation 2.0 root hub
USB:            [1d6b:0003]              Bus 002 Device 001                       Linux Foundation 3.0 root hub
                [8086:06ef]     00:14.2  RAM memory                               Comet Lake PCH Shared SRAM
Group 5:        [8086:06e0]     00:16.0  Communication controller                 Comet Lake HECI Controller
Group 6:        [8086:06d2]     00:17.0  SATA controller                          Comet Lake SATA AHCI Controller
Group 7:        [8086:06c2] [R] 00:1b.0  PCI bridge                               Device 06c2
Group 8:        [8086:06ac] [R] 00:1b.4  PCI bridge                               Comet Lake PCI Express Root Port #21
Group 9:        [8086:06bc] [R] 00:1c.0  PCI bridge                               Device 06bc
Group 10:       [8086:06b0] [R] 00:1d.0  PCI bridge                               Comet Lake PCI Express Root Port #9
                [8086:06b4] [R] 00:1d.4  PCI bridge                               Device 06b4
Group 11:       [8086:0684]     00:1f.0  ISA bridge                               H470 Chipset LPC/eSPI Controller
                [8086:f1c8]     00:1f.3  Audio device                             Device f1c8
                [8086:06a3]     00:1f.4  SMBus                                    Comet Lake PCH SMBus Controller
                [8086:06a4]     00:1f.5  Serial bus controller                    Comet Lake PCH SPI Controller
                [8086:0d4d]     00:1f.6  Ethernet controller                      Ethernet Connection (11) I219-V
Group 12:       [1000:0072] [R] 02:00.0  RAID bus controller                      SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon]
Group 13:       [10ec:8125] [R] 03:00.0  Ethernet controller                      RTL8125 2.5GbE Controller
Group 14:       [144d:a80a] [R] 04:00.0  Non-Volatile memory controller           NVMe SSD Controller PM9A1/PM9A3/980PRO
Group 15:       [144d:a80a] [R] 05:00.0  Non-Volatile memory controller           NVMe SSD Controller PM9A1/PM9A3/980PRO
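
The list above came from a helper script. If you only need the group-to-device mapping, a minimal Python sketch that reads it straight from sysfs could look like this. It assumes the standard Linux layout `/sys/kernel/iommu_groups/<group>/devices/<pci-address>`; the function name `list_iommu_groups` is my own, and it prints only PCI addresses, not the device names shown in my output.

```python
from pathlib import Path


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: [pci_address, ...]} as read from sysfs."""
    groups = {}
    root_path = Path(root)
    if not root_path.is_dir():
        # No such directory: IOMMU is likely disabled or unsupported.
        return groups
    for group_dir in root_path.iterdir():
        devices_dir = group_dir / "devices"
        if not group_dir.name.isdigit() or not devices_dir.is_dir():
            continue
        groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    # Sort numerically so group 2 comes before group 12.
    return dict(sorted(groups.items()))


if __name__ == "__main__":
    for group, devices in list_iommu_groups().items():
        print(f"Group {group}: {', '.join(devices)}")
```

A device sharing a group with other devices generally cannot be passed through on its own, so the HBA sitting alone in its group (as in group 12 here) is what you want to see.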



EDIT: After moving the HBA card from the top PCIe x16 slot to the bottom one and removing my two other PCIe devices, passthrough works just fine. I'm going to investigate further and am closing this thread for now.
 