Problems with PCIe passthrough with two identical devices

Leseratte10

Jun 15, 2024
I have a machine with two identical SATA controllers, and I'd like to pass both of them through to the same VM.

Each controller is in its own IOMMU group, with no other device in that group:
IOMMU group 21 5a:00.0 SATA controller [0106]: ASMedia Technology Inc. Device [1b21:1164] (rev 02)
IOMMU group 22 5b:00.0 SATA controller [0106]: ASMedia Technology Inc. Device [1b21:1164] (rev 02)
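For anyone wanting to reproduce a listing like the one above, here is a minimal sketch that walks /sys/kernel/iommu_groups. The function name list_iommu_groups and its optional path argument are made up for illustration (the argument only exists so the function can be pointed at a test directory). It prints bare PCI addresses, not the device descriptions lspci would add.

```shell
# List every PCI device together with its IOMMU group number.
# $1 (hypothetical, optional): alternative root instead of /sys/kernel/iommu_groups.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # glob did not match: no IOMMU groups exposed
        group="${dev%/devices/*}"      # strip the trailing "/devices/<address>"
        group="${group##*/}"           # keep only the group number
        printf 'IOMMU group %s %s\n' "$group" "${dev##*/}"
    done
}
```

Calling list_iommu_groups without arguments reads the live sysfs tree. A controller that shares its group with other devices would show up as several lines carrying the same group number.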

lspci -vv reports "Kernel driver in use: vfio-pci" for both controllers, because I have "options vfio-pci ids=1b21:1164" and "softdep ahci pre: vfio-pci" in the modprobe.d config folder, so the correct driver is loaded for both of them. The Proxmox root drive is (obviously) not connected to either of these controllers.
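The driver binding can also be checked without lspci, by reading the "driver" symlink in each device's sysfs directory. This is only a sketch: bound_driver and its optional second argument are hypothetical names, and the second argument exists purely so the function can be tested against a fake directory.

```shell
# Print the name of the driver bound to a PCI device, or "none".
# $1: full PCI address, e.g. 0000:5a:00.0
# $2 (hypothetical, optional): alternative root instead of /sys/bus/pci/devices
bound_driver() {
    link="${2:-/sys/bus/pci/devices}/$1/driver"
    if [ -L "$link" ]; then
        target=$(readlink "$link")        # e.g. ../../../bus/pci/drivers/vfio-pci
        printf '%s\n' "${target##*/}"     # keep only the driver name
    else
        printf 'none\n'                   # no driver bound (or no such device)
    fi
}
```

With the modprobe.d settings quoted above, both bound_driver 0000:5a:00.0 and bound_driver 0000:5b:00.0 should report vfio-pci rather than ahci.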

I tried the "All Functions", "ROM-Bar" and "PCI-Express" checkboxes in various combinations, but no matter what I do, it doesn't work.

I can pass through the 0000:5a:00.0 device just fine: I enable it, boot the VM, and the VM sees the device. However, when I add the 0000:5b:00.0 device as well (or even when I remove the 5a passthrough, reboot the system and pass through only the 5b device), everything breaks:

Code:
[   83.531263] pcieport 0000:00:1d.2: broken device, retraining non-functional downstream link at 2.5GT/s
[   84.533259] pcieport 0000:00:1d.2: retraining failed
[   85.703265] pcieport 0000:00:1d.2: broken device, retraining non-functional downstream link at 2.5GT/s
[   86.705260] pcieport 0000:00:1d.2: retraining failed
[   86.705268] vfio-pci 0000:5b:00.0: not ready 1023ms after bus reset; waiting
[   87.751268] vfio-pci 0000:5b:00.0: not ready 2047ms after bus reset; waiting
[   89.863257] vfio-pci 0000:5b:00.0: not ready 4095ms after bus reset; waiting
[   94.023267] vfio-pci 0000:5b:00.0: not ready 8191ms after bus reset; waiting
[  102.727253] vfio-pci 0000:5b:00.0: not ready 16383ms after bus reset; waiting
[  119.623476] vfio-pci 0000:5b:00.0: not ready 32767ms after bus reset; waiting
[  152.904185] vfio-pci 0000:5b:00.0: not ready 65535ms after bus reset; giving up
[  153.996178] pcieport 0000:00:1d.2: broken device, retraining non-functional downstream link at 2.5GT/s
[  154.997196] pcieport 0000:00:1d.2: retraining failed
[  156.168222] pcieport 0000:00:1d.2: broken device, retraining non-functional downstream link at 2.5GT/s
[  157.170237] pcieport 0000:00:1d.2: retraining failed
[  157.170244] vfio-pci 0000:5b:00.0: not ready 1023ms after bus reset; waiting
[  158.216263] vfio-pci 0000:5b:00.0: not ready 2047ms after bus reset; waiting
[  160.328295] vfio-pci 0000:5b:00.0: not ready 4095ms after bus reset; waiting
[  164.680308] vfio-pci 0000:5b:00.0: not ready 8191ms after bus reset; waiting
[  173.384298] vfio-pci 0000:5b:00.0: not ready 16383ms after bus reset; waiting
[  190.280283] vfio-pci 0000:5b:00.0: not ready 32767ms after bus reset; waiting

My GRUB kernel command line is "intel_iommu=on iommu=pt pcie_acs_override=downstream,multifunction".

The output of "lspci -vv" for the two controllers is, with the exception of the IOMMU group and the memory region addresses, absolutely identical. The device 0000:00:1d.2 it complains about is the PCI bridge:

IOMMU group 14 00:1d.2 PCI bridge [0604]: Intel Corporation Device [8086:51b2] (rev 01)
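To confirm which bridge sits directly above a given controller, and what link that bridge currently reports, something like the following could be used. It is a rough sketch: upstream_port and link_status are hypothetical helper names, the optional last argument is only there for testing, and it assumes readlink -f is available and that the kernel exposes current_link_speed and current_link_width for the port.

```shell
# Print the PCI address of whatever sits directly upstream of a device.
# $1: full PCI address; $2 (hypothetical, optional): alternative sysfs device root.
upstream_port() {
    real=$(readlink -f "${2:-/sys/bus/pci/devices}/$1") || return 1
    parent="${real%/*}"             # directory one level up in the device tree
    printf '%s\n' "${parent##*/}"   # a root-bus device yields e.g. "pci0000:00" here
}

# Print the negotiated link speed and width reported for a device or port.
# Fields are left empty when the attribute files are missing.
link_status() {
    dir="${2:-/sys/bus/pci/devices}/$1"
    printf '%s: speed=%s width=%s\n' "$1" \
        "$(cat "$dir/current_link_speed" 2>/dev/null)" \
        "$(cat "$dir/current_link_width" 2>/dev/null)"
}
```

If 00:1d.2 really is the port above 5b:00.0, as the dmesg lines suggest, upstream_port 0000:5b:00.0 should print that address. Comparing link_status for the ports above 5a and 5b before and after a failed reset might show whether the link behind 1d.2 has dropped.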

Code:
5b:00.0 SATA controller: ASMedia Technology Inc. Device 1164 (rev 02) (prog-if 01 [AHCI 1.0])
    Subsystem: ASMedia Technology Inc. Device 2116
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 255
    IOMMU group: 22
    Region 0: Memory at 85882000 (32-bit, non-prefetchable) [size=8K]
    Region 5: Memory at 85880000 (32-bit, non-prefetchable) [size=8K]
    Expansion ROM at 85800000 [disabled] [size=512K]
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D3 NoSoftRst- PME-Enable+ DSel=0 DScale=0 PME-
    Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
        Address: 0000000000000000  Data: 0000
    Capabilities: [80] Express (v2) Endpoint, MSI 00
        DevCap:    MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 25W
        DevCtl:    CorrErr- NonFatalErr- FatalErr- UnsupReq-
            RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 256 bytes, MaxReadReq 256 bytes
        DevSta:    CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
        LnkCap:    Port #0, Speed 8GT/s, Width x2, ASPM L0s L1, Exit Latency L0s <4us, L1 <64us
            ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl:    ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
            ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
        LnkSta:    Speed 8GT/s, Width x2
            TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Not Supported, TimeoutDis- NROPrPrP- LTR-
             10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
             EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
             FRS- TPHComp- ExtTPHComp-
             AtomicOpsCap: 32bit- 64bit- 128bitCAS-
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 10BitTagReq- OBFF Disabled,
             AtomicOpsCtl: ReqEn-
        LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
        LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis+
             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
             Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
             EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
             Retimer- 2Retimers- CrosslinkRes: unsupported
    Capabilities: [100 v1] Advanced Error Reporting
        UESta:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt:    DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
        CEMsk:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
        AERCap:    First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
            MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
        HeaderLog: 00000000 00000000 00000000 00000000
    Capabilities: [130 v1] Secondary PCI Express
        LnkCtl3: LnkEquIntrruptEn- PerformEqu-
        LaneErrStat: 0
    Kernel driver in use: vfio-pci
    Kernel modules: ahci

I tried changing the reset method to "bus" by writing to /sys/bus/pci/devices/xxx/reset_method, but that didn't change anything either. (By default the file just contains "pm bus", without the "flr" that I've seen in other posts.) Not that I'd expected it to, since the other controller works just fine ...
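For reference, reading and changing the reset method can be wrapped roughly as follows. The helper names show_reset_methods and set_reset_methods are made up, and the optional last argument is only a testing aid. The sketch assumes a kernel that provides the reset_method attribute, which the "pm bus" content mentioned above implies.

```shell
# Show the reset methods the kernel will try for a device, in order.
# $1: full PCI address; $2 (hypothetical, optional): alternative sysfs device root.
show_reset_methods() {
    cat "${2:-/sys/bus/pci/devices}/$1/reset_method"
}

# Restrict a device to the given reset methods (space-separated, e.g. "bus").
# $1: full PCI address; $2: method list; $3 (hypothetical, optional): alternative root.
set_reset_methods() {
    printf '%s\n' "$2" > "${3:-/sys/bus/pci/devices}/$1/reset_method"
}
```

Running show_reset_methods for both 0000:5a:00.0 and 0000:5b:00.0 makes it easy to confirm that the two controllers really do advertise the same methods. Writing requires root.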

Is there anything I'm missing? How can I get this second SATA controller passed through to a VM as well? If it were a general incompatibility with the controller, why would it work with one of them?

EDIT: I can reproduce the reset issue manually as well.

Code:
echo 1 > /sys/bus/pci/drivers/vfio-pci/0000\:5a\:00.0/reset
echo 1 > /sys/bus/pci/drivers/vfio-pci/0000\:5b\:00.0/reset

The first command (for the 5a device) works and returns immediately; the second command (for the 5b device) hangs for a minute and then fails with "Inappropriate ioctl for device".
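To make the difference between the two controllers easier to record, the manual reset can be wrapped in a small helper that reports whether the write succeeded and how long it took. This is only a sketch with a made-up name, try_reset, and an optional second argument for testing. It goes through /sys/bus/pci/devices/, which leads to the same device directory as the vfio-pci driver path used above.

```shell
# Trigger a reset of a PCI device and report the outcome and the elapsed time.
# $1: full PCI address; $2 (hypothetical, optional): alternative sysfs device root.
try_reset() {
    file="${2:-/sys/bus/pci/devices}/$1/reset"
    start=$(date +%s)
    if echo 1 > "$file"; then
        status=ok
    else
        status=failed          # the shell prints the underlying error on stderr
    fi
    end=$(date +%s)
    printf '%s: reset %s after %ss\n' "$1" "$status" "$((end - start))"
}
```

Running try_reset 0000:5a:00.0 followed by try_reset 0000:5b:00.0 as root should reproduce the behaviour described above, with the second call blocking until the kernel gives up on the device.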

EDIT 2: It looks like other people with the exact same device (DXP8800) have the same issue with the 2nd controller ... are there any BIOS settings I should look for? The first controller works just fine ...
 