ASUS X870E PCIe x8 Bifurcation Bug with RTX 5090 VFIO Passthrough

refocus_mutt820

Nov 11, 2025
Hi all,

I used AI to summarize what I have tried, the problem, etc...

Problem Statement​


Issue: The NVIDIA RTX 5090 negotiates PCIe x16 on cold boot but drops to x8 after the Function Level Reset (FLR) issued during VFIO passthrough to a VM. The ASUS X870E motherboard's PCIe controller appears to interpret the FLR as link instability and keeps the slot at x8 until the next cold boot.


Impact: 50% PCIe bandwidth loss in VM (x8 instead of x16)
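
As a rough check on the 50% figure, here is a minimal sketch of the per-direction bandwidth arithmetic. It assumes 128b/130b line encoding (Gen 3 through Gen 5) and ignores packet and protocol overhead, so real throughput will be lower. The function name pcie_bandwidth_gbs is just for illustration.

```shell
#!/usr/bin/env bash
# Approximate one-direction PCIe bandwidth in GB/s.
# Assumption: 128b/130b encoding (Gen 3 to Gen 5), no protocol overhead.
#   $1 = per-lane rate in GT/s (e.g. 32 for Gen 5, 16 for Gen 4)
#   $2 = lane count (e.g. 16 or 8)
pcie_bandwidth_gbs() {
    local gts="$1" lanes="$2"
    awk -v g="$gts" -v l="$lanes" \
        'BEGIN { printf "%.1f\n", g * l * 128 / 130 / 8 }'
}
```

Calling pcie_bandwidth_gbs 32 16 and then pcie_bandwidth_gbs 32 8 shows the halving at Gen 5. The same ratio holds at any link speed, since only the lane count changes.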

System Specifications​


  • Motherboard: ASUS X870E (AM5, Ryzen 9000 Series)
  • CPU: AMD Ryzen 9 9950X3D
  • GPU: NVIDIA GeForce RTX 5090 (Device ID: 10de:2b85)
  • Host OS: Proxmox VE 8.x
  • Kernel: 6.17.4-1-pve
  • PCIe Slot: PCIEX16_1 (CPU-attached, Gen 5 capable)

System Information for Reference
Code:
# Kernel version
$ uname -r
6.17.4-1-pve

# GPU PCIe info
$ lspci -nn -s 01:00.0
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:2b85] (rev a1)

# IOMMU groups
$ find /sys/kernel/iommu_groups/ -type l | grep 01:00
/sys/kernel/iommu_groups/14/devices/0000:01:00.0
/sys/kernel/iommu_groups/14/devices/0000:01:00.1

# Current vfio bindings
$ lspci -k -s 01:00.0
01:00.0 VGA compatible controller: NVIDIA Corporation Device 2b85 (rev a1)
    Kernel driver in use: vfio-pci
    Kernel modules: nvidiafb, nouveau

# Secure Boot status
$ mokutil --sb-state
SecureBoot enabled   # note: this puts the kernel in lockdown, which blocks setpci writes
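
Since the setpci failures are attributed to lockdown, it may help to confirm the active lockdown mode directly. This is a minimal sketch, assuming securityfs is mounted at /sys/kernel/security and the file is readable (it may need root). The helper name lockdown_mode is hypothetical.

```shell
#!/usr/bin/env bash
# Print the active kernel lockdown mode.
# The kernel marks the active mode with brackets, for example:
#   none [integrity] confidentiality
# $1 = path to the lockdown file (defaults to the usual securityfs location)
lockdown_mode() {
    local file="${1:-/sys/kernel/security/lockdown}"
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "$file"
}
```

Running lockdown_mode with no arguments on the host prints none, integrity, or confidentiality. Anything other than none would be consistent with the "Operation not permitted" errors from setpci.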

BIOS Settings​

Code:
PCIEX16_1 Link Speed: Gen 4 (Forced; testing showed Gen 5 has the same issue)
NATIVE ASPM: Disabled
CPU PCIE ASPM MODE CONTROL: Auto (also tried Disabled)
Clock Spread Spectrum: Auto (also tried Disabled)
SR-IOV: Enabled
IOMMU: Enabled
PCIEX16_1: Forced x16 (to no effect)
M.2_2 / M.2_3: Gen 1 / Disabled (where possible)


Diagnostic Evidence​


Link State Behavior​


Cold Boot (Host):
Code:
$ lspci -s 01:00.0 -vv | grep -P "LnkCap:|LnkSta:"
LnkCap: Port #0, Speed 32GT/s, Width x16, ASPM L1, Exit Latency L1 unlimited
LnkSta: Speed 32GT/s, Width x16  ← CORRECT


After Windows VM Start:
Code:
$ lspci -s 01:00.0 -vv | grep -P "LnkCap:|LnkSta:"
LnkCap: Port #0, Speed 32GT/s, Width x16, ASPM L1, Exit Latency L1 <4us
LnkSta: Speed 32GT/s, Width x8 (downgraded)  ← FAILED
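
Key Observation: The L1 Exit Latency in LnkCap changes from unlimited to <4us. LnkCap holds values the device advertises, so this shows that the GPU's advertised capabilities changed across the reset. It may indicate that the link passed through ASPM L1 during the reset, but it does not prove it.

Kernel Logs During Reset
Code: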
[   48.247349] vfio-pci 0000:01:00.0: enabling device (0002 -> 0003)
[   48.247435] vfio-pci 0000:01:00.0: resetting
[   48.348435] vfio-pci 0000:01:00.0: reset done  ← 100ms reset, clean


The reset itself completes successfully, but link re-negotiation results in x8.
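
To capture the width change without parsing lspci output, the negotiated and maximum widths can also be read from sysfs. The sketch below assumes the GPU sits at 0000:01:00.0 as in this post. It also prints the kernel's reset methods for the device, which would help confirm whether the reset in the log above is really an FLR. The reset_method attribute is only present on newer kernels, so treat that part as an assumption. Both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Report negotiated vs. maximum link width for one PCI device.
# $1 = sysfs device directory (defaults to the GPU address from this post)
link_state() {
    local dev="${1:-/sys/bus/pci/devices/0000:01:00.0}"
    local cur max
    cur="$(cat "$dev/current_link_width")" || return 1
    max="$(cat "$dev/max_link_width")" || return 1
    if [ "$cur" -lt "$max" ]; then
        echo "DOWNGRADED: x$cur (max x$max)"
    else
        echo "OK: x$cur"
    fi
}

# List the reset methods the kernel will try for the device, in order.
# Assumption: the reset_method attribute exists (newer kernels only).
reset_methods() {
    local dev="${1:-/sys/bus/pci/devices/0000:01:00.0}"
    if [ -r "$dev/reset_method" ]; then
        cat "$dev/reset_method"
    else
        echo "unavailable"
    fi
}
```

Running link_state once after a cold boot and again after the VM starts should show the transition from OK to DOWNGRADED if the behavior matches the lspci output above.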


Cold Boot vs. Hot Reset Behavior

  • Host boot (lspci): LnkSta: Speed 32GT/s (or 16GT/s), Width x16 → correct
  • VM start (GPU-Z): Bus Interface: PCIe x16 4.0 @ x8 4.0 → failed

ACS Override and Windows Driver Attempts

  • pcie_acs_override=downstream: Host crashed due to IOMMU grouping
  • pcie_acs_override=downstream,multifunction: Stable, but causes the x8 split
  • Disabled and re-enabled the device in Windows Device Manager: Stuck at x8
  • Changed Windows Power Management to "Off": Stuck at x8


Troubleshooting Attempts (All Failed to Prevent x8)​


1. Signal Integrity Testing​


  • Forced Gen 3 in BIOS: Still dropped to x8 (suggests the cause is logic or firmware, not signal noise)
  • Disabled Spread Spectrum: No change
  • Result: A physical-layer cause looks unlikely

2. VBIOS ROM Handling​


  • Dumped a clean VBIOS and passed it via romfile=rtx5090.rom to bypass ROM read latency
  • Result: Still x8

3. Kernel Parameters Tested
Code:
# Tried individually and in combination:
pcie_acs_override=downstream,multifunction  # Caused x8 split
pcie_aspm=off                                # Already in use
pci=nocrs,noaer                             # No effect
pcie_port_pm=off                            # No effect
pci=pcie_bus_perf                           # Not yet tested
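
Because a parameter can silently fail to apply (for example, if the wrong bootloader config was edited), it is worth confirming each one against the running kernel's command line. This is a minimal sketch; the helper name has_kernel_param is hypothetical, and it only matches whole space-separated tokens.

```shell
#!/usr/bin/env bash
# Return success if the exact parameter appears on the kernel command line.
# $1 = parameter to look for, e.g. pcie_aspm=off
# $2 = command-line file (defaults to /proc/cmdline)
has_kernel_param() {
    local param="$1" file="${2:-/proc/cmdline}"
    # Split on spaces and require a whole-line, fixed-string match.
    tr ' ' '\n' < "$file" | grep -qxF -- "$param"
}
```

For example, has_kernel_param pcie_aspm=off && echo active prints active only when the parameter really reached the booted kernel.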

4. PCIe Configuration Locking Scripts​


Code:
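setpci -s 01:00.0 CAP_EXP+0x10.W=0x0000  # Link Control: disables ASPM, but also clears every other bit in the register
setpci -s 01:00.0 CAP_PM+0x04.W=0x0000   # PMCSR: force D0 state (also clears the other PMCSR bits)
setpci -s 01:00.0 CAP_EXP+0x30.W=0x0010  # Intended to lock width to x16; 0x30 is Link Control 2, which has a Target Link Speed field (bits 3:0) but no width field, and 0x0010 sets bit 4 (Enter Compliance)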
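
The first two writes above overwrite whole 16-bit registers. A gentler variant uses setpci's value:mask form so that only the intended bits change. The sketch below only prints the commands rather than running them; the function name masked_writes is hypothetical, and on a host in lockdown the writes would presumably still be refused.

```shell
#!/usr/bin/env bash
# Print masked setpci writes that touch only the intended bits.
# $1 = PCI address (defaults to the GPU address from this post)
masked_writes() {
    local dev="${1:-01:00.0}"
    # Link Control (CAP_EXP+0x10): clear only ASPM Control, bits 1:0.
    echo "setpci -s $dev CAP_EXP+0x10.W=0x0000:0x0003"
    # PMCSR (CAP_PM+0x04): set only PowerState, bits 1:0, to D0.
    echo "setpci -s $dev CAP_PM+0x04.W=0x0000:0x0003"
}
```

Review the printed commands first, then pipe them to sh as root if they look right for your device.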



Questions for the Community​


  1. Has anyone successfully worked around PCIe bifurcation on ASUS X870E boards with Gen 5 GPUs?
  2. Is there a way to completely disable vfio-pci device resets? We've tried:
    • Module parameters: disable_vga=1 disable_idle_d3=1 nointxmask=1
    • Module wrapper scripts
    • Kernel parameters to disable PCIe reset mechanisms
  3. Can QEMU be configured to skip device resets entirely during passthrough? Are there hostpci parameters or machine-type options that prevent FLR?
  4. Secure Boot Lockdown: Our system is in lockdown mode (evidenced by "Operation not permitted" on setpci). Is there a way to disable PCIe config space protection while maintaining Secure Boot?