ASUS X870E PCIe x8 Bifurcation Bug with RTX 5090 VFIO Passthrough

refocus_mutt820

New Member
Nov 11, 2025
Hi all,

I used AI to summarize the problem and what I have tried so far.

Problem Statement


Issue: NVIDIA RTX 5090 GPU negotiates PCIe x16 on cold boot but downgrades to x8 after Function Level Reset (FLR) during VFIO passthrough to a VM. The ASUS X870E motherboard's PCIe controller incorrectly interprets the FLR as link instability and permanently bifurcates the slot to x8 until the next cold boot.


Impact: 50% PCIe bandwidth loss in VM (x8 instead of x16)
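
To put a number on the 50% figure, here is a rough sketch of the theoretical per-direction bandwidth at each width. The helper name `pcie_bw_gbs` is made up for illustration, and the result ignores protocol overhead beyond 128b/130b line encoding, so real throughput will be lower.

```shell
# Theoretical PCIe bandwidth in GB/s (one direction) for Gen 3 to Gen 5 links.
# $1 = transfer rate per lane in GT/s, $2 = lane count.
# pcie_bw_gbs is a hypothetical helper name, not an existing tool.
pcie_bw_gbs() {
    LC_ALL=C awk -v gts="$1" -v lanes="$2" \
        'BEGIN { printf "%.1f\n", gts * lanes * 128 / 130 / 8 }'
}

pcie_bw_gbs 32 16   # Gen 5 x16 -> 63.0
pcie_bw_gbs 32 8    # Gen 5 x8  -> 31.5
```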

System Specifications


  • Motherboard: ASUS X870E (AM5, Ryzen 9000 Series)
  • CPU: AMD Ryzen 9 9950X3D
  • GPU: NVIDIA GeForce RTX 5090 (Device ID: 10de:2b85)
  • Host OS: Proxmox VE 8.x
  • Kernel: 6.17.4-1-pve
  • PCIe Slot: PCIEX16_1 (CPU-attached, Gen 5 capable)

System Information for Reference
Code:
# Kernel version
$ uname -r
6.17.4-1-pve

# GPU PCIe info
$ lspci -nn -s 01:00.0
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:2b85] (rev a1)

# IOMMU groups
$ find /sys/kernel/iommu_groups/ -type l | grep 01:00
/sys/kernel/iommu_groups/14/devices/0000:01:00.0
/sys/kernel/iommu_groups/14/devices/0000:01:00.1

# Current vfio bindings
$ lspci -k -s 01:00.0
01:00.0 VGA compatible controller: NVIDIA Corporation Device 2b85 (rev a1)
    Kernel driver in use: vfio-pci
    Kernel modules: nvidiafb, nouveau

# Secure Boot status (enabled, which puts the kernel in lockdown and blocks setpci writes)
$ mokutil --sb-state
SecureBoot enabled
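
For anyone comparing their own grouping, this is a small sketch that prints every IOMMU group with its devices, so it is easy to see whether 01:00.0 and 01:00.1 share a group with anything else. The function name `list_iommu_groups` and its optional root argument are my own additions for illustration.

```shell
# Print "group N: <PCI address>" for every device in every IOMMU group.
# $1 optionally overrides the sysfs root (hypothetical parameter, handy for testing).
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for link in "$root"/*/devices/*; do
        [ -e "$link" ] || continue          # skip the unexpanded glob if nothing matches
        group=${link#"$root"/}              # strip the root prefix
        group=${group%%/*}                  # keep only the group number
        echo "group $group: ${link##*/}"
    done
}

# Usage: list_iommu_groups | grep 01:00
```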

BIOS Settings

Code:
PCIEX16_1 Link Speed: Gen 4 (Forced - testing showed Gen 5 has same issue)
NATIVE ASPM: Disabled
CPU PCIE ASPM MODE CONTROL: Auto (tried Disabled)
Clock Spread Spectrum: Auto (tried Disabled)
SR-IOV: Enabled
IOMMU: Enabled
PCIE16x1: Forced x16 (to no effect)
M.2_2 / M.2_3: Gen 1 / Disabled (where possible)


Diagnostic Evidence

Link State Behavior


Cold Boot (Host):
Code:
$ lspci -s 01:00.0 -vv | grep -P "LnkCap:|LnkSta:"
LnkCap: Port #0, Speed 32GT/s, Width x16, ASPM L1, Exit Latency L1 unlimited
LnkSta: Speed 32GT/s, Width x16  ← CORRECT


After Windows VM Start:
Code:
$ lspci -s 01:00.0 -vv | grep -P "LnkCap:|LnkSta:"
LnkCap: Port #0, Speed 32GT/s, Width x16, ASPM L1, Exit Latency L1 <4us
LnkSta: Speed 32GT/s, Width x8 (downgraded)  ← FAILED

Key Observation: Exit Latency changes from "unlimited" to "<4us" after the VM starts, which suggests the link passed through the ASPM L1 state during the reset.

Kernel Logs During Reset
Code:
[   48.247349] vfio-pci 0000:01:00.0: enabling device (0002 -> 0003)
[   48.247435] vfio-pci 0000:01:00.0: resetting
[   48.348435] vfio-pci 0000:01:00.0: reset done  ← 100ms reset, clean


The reset itself completes successfully, but link re-negotiation results in x8.
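
To catch the exact moment the width drops, the negotiated width can also be read straight from sysfs before and after starting the VM. This is only a sketch: `link_state` is a name I made up, and `SYSFS_PCI` is a hypothetical override so the function can be pointed at a different tree. The `current_link_width`, `max_link_width` and `current_link_speed` attributes are the standard kernel ones.

```shell
# Report negotiated vs. maximum PCIe link width for one device.
# $1 = full PCI address, e.g. 0000:01:00.0
link_state() {
    dev="$1"
    base="${SYSFS_PCI:-/sys/bus/pci/devices}/$dev"
    cur=$(cat "$base/current_link_width" 2>/dev/null) || return 1
    max=$(cat "$base/max_link_width" 2>/dev/null) || return 1
    spd=$(cat "$base/current_link_speed" 2>/dev/null)
    if [ "$cur" -lt "$max" ]; then
        status="DOWNGRADED"
    else
        status="OK"
    fi
    echo "$dev width x$cur/x$max speed ${spd:-unknown} $status"
}

# Usage: run once after cold boot and again after the VM has started:
#   link_state 0000:01:00.0
```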


Code:
A. Cold Boot vs. Hot Reset Behavior

Host Boot (lspci): LnkSta: Speed 32GT/s (or 16GT/s), Width x16 -> CORRECT
VM Start (GPU-Z): Bus Interface: PCIe x16 4.0 @ x8 4.0 -> FAILED

B. Troubleshooting Performed (All Failed)

VBIOS: Dumped clean VBIOS (romfile=rtx5090.bin) to bypass ROM reading latency. Result: Still x8.

Kernel Flags:
  • Tried pcie_acs_override=downstream (Host crashed due to IOMMU grouping).
  • Forced pcie_acs_override=downstream,multifunction (Stable, but causes x8 split).

Signal Integrity:
  • Forced Gen 3 in BIOS. Result: Still dropped to x8. (Proves this is logic/firmware, not signal noise).
  • Disabled Spread Spectrum. Result: No change.

Windows Driver:
  • Disabled/Re-enabled device in Device Manager. Result: Stuck at x8.
  • Changed Power Management to "Off". Result: Stuck at x8.


Troubleshooting Attempts (All Failed to Prevent x8)

1. Signal Integrity Testing


  • Forced Gen 3 in BIOS: Still dropped to x8 (proves not signal noise)
  • Disabled Spread Spectrum: No change
  • Result: Rules out physical layer issues

2. VBIOS ROM Handling


  • Dumped clean VBIOS to romfile=rtx5090.rom to bypass ROM read latency
  • Result: Still x8

3. Kernel Parameters Tested
Code:
# Tried individually and in combination:
pcie_acs_override=downstream,multifunction  # Caused x8 split
pcie_aspm=off                                # Already in use
pci=nocrs,noaer                             # No effect
pcie_port_pm=off                            # No effect
pci=pcie_bus_perf                           # Not yet tested

4. PCIe Configuration Locking Scripts


Code:
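setpci -s 01:00.0 CAP_EXP+0x10.W=0x0000  # Link Control: disables ASPM (bits 1:0), but also clears every other bit in the register
setpci -s 01:00.0 CAP_PM+0x04.W=0x0000   # PMCSR: force D0 state
setpci -s 01:00.0 CAP_EXP+0x30.W=0x0010  # Link Control 2: intended to lock x16, but this register has no width field (bits 3:0 are target speed, bit 4 is Enter Compliance)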
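
As a sanity check on the last write, here is a rough decoder for a Link Control 2 value, assuming the standard PCIe layout where bits 3:0 are Target Link Speed (1 = 2.5GT/s up to 5 = 32GT/s) and bit 4 is Enter Compliance. `decode_lnkctl2` is a made-up helper name and only covers those two fields.

```shell
# Decode two fields of a PCIe Link Control 2 value.
# $1 = 16-bit value as hex digits without the 0x prefix, e.g. 0010
decode_lnkctl2() {
    val=$((0x$1))
    speed=$((val & 0xF))            # bits 3:0: Target Link Speed
    compliance=$(((val >> 4) & 1))  # bit 4: Enter Compliance
    echo "target_speed=$speed enter_compliance=$compliance"
}

decode_lnkctl2 0010   # the value written above
```

If that layout holds, 0x0010 sets Enter Compliance and leaves the target speed at 0, so it would not lock the width even without the Secure Boot lockdown.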



Questions for the Community


  1. Has anyone successfully worked around PCIe bifurcation on ASUS X870E boards with Gen 5 GPUs?
  2. Is there a way to completely disable vfio-pci device resets? We've tried:
    • Module parameters: disable_vga=1 disable_idle_d3=1 nointxmask=1
    • Module wrapper scripts
    • Kernel parameters to disable PCIe reset mechanisms
  3. Can QEMU be configured to skip device resets entirely during passthrough? Are there hostpci parameters or machine-type options that prevent FLR?
  4. Secure Boot Lockdown: Our system is in lockdown mode (evidenced by "Operation not permitted" on setpci). Is there a way to disable PCIe config space protection while maintaining Secure Boot?
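
One thing related to questions 2 and 3 that may be worth checking: kernels since 5.15 expose a per-device `reset_method` attribute in sysfs that lists which reset mechanisms the kernel will use and in what order. The sketch below only reads it. `show_reset_methods` is a made-up name and `SYSFS_PCI` is a hypothetical override. Whether changing the reset method avoids the x8 drop on this board is untested, and I don't know whether lockdown blocks the write.

```shell
# Show which reset methods the kernel will use for a device, in order.
# $1 = full PCI address, e.g. 0000:01:00.0
show_reset_methods() {
    dev="$1"
    f="${SYSFS_PCI:-/sys/bus/pci/devices}/$dev/reset_method"
    if [ -r "$f" ]; then
        echo "$dev: $(cat "$f")"
    else
        echo "$dev: reset_method not available"
    fi
}

# Usage (as root):
#   show_reset_methods 0000:01:00.0
# Untested ideas, written as comments on purpose:
#   echo bus > /sys/bus/pci/devices/0000:01:00.0/reset_method   # use bus reset only
#   echo > /sys/bus/pci/devices/0000:01:00.0/reset_method       # disable resets for this function
```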
 
I found a fix, but it only keeps the x16 width if I pass through just the GPU's video function and leave out its audio function.

I.e.:
hostpci0: 0000:01:00.0,pcie=1,x-vga=1 (keep)
hostpci1: 0000:01:00.1,pcie=1 (remove)

Does anyone know how to keep both audio and video?
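
One variant that might be worth trying: Proxmox accepts a shortened address without the function number, which passes all functions of the device together as one multifunction device (the same as ticking "All Functions" in the web UI). The sketch below wraps the two `qm set` calls in a function. `apply_all_functions` is a made-up name and the VMID is a placeholder. I have not verified that this keeps x16, it only changes how the two functions are handed to the guest.

```shell
# Replace the separate audio entry with a single all-functions passthrough entry.
# $1 = VMID (placeholder, use your own)
apply_all_functions() {
    vmid="$1"
    qm set "$vmid" --delete hostpci1                        # drop the separate 01:00.1 entry
    qm set "$vmid" --hostpci0 0000:01:00,pcie=1,x-vga=1     # no ".0" means all functions
}

# Usage: apply_all_functions 100
```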
 
I’m also having this issue with the same setup (9950X3D, 5090, X870E). It’s causing my GPU to die under stress. Did you find a fix?

Edit: removing my vertical GPU mount fixed the issue; it seems the mount had degraded.
 