RTX 5070 Ti GPU passthrough fails in Proxmox – stuck in D3 / FLR reset loop → device becomes unresponsive (ffff)

sfesenko003

New Member
May 18, 2026
Hi everyone, I’m trying to get GPU passthrough working in Proxmox VE, but I’m hitting a consistent reset failure / power state issue with my NVIDIA RTX 5070 Ti.
I’ve done extensive troubleshooting and would really appreciate help confirming whether this is a known limitation, firmware bug, or if there’s anything else I can try.

System details

  • GPU: NVIDIA RTX 5070 Ti (GB203)
  • Motherboard: MSI MAG B860 Tomahawk
  • CPU: (Intel, VT-d enabled)
  • Proxmox: 9.1.1
  • Kernel: 6.17.2-1-pve
  • QEMU: 10.1.2
  • VM: Windows 11 (OVMF / Q35)

Current VM config (relevant parts)
hostpci0: 0000:02:00.0,pcie=1,x-vga=1
hostpci1: 0000:02:00.1,pcie=1
cpu: host (with kvm=off set via QEMU)

After a cold boot, the GPU is healthy: setpci -s 0000:02:00.0 0x00.w 0x02.w returns 10de 2c05
On a fresh Windows install, I once saw the NVIDIA GPU in Device Manager

As soon as I start the VM:
1. GPU goes into D3 / low power state
2. VFIO attempts reset
3. GPU fails to recover
4. Eventually becomes unresponsive: setpci → ffff ffff

The only recovery is a full power cycle (PSU off).
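
The healthy-versus-dead check described above can be scripted so it is easy to repeat before and after each VM start. This is a minimal sketch, assuming setpci (from pciutils) is installed and run as root; the function names classify_vendor_id and gpu_state are made up for illustration.

```shell
#!/bin/sh
# Classify a 16-bit vendor ID as read from PCI config space offset 0x00.
# A read of all ones (ffff) means the device no longer answers config cycles.
# An empty value (setpci failed or found no device) is treated the same way.
classify_vendor_id() {
    case "$1" in
        ffff|FFFF|"") echo "unresponsive" ;;
        *)            echo "responsive" ;;
    esac
}

# Hypothetical helper: query a device by PCI address (default: the GPU
# address from this thread) and print its state.
gpu_state() {
    bdf="${1:-0000:02:00.0}"
    id="$(setpci -s "$bdf" 0x00.w 2>/dev/null)"
    classify_vendor_id "$id"
}
```

Running gpu_state 0000:02:00.0 should print responsive while the card still returns 10de, and unresponsive once reads come back as ffff.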

dmesg logs (key errors)
pcieport 0000:00:06.0: Data Link Layer Link Active not set in 100 msec
vfio-pci 0000:02:00.0: timed out waiting for pending transaction; performing function level reset anyway
vfio-pci 0000:02:00.0: not ready 1023ms after FLR; waiting
vfio-pci 0000:02:00.0: not ready 2047ms after FLR; waiting
vfio-pci 0000:02:00.0: not ready 4095ms after FLR; waiting
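
To pull just the FLR wait messages out of a long kernel log, something like the following may help. It is only a sketch: the function name flr_waits is hypothetical, and the pattern is based solely on the message format shown in the log lines above.

```shell
#!/bin/sh
# Read kernel log text on stdin and print "<pci-address> <wait-in-ms>"
# for every "not ready ...ms after FLR" message.
flr_waits() {
    sed -n 's/.* \([0-9a-f:.]*\): not ready \([0-9][0-9]*\)ms after FLR.*/\1 \2/p'
}
```

Typical use would be dmesg | flr_waits, which makes it easy to see how far the doubling wait got before the kernel gave up.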

The GPU exposes only 2 reset methods: cat /sys/bus/pci/devices/0000:02:00.0/reset_method -> flr bus

Reset type            Result
FLR                   hangs ("not ready after FLR")
Bus reset (bridge)    GPU becomes unresponsive (ffff)


Attempting echo none > reset_method fails with: Invalid reset method 'none'
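
For reference, the kernel's sysfs ABI for reset_method accepts a space-separated list of method names, the word default (restore all supported methods), or an empty string (disable resets for the device); none is not a valid value. Below is a minimal sketch of reading and writing the file. The function names and the SYSFS_PCI override are assumptions added for illustration and testing, and whether restricting the methods helps this card is untested: the reply in this thread reports BAR issues and a hang after writing an empty string.

```shell
#!/bin/sh
# SYSFS_PCI can be overridden for testing; the default is the real sysfs path.
SYSFS_PCI="${SYSFS_PCI:-/sys/bus/pci/devices}"

# Print the reset methods currently enabled for a device, one per line.
list_reset_methods() {
    tr -s ' ' '\n' < "$SYSFS_PCI/$1/reset_method" | sed '/^$/d'
}

# Write a new reset method list (space-separated, in priority order).
# "default" restores all supported methods; "" disables resets entirely.
set_reset_methods() {
    printf '%s\n' "$2" > "$SYSFS_PCI/$1/reset_method"
}
```

For example, set_reset_methods 0000:02:00.0 bus would leave only the bus reset enabled, and set_reset_methods 0000:02:00.0 default would undo the change. Both need root on a real system.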

PCIe link status
lspci -vv -s 00:06.0
lspci -vv -s 02:00.0
  • Root port max: Gen3 x16 (8GT/s)
  • GPU capable: Gen5
  • Running: Gen3 x16 (downgraded)
Link appears stable when idle.
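
The capability and current link state can be compared side by side by filtering the lspci output. This is a sketch only: link_summary is a hypothetical name, and the pattern assumes the usual lspci -vv layout in which LnkCap and LnkSta lines carry "Speed ...GT/s" and "Width x..." fields.

```shell
#!/bin/sh
# Read `lspci -vv` output on stdin and print one line each for LnkCap
# (what the port or device supports) and LnkSta (what is negotiated now),
# in the form "<field> <speed> <width>". LnkCap2/LnkSta2 lines are skipped.
link_summary() {
    sed -nE 's/^[[:space:]]*(Lnk(Cap|Sta)):.*Speed ([0-9.]+GT\/s).*Width (x[0-9]+).*/\1 \3 \4/p'
}
```

Usage would be lspci -vv -s 02:00.0 | link_summary, and the same for 00:06.0. For orientation, Gen3 corresponds to 8GT/s and Gen5 to 32GT/s.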

BIOS:
  • VT-d enabled
  • Above 4G decoding enabled
  • ASPM disabled
  • PCIe forced to Gen3
  • Secure Boot disabled (kernel lockdown = none)


Kernel parameters:
intel_iommu=on iommu=pt
pcie_aspm=off
pcie_port_pm=off
vfio-pci.disable_idle_d3=1
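
It is worth confirming that these parameters actually reached the running kernel, since an edit to the bootloader config that was never applied looks the same as a parameter that has no effect. A small sketch follows; missing_params is a hypothetical helper, and it does a plain whole-word match, so it does not know that the kernel treats - and _ in module names as equivalent.

```shell
#!/bin/sh
# Print every expected parameter that is absent from a kernel command line.
# $1 is the command line text; the remaining arguments are the parameters.
missing_params() {
    cmdline="$1"
    shift
    for p in "$@"; do
        case " $cmdline " in
            *" $p "*) ;;          # present as a whole word
            *) echo "$p" ;;       # not found
        esac
    done
}
```

On the host this could be called as missing_params "$(cat /proc/cmdline)" intel_iommu=on iommu=pt pcie_aspm=off pcie_port_pm=off vfio-pci.disable_idle_d3=1; no output means every parameter is present.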

VFIO:
  • Devices bound correctly
  • Verified via lspci -nnk
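
The same binding check can be done without parsing lspci output by following the driver symlink in sysfs. A minimal sketch, where bound_driver and the SYSFS_PCI override are names introduced here for illustration and testing:

```shell
#!/bin/sh
# SYSFS_PCI can be overridden for testing; the default is the real sysfs path.
SYSFS_PCI="${SYSFS_PCI:-/sys/bus/pci/devices}"

# Print the name of the kernel driver bound to a device, or "none"
# when no driver symlink exists.
bound_driver() {
    link="$SYSFS_PCI/$1/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}
```

Both functions of the card should report vfio-pci, for example: for f in 0000:02:00.0 0000:02:00.1; do echo "$f $(bound_driver "$f")"; done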

Reset attempts:

  • Secondary bus reset → breaks GPU
  • FLR → hangs
  • reset_method=none → not supported

Other attempts:

  • Different VM configs
  • Fresh Windows install
  • NVIDIA drivers (fail to initialize GPU)
  • Verified GPU visible only once (after reinstall)

This appears to be a broken GPU reset path (FLR) combined with an unsafe bus reset on this root port (00:06.0), which leads to:
  • VFIO forcing FLR → failure
  • Bus reset → hardware becomes inaccessible
  • No valid reset fallback available

1. Is this a known issue with RTX 50-series GPUs in VFIO passthrough?
2. Is there any way to disable FLR in vfio-pci or QEMU on newer kernels?
3. Could this be:
  • a BIOS / PCIe firmware issue?
  • a limitation of the root port (00:06.0)?
4. Are there known workarounds besides:
  • a different motherboard / slot
  • avoiding passthrough entirely?

I would really appreciate any help at this point.
 
I have the same issue on the latest Proxmox VE 9.2.3 (kernel 6.17 or 7.0.6), on an Intel machine with the same ReBAR / Above 4G settings in the BIOS as you.

I've gone full circle trying to get the card to work: blacklisting modules, GRUB settings, and vfio parameters. I also installed Windows natively on the host to upgrade the card's GOP, but to no avail.

The behavior is the same as yours: the card does nothing and breaks nothing until I pass it to a VM and install the NVIDIA drivers there. As soon as the drivers are installed, the bus reset on the host and/or the FLR hangs the GPU, and then the host freezes. The same happens every time I start the VM afterwards.

I tried writing an empty string (echo "") into reset_method (I believe "none" does not work?), but I just get vfio BAR issues and then the system hangs anyway.

If you ever get passthrough working, please let me know. I am no fan of using the card in LXC: I have gone down that route with networking and Docker before, and it always felt like a patched-up solution.