Hi everyone, I’m trying to get GPU passthrough working in Proxmox VE, but I’m hitting a consistent reset failure / power state issue with my NVIDIA RTX 5070 Ti.
I’ve done extensive troubleshooting and would really appreciate help confirming whether this is a known limitation, firmware bug, or if there’s anything else I can try.
System details
- GPU: NVIDIA RTX 5070 Ti (GB203)
- Motherboard: MSI MAG B860 Tomahawk
- CPU: (Intel, VT-d enabled)
- Proxmox: 9.1.1
- Kernel: 6.17.2-1-pve
- QEMU: 10.1.2
- VM: Windows 11 (OVMF / Q35)
Current VM config (relevant parts)
hostpci0: 0000:02:00.0,pcie=1,x-vga=1
hostpci1: 0000:02:00.1,pcie=1
cpu: host (with kvm=off set via QEMU)
After a cold boot, the GPU is healthy: setpci -s 0000:02:00.0 0x00.w 0x02.w returns 10de 2c05
On a fresh Windows install, I once saw the NVIDIA GPU in Device Manager
As soon as I start the VM:
1. GPU goes into D3 / low power state
2. VFIO attempts reset
3. GPU fails to recover
4. Eventually becomes unresponsive: setpci → ffff ffff
The only recovery is a full power cycle (PSU off).
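The healthy vs. dead check above can be scripted so it is easy to repeat before and after each VM start. This is a minimal sketch: `classify_vendor` is a hypothetical helper name, and it assumes `setpci` (from pciutils) prints the vendor word as lowercase hex, as in the output quoted above.

```shell
#!/bin/sh
# Classify the 16-bit vendor ID read from PCI config space offset 0x00.
# 10de is NVIDIA's vendor ID; ffff means the read returned all ones,
# which is what a device that no longer answers on the bus looks like.
classify_vendor() {
    case "$1" in
        ffff) echo "unresponsive" ;;
        10de) echo "healthy" ;;
        *)    echo "unknown" ;;
    esac
}

# Usage on the host (as root), with the address from this post:
#   classify_vendor "$(setpci -s 0000:02:00.0 0x00.w)"
```

Running it once after a cold boot and once after the failed VM start should give "healthy" and then "unresponsive" if the behaviour matches what is described here.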
dmesg logs (key errors)
pcieport 0000:00:06.0: Data Link Layer Link Active not set in 100 msec
vfio-pci 0000:02:00.0: timed out waiting for pending transaction; performing function level reset anyway
vfio-pci 0000:02:00.0: not ready 1023ms after FLR; waiting
vfio-pci 0000:02:00.0: not ready 2047ms after FLR; waiting
vfio-pci 0000:02:00.0: not ready 4095ms after FLR; waiting
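To pull only these reset-related messages out of a long kernel log, a small filter like the following can help. It is a sketch: `reset_errors` is a hypothetical helper name, and the patterns are copied from the three message types quoted above, so other kernel versions may word them differently.

```shell
#!/bin/sh
# Keep only the kernel log lines that match the reset failures quoted above.
# Reads log text on stdin and writes the matching lines to stdout.
reset_errors() {
    grep -E 'not ready [0-9]+ms after FLR|Data Link Layer Link Active not set|timed out waiting for pending transaction'
}

# Usage on the host:
#   dmesg | reset_errors
```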
The GPU exposes only 2 reset methods: cat /sys/bus/pci/devices/0000:02:00.0/reset_method -> flr bus
| Reset type | Result |
|---|---|
| FLR | hangs ("not ready after FLR") |
| Bus reset (bridge) | GPU becomes unresponsive (ffff) |
Attempting echo none > reset_method fails with: Invalid reset method 'none'
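For reference, here is a sketch of how the `reset_method` attribute can be inspected and changed. As far as I understand the kernel's sysfs ABI documentation, the file accepts a space-separated list of supported method names, the word `default`, or an empty string (which disables reset for the device); `none` is not a method name, which would explain the error above. Whether vfio-pci and QEMU then behave any better is not something this sketch can confirm. `has_method` is a hypothetical helper name.

```shell
#!/bin/sh
# Return success if method $2 appears in the space-separated list $1.
has_method() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

dev=/sys/bus/pci/devices/0000:02:00.0   # address from this post

if [ -r "$dev/reset_method" ]; then
    methods=$(cat "$dev/reset_method")
    echo "enabled reset methods: $methods"
    if has_method "$methods" bus; then
        echo "bus reset is available"
    fi
fi

# Changes must be made as root. They are left commented out on purpose,
# because they alter how the kernel resets the device:
#   echo bus > "$dev/reset_method"      # use only the bus reset
#   echo > "$dev/reset_method"          # empty string: disable reset
#   echo default > "$dev/reset_method"  # restore the default ordering
```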
PCIe link status
lspci -vv -s 00:06.0
lspci -vv -s 02:00.0
- Root port max: Gen3 x16 (8GT/s)
- GPU capable: Gen5
- Running: Gen3 x16 (downgraded)
Link appears stable when idle.
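The same link information is also exposed through sysfs, which is easier to compare in a script than `lspci -vv` output. This is a sketch that assumes the standard `current_link_speed` and `max_link_speed` attributes are present for the device; `link_state` is a hypothetical helper name.

```shell
#!/bin/sh
# Compare the negotiated link speed ($1) with the device's maximum ($2).
# Both are strings as found in sysfs, for example "8.0 GT/s PCIe".
link_state() {
    if [ "$1" = "$2" ]; then
        echo "full speed"
    else
        echo "downgraded"
    fi
}

dev=/sys/bus/pci/devices/0000:02:00.0   # address from this post

if [ -r "$dev/current_link_speed" ] && [ -r "$dev/max_link_speed" ]; then
    cur=$(cat "$dev/current_link_speed")
    max=$(cat "$dev/max_link_speed")
    echo "current: $cur, max: $max -> $(link_state "$cur" "$max")"
fi
```

With the slot forced to Gen3 in the BIOS, a "downgraded" result for the GPU is expected here and is not by itself a sign of a fault.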
BIOS:
VT-d enabled
Above 4G decoding enabled
ASPM disabled
PCIe forced to Gen3
Secure Boot disabled (kernel lockdown = none)
Kernel parameters:
intel_iommu=on iommu=pt
pcie_aspm=off
pcie_port_pm=off
vfio-pci.disable_idle_d3=1
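To confirm that these parameters actually reached the running kernel (and were not lost in a bootloader config that was never regenerated), the live command line can be checked. This is a sketch: `missing_params` is a hypothetical helper name, and it does a plain string match, so `vfio_pci.` and `vfio-pci.` spellings are treated as different even though the kernel accepts both.

```shell
#!/bin/sh
# Print every expected parameter that is absent from the given command line.
# $1 is the kernel command line; the remaining arguments are the parameters.
missing_params() {
    cmdline=$1
    shift
    for p in "$@"; do
        case " $cmdline " in
            *" $p "*) ;;            # present, nothing to report
            *) echo "$p" ;;         # absent
        esac
    done
}

# Usage on the host, with the parameters listed above:
#   missing_params "$(cat /proc/cmdline)" intel_iommu=on iommu=pt \
#       pcie_aspm=off pcie_port_pm=off vfio-pci.disable_idle_d3=1
```

No output means every listed parameter is present.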
VFIO:
- Devices bound correctly
- Verified via lspci -nnk
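The binding can also be checked without parsing `lspci -nnk`, by reading the `driver` symlink in sysfs. This is a sketch: `bound_driver` is a hypothetical helper name, and the two addresses are the GPU and its audio function from the VM config above.

```shell
#!/bin/sh
# Print the name of the driver bound to the device directory $1,
# or "none" if no driver is bound (no "driver" symlink exists).
bound_driver() {
    if [ -L "$1/driver" ]; then
        basename "$(readlink "$1/driver")"
    else
        echo "none"
    fi
}

# Both functions should report vfio-pci before the VM starts.
for addr in 0000:02:00.0 0000:02:00.1; do
    d=/sys/bus/pci/devices/$addr
    if [ -d "$d" ]; then
        echo "$addr: $(bound_driver "$d")"
    fi
done
```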
Reset attempts:
- Secondary bus reset → breaks GPU
- FLR → hangs
- reset_method=none → not supported
Other attempts:
- Different VM configs
- Fresh Windows install
- NVIDIA drivers (fail to initialize GPU)
- Verified GPU visible only once (after reinstall)
This appears to be a broken GPU reset path (FLR) combined with an unsafe bus reset on this root port (00:06.0), which leads to:
- VFIO forcing FLR → failure
- Bus reset → hardware becomes inaccessible
- No valid reset fallback available
2. Is there any way to disable FLR in vfio-pci or QEMU in newer kernels?
3. Could this be:
- a BIOS / PCIe firmware issue?
- a root port (00:06.0) limitation?
4. Are there known workarounds besides:
- a different motherboard / slot
- or avoiding passthrough entirely?