ProxMox - 6.8.12-11-pve: Issue with Nvidia 50 Series GPU Passthrough - "error writing '1' to '/sys/bus/pci/devices/0000:01:00.0/reset': Inappropriate

duantless

Jul 16, 2025
I am kind of stuck on this. I have read dozens of posts and seen nothing that really has addressed this issue.

1753240799556.png

Some state this is an AMD GPU problem, but I have an Nvidia card.

Others state that the issue is a bad update from a year ago and went back to older versions of the software: apt-get install pve-firmware=3.13-3 | apt-get install qemu-server=8.2.3 | apt-get install libpve-common-perl=8.2.4. These are really old versions, and I am not sure that going back to them is the right choice.

1753240749503.png

I also saw the above, which was in test but should have been live months ago.

I could really use some help in narrowing this down and resolving the issue. Any help would be appreciated.
 

Attachments

  • 1753240344163.png (20.8 KB)
Hello duantless! Do you also have issues with kernels 6.8.12-13 or 6.14.8-2~bpo12+1? These versions include some PCI passthrough fixes, although judging by the error message I think these fixes might be unrelated. I still wanted to ask, nevertheless.

Some further questions:
  1. Which GPU do you have, exactly? Please include the exact model, including manufacturer.
  2. Which motherboard do you have, and which BIOS version does it have?
  3. At this point it's unclear to me whether the issue happens when updating all 3 packages at once (sounds like it from your description) or whether it also happens when updating a certain package only. If the issue did not occur previously, I would like to narrow it down to a certain update (and then further down to a certain patch). If you have time to test, please let me know which package(s) break when updating, and please let me know the last version that worked before updating.
  4. Please provide us with the output of dmesg on a non-working setup (preferably fully up-to-date, if possible).
 

l.leahu-vladucu

Sorry for the delay in answering. I saw your message right before I left for vacation and did not have time to get the information you asked for.

  1. Which GPU do you have, exactly? Please include the exact model, including manufacturer: GIGABYTE - NVIDIA GeForce RTX 5070 Ti GAMING OC 16G GDDR7 PCI Express 5.0 Graphics Card - Black | Model: GV-N507TGAMING OC-16GD
  2. Which motherboard do you have, and which BIOS version does it have?: GIGABYTE Z690 AORUS ELITE AX DDR4 | BIOS AORUS Version F6 Date 12/17/2021 ID: 8AADL006
  3. At this point it's unclear to me whether the issue happens when updating all 3 packages at once (sounds like it from your description) or whether it also happens when updating a certain package only. If the issue did not occur previously, I would like to narrow it down to a certain update (and then further down to a certain patch). If you have time to test, please let me know which package(s) break when updating, and please let me know the last version that worked before updating.: I am confused about updating all three packages at once. I have not updated any of them. I am on newer versions, and the ones from the posts mentioned above are at least 6-8 iterations out of date compared to where I am.
  4. Please provide us with the output of dmesg on a non-working setup (preferably fully up-to-date, if possible).
Do you also have issues with kernels 6.8.12-13 or 6.14.8-2~bpo12+1?: I have not tried any version of the kernel other than the one that came with the original install.


I ran an update, installed everything, and then restarted. Below is the current list of installed packages.

proxmox-ve: 8.4.0 (running kernel: 6.8.12-11-pve)
pve-manager: 8.4.5 (running version: 8.4.5/57892e8e686cb35b)
proxmox-kernel-helper: 8.1.4
proxmox-kernel-6.8.12-13-pve-signed: 6.8.12-13
proxmox-kernel-6.8: 6.8.12-13
proxmox-kernel-6.8.12-11-pve-signed: 6.8.12-11
proxmox-kernel-6.8.12-9-pve-signed: 6.8.12-9
ceph-fuse: 18.2.7-pve1
corosync: 3.1.9-pve1
criu: 3.17.1-2+deb12u1
frr-pythontools: 10.2.2-1+pve1
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libknet1: 1.30-pve2
libproxmox-acme-perl: 1.6.0
libproxmox-backup-qemu0: 1.5.2
libproxmox-rs-perl: 0.3.5
libpve-access-control: 8.2.2
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.1.2
libpve-cluster-perl: 8.1.2
libpve-common-perl: 8.3.2
libpve-guest-common-perl: 5.2.2
libpve-http-server-perl: 5.2.2
libpve-network-perl: 0.11.2
libpve-rs-perl: 0.9.4
libpve-storage-perl: 8.3.6
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.6.0-2
proxmox-backup-client: 3.4.3-1
proxmox-backup-file-restore: 3.4.3-1
proxmox-backup-restore-image: 0.7.0
proxmox-firewall: 0.7.1
proxmox-kernel-helper: 8.1.4
proxmox-mail-forward: 0.3.3
proxmox-mini-journalreader: 1.5
proxmox-offline-mirror-helper: 0.6.7
proxmox-widget-toolkit: 4.3.12
pve-cluster: 8.1.2
pve-container: 5.3.0
pve-docs: 8.4.0
pve-edk2-firmware: 4.2025.02-4~bpo12+1
pve-esxi-import-tools: 0.7.4
pve-firewall: 5.1.2
pve-firmware: 3.16-3
pve-ha-manager: 4.0.7
pve-i18n: 3.4.5
pve-qemu-kvm: 9.2.0-7
pve-xtermjs: 5.5.0-2
qemu-server: 8.4.1
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.8-pve


Once I restarted the server, I wrote the output of dmesg to a file. Then I started the VM that I am passing the GPU through to, which failed to start. I then appended the new dmesg output to the same file, so it contains both the output from before the start attempt and the output from after the failure.

Let me know if there are any more questions or anything else I can provide.

duantless

 


Sorry for the delay. Your actual issue seems to be this:
[Jul30 21:13] vfio-pci 0000:01:00.0: vgaarb: deactivate vga console
[ +0.000005] vfio-pci 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
[ +1.569541] pcieport 0000:00:01.0: broken device, retraining non-functional downstream link at 2.5GT/s
[ +0.999945] pcieport 0000:00:01.0: retraining failed
[ +1.001957] pcieport 0000:00:01.0: Data Link Layer Link Active not set in 1000 msec
[ +0.000041] vfio-pci 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ +0.060116] vfio-pci 0000:01:00.1: Unable to change power state from D3cold to D0, device inaccessible
[ +0.000080] vfio-pci 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=io+mem:owns=none
[ +0.000216] vfio-pci 0000:01:00.0: vgaarb: deactivate vga console
[ +0.000004] vfio-pci 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
[ +0.000001] vfio-pci 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ +0.000144] vfio-pci 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ +0.059575] vfio-pci 0000:01:00.1: Unable to change power state from D3cold to D0, device inaccessible
[ +0.000004] vfio-pci 0000:01:00.1: Unable to change power state from D3cold to D0, device inaccessible
[ +0.661856] vfio-pci 0000:01:00.0: timed out waiting for pending transaction; performing function level reset anyway
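
A long dmesg capture can be narrowed down to lines like the ones above with a small filter. This is only a minimal sketch, not a Proxmox tool: the function name passthrough_lines and the choice of patterns (vfio, pcieport, and the slot address of the GPU) are assumptions made for illustration.

```shell
# Keep only kernel log lines that mention vfio, a PCIe root port,
# or any function of the given PCI slot.
# $1 = bus:device part of the PCI address, e.g. 01:00
#      (assumed to contain no regex metacharacters)
passthrough_lines() {
  grep -E "vfio|pcieport|$1\.[0-7]"
}
```

For example, `dmesg | passthrough_lines 01:00` on the host, or `passthrough_lines 01:00 < saved-dmesg.txt` for a saved capture (the file name is a placeholder).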

Could you please also provide us with:
  1. The VM configuration - that is, the output of qm config 102
  2. The output of lspci -nnk from the Proxmox VE host
  3. The output of pvesh get /nodes/{nodename}/hardware/pci --pci-class-blacklist "" from the Proxmox VE host
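
In the lspci -nnk output, the line to look at is "Kernel driver in use". If the device IDs are bound to vfio-pci at boot, every function of the GPU would be expected to show vfio-pci there. As a rough sketch (the function name driver_report is made up for this example, and it only understands the plain-text layout of lspci -nnk), the check could be scripted like this:

```shell
# Read `lspci -nnk` output on stdin and print "<address> <driver>" for every
# function of the given slot; "(none)" means no driver is currently bound.
# $1 = bus:device part of the address, e.g. 01:00
driver_report() {
  awk -v slot="$1" '
    # A device header line starts with an address such as 01:00.0
    /^[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-7] / {
      if (addr != "") print addr, drv
      addr = ""
      if (index($1, slot ".") == 1) { addr = $1; drv = "(none)" }
      next
    }
    addr != "" && /Kernel driver in use:/ { drv = $NF }
    END { if (addr != "") print addr, drv }
  '
}
```

Usage on the host would be `lspci -nnk | driver_report 01:00`, which prints one line per function of the card.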
 
Laurențiu,

I ran the commands you requested and attached them.

I restarted the server and ran the commands. Those are the files that end in Before.

Then I tried starting the VM with the PCI GPU card and got the error. Then ran all the commands again. Those are the files labeled After.

Let me know if there is anything else I can do or provide.
 


Okay,

I looked at one of your previous emails and saw where you talked about what the actual problem was. From my research, I figured that this was most likely a configuration issue, so I decided to redo the configuration and record everything so that I could provide it if it still did not work.

That fixed it. I had tried several different configuration guides, and while I am not sure what was wrong in them, something was.

https://akashrajvanshi.medium.com/step-by-step-guide-for-proxmox-gpu-passthrough-6e885898fdae

I used the link above, which was different from the one I had used before, and everything worked.

I have another issue but I will look at opening another ticket on that later.

Thanks for all your help.
 
Glad to hear it works now! For future reference, I would recommend looking at the following pages for the most up-to-date documentation:
  1. Proxmox VE documentation on PCI(e) Passthrough
  2. Wiki page on PCI Passthrough for more details and workarounds

Please always read the documentation very carefully.

Also, please note that PCI(e) passthrough is not guaranteed to work, but depends on the hardware - that is, the combination of GPU and motherboard. Sometimes, updating the BIOS might improve the situation (e.g. improve IOMMU grouping and thus allow GPU passthrough). This is why it's important to check all the commands explained in the documentation to understand what the actual issue is and attempt to fix it. When in doubt, you can, of course, post on this forum and we'll try to help.
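
IOMMU grouping can be inspected directly from sysfs. The loop below is a common way to do that, shown here as a sketch. The function name list_iommu_groups and its optional directory argument are assumptions added for this example; the argument only exists so the function can be tried against a test directory.

```shell
# Print one line per PCI device with the IOMMU group it belongs to.
# $1 (optional) = directory to scan; defaults to the kernel's sysfs location.
list_iommu_groups() {
  root="${1:-/sys/kernel/iommu_groups}"
  for dev in "$root"/*/devices/*; do
    [ -e "$dev" ] || continue        # nothing found: IOMMU is likely disabled
    group="${dev%/devices/*}"        # strip the trailing "/devices/<address>"
    printf 'group %s: %s\n' "${group##*/}" "${dev##*/}"
  done
}
```

Running `list_iommu_groups` with no argument on the host lists all groups. For passthrough, the GPU functions (here 0000:01:00.0 and 0000:01:00.1) should ideally not share a group with unrelated devices. If nothing is printed, the IOMMU is probably not enabled.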


Last but not least, regarding your actual issue: I found the following in your logs, which seems to be caused by a typo in a .conf file in /etc/modprobe.d/:
[ +0.004827] vfio_pci: invalid id string "(10de:2c05"
[ +0.000149] vfio_pci: add [10de:22e9[ffffffff:ffffffff]] class 0x000000/00000000
So device 10de:2c05 at PCI address 01:00.0, which is the graphics card itself, was not passed through due to the invalid ID string, but device 10de:22e9 at PCI address 01:00.1 was passed through (the audio device):
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:2c05] (rev a1)
    Subsystem: Gigabyte Technology Co., Ltd Device [1458:4181]
    Kernel modules: nvidiafb, nouveau
01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:22e9] (rev a1)
    Subsystem: NVIDIA Corporation Device [10de:0000]
    Kernel driver in use: vfio-pci
    Kernel modules: snd_hda_intel
Which then caused the following error:
[ +0.000395] pci 0000:01:00.1: extending delay after power-on from D3hot to 20 msec
[ +0.000020] pci 0000:01:00.1: D0 power state depends on 0000:01:00.0
Which then resulted in the errors at the end:
[Jul30 21:13] vfio-pci 0000:01:00.0: vgaarb: deactivate vga console
[ +0.000005] vfio-pci 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
[ +1.569541] pcieport 0000:00:01.0: broken device, retraining non-functional downstream link at 2.5GT/s
[ +0.999945] pcieport 0000:00:01.0: retraining failed
[ +1.001957] pcieport 0000:00:01.0: Data Link Layer Link Active not set in 1000 msec
[ +0.000041] vfio-pci 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ +0.060116] vfio-pci 0000:01:00.1: Unable to change power state from D3cold to D0, device inaccessible
[ +0.000080] vfio-pci 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=io+mem:owns=none
[ +0.000216] vfio-pci 0000:01:00.0: vgaarb: deactivate vga console
[ +0.000004] vfio-pci 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
[ +0.000001] vfio-pci 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ +0.000144] vfio-pci 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ +0.059575] vfio-pci 0000:01:00.1: Unable to change power state from D3cold to D0, device inaccessible
[ +0.000004] vfio-pci 0000:01:00.1: Unable to change power state from D3cold to D0, device inaccessible
[ +0.661856] vfio-pci 0000:01:00.0: timed out waiting for pending transaction; performing function level reset anyway
[Jul30 21:14] pcieport 0000:00:01.0: Data Link Layer Link Active not set in 1000 msec
[ +0.000207] vfio-pci 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ +0.000004] vfio-pci 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ +0.000168] vfio-pci 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=io+mem:owns=none
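
A typo like the stray "(" in the invalid ID string above can be caught before rebooting by checking the ID list with a small helper. The following is a sketch under stated assumptions: check_vfio_ids is a hypothetical helper, it only accepts the simple vendor:device form (four hex digits each), and the ID list in the example is reconstructed from the log above rather than copied from the actual .conf file.

```shell
# Check a comma-separated vfio-pci ID list. Prints one verdict per entry and
# returns non-zero if any entry is not of the form xxxx:xxxx (hex digits).
# The body runs in a subshell so the IFS and globbing changes stay local.
check_vfio_ids() (
  set -f                      # do not glob-expand the entries
  IFS=,
  rc=0
  h='[0-9a-fA-F]'
  for id in $1; do
    case "$id" in
      $h$h$h$h:$h$h$h$h) echo "ok: $id" ;;
      *)                 echo "invalid: $id"; rc=1 ;;
    esac
  done
  [ "$rc" -eq 0 ]
)

# Examples (ID list reconstructed from the log, not from the real file):
#   check_vfio_ids '(10de:2c05,10de:22e9'   flags the first entry
#   check_vfio_ids '10de:2c05,10de:22e9'    accepts both entries
```

With the typo removed, the corresponding line in the .conf file would presumably read `options vfio-pci ids=10de:2c05,10de:22e9`, so that both the graphics function and the audio function are claimed by vfio-pci.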
 