[SOLVED] Yet another AMD Reset Bug/Vendor Reset Thread

VTECnKitkats

New Member
May 4, 2026
Hello all, I am a very green Linux/home NAS user and am having a reset issue with a Windows 10 VM on Proxmox 9.1.9 with my 5700 XT. I am running my system headless with a single GPU and want the ability to access my NAS (contained within the same physical system) to stream from Jellyfin/Immich/etc., and Windows is the only solution my wife agreed to use. I have tried what feels like the kitchen sink to get GPU passthrough working initially (which now works... reliably?) and then to get device reset to work. I have followed around a dozen guides and forum posts for fixes but nothing seems to work. Currently, this is my VM setup from vmconfig:

Code:
agent: 1
bios: ovmf
boot: order=scsi0;net0
cores: 6
cpu: host
efidisk0: local-lvm:vm-169-disk-0,efitype=4m,ms-cert=2023k,pre-enrolled-keys=1,size=4M
hostpci0: 0000:28:00,pcie=1,rombar=0
ide0: local:iso/virtio-win-0.1.285.iso,media=cdrom,size=771138K
ide2: local:iso/Windows.iso,media=cdrom,size=4779200K
machine: pc-q35-6.2
memory: 8192
meta: creation-qemu=10.1.2,ctime=1777839856
name: windowsVM
net0: rtl8139=BC:24:11:32:90:93,bridge=vmbr0,firewall=1
numa: 0
ostype: win10
scsi0: local-lvm:vm-169-disk-1,iothread=1,size=250G
scsihw: virtio-scsi-single
smbios1: uuid=80088a79-4793-4ef4-b942-69b009fc8cc2
sockets: 1
usb0: host=1-9
vga: none
vmgenid: f7404ec1-3457-437b-a94a-2d7a142b6a4c

For some reason q35-6.2 is the only machine version that will result in a display output reliably. 6.0 worked a couple of times but then stopped.

This is my /etc/default/grub:

Code:
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`( . /etc/os-release && echo ${NAME} )`
GRUB_CMDLINE_LINUX_DEFAULT="quiet iommu=pt pcie_acs_override=downstream,multifunction nofb nomodeset video=vesafb:off,efifb:off"
GRUB_CMDLINE_LINUX=""

I have installed vendor-reset from GitHub and made a bunch of other changes based on forum suggestions I can't even remember. From dmesg I do get the following:

Code:
[  117.590893] vfio-pci 0000:28:00.0: resetting
[  117.591058] vfio-pci 0000:28:00.0: reset done
[  117.617389] vfio-pci 0000:28:00.0: resetting
[  117.617467] vfio-pci 0000:28:00.1: resetting
[  117.744717] vfio-pci 0000:28:00.0: reset done
[  117.744770] vfio-pci 0000:28:00.1: reset done
[  123.801122] usb 1-9: reset full-speed USB device number 2 using xhci_hcd

Here 0000:28:00.0 is the 5700 XT.

If I shut down the VM and then start it again without rebooting the host, I get the following error:

Code:
error writing '1' to '/sys/bus/pci/devices/0000:28:00.0/reset': Inappropriate ioctl for device
failed to reset PCI device '0000:28:00.0', but trying to continue as not all devices need a reset
kvm: ../hw/pci/pci.c:1815: pci_irq_handler: Assertion `0 <= irq_num && irq_num < PCI_NUM_PINS' failed.
TASK ERROR: start failed: QEMU exited with code 1

Thank you in advance. If you need any more information, please include instructions on how I would generate/access it as I am on around day 3 of working with Linux and, as I said, have no clue what I am doing! I am, however, open to roasting and learning.
 
GRUB_CMDLINE_LINUX_DEFAULT="quiet iommu=pt pcie_acs_override=downstream,multifunction nofb nomodeset video=vesafb:off,efifb:off"
video=vesafb:off,efifb:off is invalid and does nothing (you need initcall_blacklist=sysfb_init instead, but then you lose the Proxmox host console, which makes debugging more difficult). With vendor-reset working, you don't need such work-arounds (like nofb nomodeset initcall_blacklist=sysfb_init video=...) anyway.

I have installed vendor-reset from GitHub and made a bunch of other changes based on forum suggestions I can't even remember. From dmesg I do get the following:

Code:
[  117.590893] vfio-pci 0000:28:00.0: resetting
[  117.591058] vfio-pci 0000:28:00.0: reset done
[  117.617389] vfio-pci 0000:28:00.0: resetting
[  117.617467] vfio-pci 0000:28:00.1: resetting
[  117.744717] vfio-pci 0000:28:00.0: reset done
[  117.744770] vfio-pci 0000:28:00.1: reset done
[  123.801122] usb 1-9: reset full-speed USB device number 2 using xhci_hcd
The expected messages from vendor-reset are missing. Either it is not installed or it is not enabled for the GPU. vendor-reset does not compile with the recent Linux kernel versions of Proxmox. You need to patch it manually (did you find the fix on the vendor-reset github?) or use a branch with the fix.

You also need to run echo device_specific >"/sys/bus/pci/devices/0000:28:00.0/reset_method" after loading the vendor-reset module (if it was built and installed correctly) but before starting the VM. Can you reboot your Proxmox host and show the output of this command (before starting the VM)? That will tell us where the problem with vendor-reset most likely is.

EDIT: This is the fix for kernel versions since 6.14: https://github.com/gnif/vendor-reset/pull/103
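
The check described above can be sketched as a small helper. This is only a sketch: show_reset_method and the SYSFS_ROOT override are illustrative names that are not part of vendor-reset or Proxmox, and the PCI address is the one from this thread. It prints which reset methods the kernel currently offers for the device, which is useful to know before writing device_specific.

```shell
#!/bin/sh
# Sketch: report which reset methods the kernel offers for a PCI device.
# SYSFS_ROOT is a hypothetical override (default /sys) so the function can be
# tried against a fake directory tree; it is not a kernel or Proxmox setting.
show_reset_method() {
    dev="$1"
    file="${SYSFS_ROOT:-/sys}/bus/pci/devices/$dev/reset_method"
    if [ -r "$file" ]; then
        printf '%s: %s\n' "$dev" "$(cat "$file")"
    else
        printf '%s: no readable reset_method file\n' "$dev"
        return 1
    fi
}

# On the Proxmox host, as root, after a reboot and before starting the VM:
#   modprobe vendor-reset
#   lsmod | grep vendor_reset
#   show_reset_method 0000:28:00.0
#   echo device_specific > /sys/bus/pci/devices/0000:28:00.0/reset_method
```

If the echo fails with "Invalid argument", the device_specific method is not being offered, which matches the case where the module is not loaded.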
 
I have deleted video=vesafb:off,efifb:off and nofb nomodeset and added initcall_blacklist=sysfb_init to my grub file.

Can you explain how I should execute echo device_specific >"/sys/bus/pci/devices/0000:28:00.0/reset_method" and how I would show the output? Running it in a root shell returns write error: Invalid argument.
 
didn't reply, see above, sorry.
 
I have deleted video=vesafb:off,efifb:off and nofb nomodeset and added initcall_blacklist=sysfb_init to my grub file.
This makes debugging harder because the Proxmox host console no longer works. You also don't need this when vendor-reset works. Please undo this.
Can you explain how I should execute echo device_specific >"/sys/bus/pci/devices/0000:28:00.0/reset_method" and how I would show the output?
Run the command on the Proxmox host after a reboot, after loading the vendor-reset module but before starting the VM. Then show the output in your reply post. Maybe use SSH and copy the output and put it in CODE-tags?
Running it in a root shell returns write error: Invalid argument.
Then vendor-reset is not loaded and/or not (properly) installed. Or maybe it did not build correctly? What is the output of modprobe vendor-reset?
 
I'm sorry, I am not sure what you mean by "after loading the vendor-reset module but before starting the VM." How do I load the vendor reset module?

The output of modprobe vendor-reset is
Code:
modprobe: FATAL: Module vendor-reset not found in directory /lib/modules/7.0.0-3-pve

This is after a host reboot and doing nothing else. VM is still off.
 
I'm sorry, I am not sure what you mean by "after loading the vendor-reset module but before starting the VM." How do I load the vendor reset module?
You run modprobe vendor-reset or make sure it is loaded at boot. I often describe the possible problem and then describe the command for it, in order to give feedback on what might be going wrong.
The output of modprobe vendor-reset is
Code:
modprobe: FATAL: Module vendor-reset not found in directory /lib/modules/7.0.0-3-pve
vendor-reset is not installed properly, which is probably caused by it not compiling on kernel versions after 6.12. vendor-reset will fix your GPU passthrough problem, but installing it is no longer as simple as its documentation suggests.

What steps did you take to clone, build and install it? Can you maybe try a fresh clone from github, patch the source code (see https://github.com/gnif/vendor-reset/pull/103 ) and run the right command (with a dot at the end, which some people don't see) and show the output?
 
I do not know how to patch the source code, sorry.

After running git clone https://github.com/gnif/vendor-reset.git I get the output fatal: destination path 'vendor-reset' already exists and is not an empty directory.

When I run dkms install . I get the output
Code:
Error! DKMS tree already contains: vendor-reset/0.1.1
You cannot add the same module/version combo more than once.

I then ran
Code:
echo "vendor-reset" >> /etc/modules
update-initramfs -u

which received the output
Code:
update-initramfs: Generating /boot/initrd.img-7.0.0-3-pve
Running hook script 'zz-proxmox-boot'..
Re-executing '/etc/kernel/postinst.d/zz-proxmox-boot' in new private mount namespace..
No /etc/kernel/proxmox-boot-uuids found, skipping ESP sync.
 
After running git clone https://github.com/gnif/vendor-reset.git I get the output fatal: destination path 'vendor-reset' already exists and is not an empty directory.
Maybe remove that directory or rename it, in order to try from scratch? I don't know what you did to the source code before (in an honest attempt to fix issues as described by other posts and guides).
When I run dkms install . I get the output
Code:
Error! DKMS tree already contains: vendor-reset/0.1.1
You cannot add the same module/version combo more than once.
Sounds like it compiled before and it got installed. You might need to uninstall it. I found this more than once looking for a way to do this: https://askubuntu.com/questions/849547/how-do-i-uninstall-dkms-modules-if-there-are-two-of-them . Of course you have to change the module and version to match. I don't understand why it might compile and install cleanly and not be possible to load it using modprobe. Maybe it was compiled and installed on a previous Linux kernel version before?
I then ran
Code:
echo "vendor-reset" >> /etc/modules
update-initramfs -u

which received the output
Code:
update-initramfs: Generating /boot/initrd.img-7.0.0-3-pve
Running hook script 'zz-proxmox-boot'..
Re-executing '/etc/kernel/postinst.d/zz-proxmox-boot' in new private mount namespace..
No /etc/kernel/proxmox-boot-uuids found, skipping ESP sync.
This does not really do anything except that the kernel will try to load the module at boot. This does not really help in troubleshooting this issue. I'm surprised that your Proxmox boot is not under the control of proxmox-boot-tool. How did you install it? What version did you install initially (a very old one and upgraded over the year maybe)?
 
Firstly, thank you for your help on this. I have seen you around on the forum as I have been troubleshooting this and it seems like a bunch of people have gotten results from what you do in your free time. I appreciate it.

I attempted to follow the uninstallation instructions you linked and got it to output something about uninstalling something initially. Running it again outputs Error! The module/version combo: vendor-reset/0.1.1 is not located in the DKMS tree. The same occurs for version 0.1, which I have also seen somewhere and could have installed using some guide at some point.

Attempting to clone the vendor-reset GitHub repository through https://github.com/gnif/vendor-reset.git returns the same fatal: destination path 'vendor-reset' already exists and is not an empty directory, which would indicate that it still exists?!

Any idea what directory the vendor-reset stuff sits in? And how I would go about removing it?

I am not above doing a completely clean install of Proxmox on the host. I just have two VMs running and no actual data backed up, as I'm waiting on the drives to arrive in the mail. I would just like to try a quick fix first if one is possible, or else have an up-to-date guide to GPU passthrough and the AMD reset bug fix for Proxmox 9.1.9 and know ahead of time what settings to apply for the VM, as I'm sure running these similar commands, changing the machine version, and editing random files has screwed something up.
 
I attempted to follow the uninstallation instructions you linked and got it to output something about uninstalling something initially. Running it again outputs Error! The module/version combo: vendor-reset/0.1.1 is not located in the DKMS tree. The same occurs for version 0.1, which I have also seen somewhere and could have installed using some guide at some point.
sudo dkms remove vendor-reset/0.1.1 --all should do the trick. It is then no longer registered as a module.
Attempting to clone the vendor-reset GitHub repository through https://github.com/gnif/vendor-reset.git returns the same fatal: destination path 'vendor-reset' already exists and is not an empty directory, which would indicate that it still exists?!

Any idea what directory the vendor-reset stuff sits in? And how I would go about removing it?
I don't know which directory you are in, but there seems to be a sub-directory named vendor-reset. Remove it using rm -r vendor-reset, then clone a fresh copy from GitHub.
I am not above doing a completely clean install of proxmox on the host. I just have two VMs running and no actual data backed up as I'm waiting on the drives to arrive in the mail. I would just like to try something quick if it is possible to fix it or have an up-to-date guide to GPU passthrough and fix for the AMD reset bug for Proxmox 9.1.9 and know in advance what settings to apply for the VM ahead of time as I'm sure running these similar commands, changing the machine version, and editing random files have screwed something up.
I don't think a clean reinstall is necessary, as most other Proxmox configuration does not interact with vendor-reset or PCI(e) passthrough.

I guess you are not very familiar with GitHub, Linux, or source code? In principle, cloning vendor-reset, manually changing some of the files, then building/installing it and enabling it for your GPU should fix the passthrough problem. I understand that each of those steps is new and needs some hand-holding, and that will take some time (I don't know a better way that is also helpful to others reading this forum), and I might be in a different time zone.
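
The clean-up steps mentioned in this post can be put together roughly as follows. This is a sketch, not a tested procedure: run, DRY_RUN and fresh_vendor_reset_clone are illustrative names, and the version string 0.1.1 is the one DKMS reported in this thread. With DRY_RUN=1 the commands are only printed, so they can be reviewed before anything is removed.

```shell
#!/bin/sh
# run: execute a command, or only print it when DRY_RUN=1.
# DRY_RUN is a hypothetical helper variable, not a dkms or git option.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Remove the old DKMS registration and source tree, then clone again.
# Call it from the directory that holds the stale vendor-reset folder.
fresh_vendor_reset_clone() {
    run dkms status                              # list what DKMS has registered
    run dkms remove vendor-reset/0.1.1 --all     # reports an error if already removed
    run rm -r vendor-reset                       # delete the old source directory
    run git clone https://github.com/gnif/vendor-reset.git
}

# Preview first, then run for real as root:
#   DRY_RUN=1 fresh_vendor_reset_clone
#   fresh_vendor_reset_clone
```

The source still has to be patched before dkms install . is run inside the new vendor-reset directory.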
 
You are right on the money! Until several days ago, the extent of my computer knowledge was building, running, and troubleshooting gaming PCs for myself and my friends.

rm -r vendor-reset did the trick and allowed me to clone a fresh copy of vendor-reset from GitHub. Running dkms install . returned:
Code:
Creating symlink /var/lib/dkms/vendor-reset/0.1.1/source -> /usr/src/vendor-reset-0.1.1

Sign command: /lib/modules/7.0.0-3-pve/build/scripts/sign-file
Signing key: /var/lib/dkms/mok.key
Public certificate (MOK): /var/lib/dkms/mok.pub

Building module(s)...(bad exit status: 2)
Failed command:
make -j12 KERNELRELEASE=7.0.0-3-pve KDIR=/lib/modules/7.0.0-3-pve/build

Error! Bad return status for module build on kernel: 7.0.0-3-pve (x86_64)
Consult /var/lib/dkms/vendor-reset/0.1.1/build/make.log for more information.

I imagine this is what you were referring to by "changing some of the files." Could you walk me through that? I will remove it again and then stop after cloning another copy of vendor-reset from GitHub, so I can modify it before installing.

Edit: sudo dkms remove vendor-reset/0.1.1 --all was the initial command I ran and when I tried to clone the vendor-reset github it didn't seem to have fully removed the files.
 
Maybe you can show this log file?
You probably need to adjust some files according to https://github.com/gnif/vendor-reset/pull/103 and then run dkms install . again.

EDIT: It's probably https://github.com/gnif/vendor-reset/pull/104 instead, make sure line 32 of vendor-reset/src/amd/amdgpu/atom.c uses linux/unaligned.h.
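
A minimal sketch of that one-line change, assuming the include on line 32 currently reads asm/unaligned.h (the post above only states what it should end up as, so check the grep output first). fix_unaligned_include is an illustrative helper name, not something shipped with vendor-reset.

```shell
#!/bin/sh
# Sketch: switch an include from asm/unaligned.h to linux/unaligned.h.
# Assumption: the file contains "#include <asm/unaligned.h>" before the edit.
fix_unaligned_include() {
    file="$1"
    grep -n 'unaligned.h' "$file"     # show the matching line(s) before the change
    sed 's|<asm/unaligned.h>|<linux/unaligned.h>|' "$file" > "$file.new" &&
        mv "$file.new" "$file"
    grep -n 'unaligned.h' "$file"     # and after
}

# From the directory that contains the fresh clone:
#   fix_unaligned_include vendor-reset/src/amd/amdgpu/atom.c
#   cd vendor-reset && dkms install .
# If the build still fails, the log named in the dkms error has the details:
#   tail -n 40 /var/lib/dkms/vendor-reset/0.1.1/build/make.log
```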
I am experiencing myself learn something new in real time. I edited line 32, which I found by counting down from the top (I used nano; are there options that show line numbers at the side?), and reinstalled. I rebooted the host, started the VM, shut it down from within the VM, and started it again with no problems! I rebooted the host and did it again to make sure it wasn't a fluke.

Thank you again @leesteken ! On to the next.
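
On the nano question, a short sketch for finding a given line without counting. nano's -l option and +LINE argument are standard; show_line is an illustrative helper name that prints one numbered line so it can be checked before and after editing.

```shell
#!/bin/sh
# Print a single line of a file together with its line number.
# Usage: show_line LINE_NUMBER FILE
show_line() {
    awk -v n="$1" 'NR == n { printf "%d: %s\n", NR, $0 }' "$2"
}

# In nano itself:
#   nano -l  vendor-reset/src/amd/amdgpu/atom.c    # line numbers in the margin
#   nano +32 vendor-reset/src/amd/amdgpu/atom.c    # open with the cursor on line 32
# Without an editor:
#   show_line 32 vendor-reset/src/amd/amdgpu/atom.c
```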
 