AMD GPU passthrough to VM starts with error message: error writing '1' to '/sys/bus/pci/devices/0000:06:00.0/reset': Inappropriate ioctl for device

wustrong

Dec 17, 2024
The VM starts successfully, but the following messages are shown:
Code:
error writing '1' to '/sys/bus/pci/devices/0000:06:00.0/reset': Inappropriate ioctl for device
failed to reset PCI device '0000:06:00.0', but trying to continue as not all devices need a reset

At random times, both this VM and the host crash. Until then, the other VMs keep working fine.
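"Inappropriate ioctl for device" is the text for the ENOTTY error code, which in this context usually suggests that the kernel could not reset the PCI function. On recent kernels the reset mechanisms a device advertises can be read from its `reset_method` file in sysfs. The following is a minimal sketch for checking that; the function name `pci_reset_methods` and its optional sysfs-root argument are illustrative assumptions, not part of any Proxmox tool.

```shell
#!/bin/sh
# Print the reset methods the kernel advertises for a PCI device.
# $1: PCI address, e.g. 0000:06:00.0
# $2: sysfs root (hypothetical parameter, defaults to /sys; handy for testing)
pci_reset_methods() {
    dev_dir="${2:-/sys}/bus/pci/devices/$1"
    if [ ! -d "$dev_dir" ]; then
        echo "no such device: $1" >&2
        return 2
    fi
    # reset_method may be missing (older kernels) or empty (no method available)
    methods=$(cat "$dev_dir/reset_method" 2>/dev/null)
    if [ -n "$methods" ]; then
        echo "$methods"
    else
        echo "none"
    fi
}
```

Calling `pci_reset_methods 0000:06:00.0` on the host prints either a space-separated list of methods or `none`. An empty list would be consistent with the reset failure above, though it does not by itself explain the host crashes.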


My VM config:
Code:
root@yoyoyaya:~# cat /etc/pve/qemu-server/101.conf
boot: order=scsi0;net0
cores: 6
cpu: host
hostpci0: 0000:06:00.0,pcie=1,x-vga=1
machine: q35
memory: 8092
meta: creation-qemu=9.0.2,ctime=1734102676
name: fnos
net0: virtio=BC:24:11:58:64:5A,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
sata1: /dev/disk/by-id/nvme-Great_Wall_GT50_2TB_BC07986211160937580,size=2000398680K
scsi0: local-lvm:vm-101-disk-0,iothread=1,size=80G
scsihw: virtio-scsi-single
smbios1: uuid=3716a56a-3faa-40f0-b354-a4309828cb8e
sockets: 1
startup: order=2,up=60,down=60
usb0: host=8-1
usb1: host=152d:a561
vmgenid: 22906b1d-4dee-45c5-a0f3-dce6f23ccb6f

Passthrough config:
Code:
cat /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"

cat /etc/modules
vfio
vfio_iommu_type1
vfio_pci

cat /etc/modprobe.d/pve-blacklist.conf
blacklist nvidiafb
blacklist amdgpu

cat /etc/modprobe.d/amdgpu.conf
softdep amdgpu pre: vfio-pci

cat /etc/modprobe.d/vfio.conf
options vfio-pci ids=1002:1681

lspci -k | grep -A 3 VGA
06:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt [Radeon 680M] (rev 0a)
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt [Radeon 680M]
        Kernel driver in use: vfio-pci
        Kernel modules: amdgpu
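Settings in /etc/default/grub only take effect after the boot configuration has been regenerated and the host rebooted, so it can help to confirm what the running kernel actually received. A small sketch, assuming a POSIX shell; the helper name `cmdline_has` and its optional file argument are made up for illustration.

```shell
#!/bin/sh
# Succeed if the given token appears as a whole word on the kernel command line.
# $1: token to look for, e.g. iommu=pt
# $2: file to read (hypothetical parameter, defaults to /proc/cmdline)
cmdline_has() {
    cmdline=$(cat "${2:-/proc/cmdline}") || return 2
    # Pad with spaces so only complete, space-separated tokens match
    case " $cmdline " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `cmdline_has iommu=pt && echo present` run on the host shows whether that parameter reached the kernel.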
 
Output of journalctl -p err -f:
Code:
Dec 24 05:22:43 yoyoyaya kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.SBRG.EC0], AE_NOT_FOUND (20230628/dswload2-162)
Dec 24 05:22:43 yoyoyaya kernel: ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20230628/psobject-220)
Dec 24 05:22:43 yoyoyaya kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.SBRG.EC0.OKEC], AE_NOT_FOUND (20230628/psargs-330)
Dec 24 05:22:43 yoyoyaya kernel: ACPI Error: Aborting method \_SB.GPIO._EVT due to previous error (AE_NOT_FOUND) (20230628/psparse-529)
Dec 24 05:22:43 yoyoyaya kernel: hub 6-0:1.0: config failed, hub doesn't have any ports! (err -19)
Dec 24 05:22:45 yoyoyaya smartd[817]: Device: /dev/sda [SAT], no ATA CHECK POWER STATUS support, ignoring -n Directive
Dec 24 05:22:45 yoyoyaya smartd[817]: Device: /dev/sdb [SAT], no ATA CHECK POWER STATUS support, ignoring -n Directive
Dec 24 05:42:07 yoyoyaya kernel: Fixing recursive fault but reboot is needed!
Dec 24 05:42:07 yoyoyaya kernel: BUG: scheduling while atomic: pveproxy worker/1174/0x00000000
Dec 24 05:43:07 yoyoyaya kernel: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
Dec 24 05:43:07 yoyoyaya kernel: rcu:         Tasks blocked on level-0 rcu_node (CPUs 0-15): P1174/1:b..l
Dec 24 05:43:07 yoyoyaya kernel: rcu:         (detected by 12, t=60002 jiffies, g=57529, q=4663 ncpus=16)
Dec 24 05:46:07 yoyoyaya kernel: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
Dec 24 05:46:07 yoyoyaya kernel: rcu:         Tasks blocked on level-0 rcu_node (CPUs 0-15): P1174/1:b..l
Dec 24 05:46:07 yoyoyaya kernel: rcu:         (detected by 12, t=240007 jiffies, g=57529, q=10476 ncpus=16)
 
When I start a virtual machine with a passthrough discrete graphics card, I get this error:

error writing '1' to '/sys/bus/pci/devices/0000:01:00.0/reset': Inappropriate ioctl for device
failed to reset PCI device '0000:01:00.0', but trying to continue as not all devices need a reset
swtpm_setup: Not overwriting existing state file.
kvm: -device vfio-pci,host=0000:01:00.0,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0,rombar=0: vfio 0000:01:00.0: failed to setup container for group 14: Failed to set group container: Invalid argument
stopping swtpm instance (pid 6667) due to QEMU startup error
TASK ERROR: start failed: QEMU exited with code 1
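The error names IOMMU group 14. VFIO generally expects every device in a group to be bound to a VFIO driver or to no driver at all, so listing the group's members is a reasonable first check. This is a sketch only: the function name `iommu_group_members` and the optional sysfs-root argument are assumptions for illustration, and the "Invalid argument" failure can have other causes.

```shell
#!/bin/sh
# List the PCI addresses that share one IOMMU group.
# $1: group number, e.g. 14
# $2: sysfs root (hypothetical parameter, defaults to /sys; handy for testing)
iommu_group_members() {
    group_dir="${2:-/sys}/kernel/iommu_groups/$1/devices"
    if [ ! -d "$group_dir" ]; then
        echo "no such IOMMU group: $1" >&2
        return 2
    fi
    for dev in "$group_dir"/*; do
        # Skip the literal pattern if the group directory is empty
        [ -e "$dev" ] || continue
        basename "$dev"
    done
}
```

Running `iommu_group_members 14` on the host prints one PCI address per line; each address can then be passed to `lspci -nnk -s <address>` to see which driver is bound to it.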

Will purchasing a license and updating from a paid repository fix the problem?
 
Will purchasing a license and updating from a paid repository fix the problem?
The license is the open-source AGPL 3.0. Do you mean a support subscription? The enterprise repository always contains the same or older packages compared to the no-subscription (or test) repository and does not contain more fixes. PCI(e) passthrough cannot be guaranteed to work, and I doubt that a paid support ticket would help in this case.

I believe your error message suggests an HP-specific issue that might be fixed by a BIOS setting, but I don't remember the details (as I don't use HP). Maybe search for 'Failed to set group container: Invalid argument' in old threads on this forum.
EDIT: Turns out I was wrong about it being an HP issue.
 
My PC configuration is as follows:
Motherboard - MSI MPG X870E CARBON WIFI
Processor - AMD Ryzen 9 9950X3D
Discrete graphics card - NVIDIA GeForce GT 1030 ASUS 2Gb (GT1030-SL-2G-BRK)
Integrated graphics card - Granite Ridge Radeon Graphics
BIOS version - E7E49AMSI.1A80 (08 Jan 2026)
I've checked the graphics card passthrough settings in GRUB and the other settings multiple times. Passing through the integrated graphics card doesn't work either. The server stops responding completely, and only a power cycle helps.
I tried disabling the integrated GPU in the BIOS, but that didn't help.

I've read a lot of forum threads about this error, but they recommend installing an older kernel. I don't want to install anything older.
Maybe some BIOS settings can fix the problem?
 