AMD GPU passthrough to VM starts with error message: error writing '1' to '/sys/bus/pci/devices/0000:06:00.0/reset': Inappropriate ioctl for device

wustrong

New Member
Dec 17, 2024
The VM starts successfully, but the following message is shown:
Code:
error writing '1' to '/sys/bus/pci/devices/0000:06:00.0/reset': Inappropriate ioctl for device
failed to reset PCI device '0000:06:00.0', but trying to continue as not all devices need a reset

Randomly, both the VM and the host crash; other VMs still work well.
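
To narrow down the reset error, it may help to see which reset mechanisms the kernel lists for the device. This is a minimal sketch, assuming a kernel recent enough to expose the `reset_method` sysfs attribute (as far as I know, 5.15 and later). The function name `pci_reset_info` is made up for illustration, and the default path is the device address from the error message.

```shell
#!/bin/sh
# Print the reset methods the kernel lists for a PCI device.
# $1: sysfs device directory (defaults to the GPU from the error message).
pci_reset_info() {
    dev_dir="${1:-/sys/bus/pci/devices/0000:06:00.0}"
    if [ ! -d "$dev_dir" ]; then
        echo "no such device directory: $dev_dir"
        return 1
    fi
    # Assumption: reset_method is only present when the kernel believes
    # at least one reset mechanism is supported for this device.
    if [ -r "$dev_dir/reset_method" ]; then
        methods=$(cat "$dev_dir/reset_method")
        echo "reset methods: ${methods:-none}"
    else
        echo "reset methods: none"
    fi
}
```

Calling `pci_reset_info` with no argument on the host inspects 0000:06:00.0. A listed method can still fail at runtime, so this only shows what the kernel would attempt, not whether it works.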


My VM config:
Code:
root@yoyoyaya:~# cat /etc/pve/qemu-server/101.conf
boot: order=scsi0;net0
cores: 6
cpu: host
hostpci0: 0000:06:00.0,pcie=1,x-vga=1
machine: q35
memory: 8092
meta: creation-qemu=9.0.2,ctime=1734102676
name: fnos
net0: virtio=BC:24:11:58:64:5A,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
sata1: /dev/disk/by-id/nvme-Great_Wall_GT50_2TB_BC07986211160937580,size=2000398680K
scsi0: local-lvm:vm-101-disk-0,iothread=1,size=80G
scsihw: virtio-scsi-single
smbios1: uuid=3716a56a-3faa-40f0-b354-a4309828cb8e
sockets: 1
startup: order=2,up=60,down=60
usb0: host=8-1
usb1: host=152d:a561
vmgenid: 22906b1d-4dee-45c5-a0f3-dce6f23ccb6f

Passthrough config:
Code:
cat /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"

cat /etc/modules
vfio
vfio_iommu_type1
vfio_pci

cat /etc/modprobe.d/pve-blacklist.conf
blacklist nvidiafb
blacklist amdgpu

cat /etc/modprobe.d/amdgpu.conf
softdep amdgpu pre: vfio-pci

cat /etc/modprobe.d/vfio.conf
options vfio-pci ids=1002:1681

lspci -k | grep -A 3 VGA
06:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt [Radeon 680M] (rev 0a)
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt [Radeon 680M]
        Kernel driver in use: vfio-pci
        Kernel modules: amdgpu
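
The driver binding shown by `lspci -k` can also be cross-checked directly in sysfs, together with the IOMMU group the device landed in. This is a rough sketch; the helper name `pci_binding_info` is hypothetical, and it assumes the usual sysfs layout where `driver` and `iommu_group` are symlinks inside the device directory.

```shell
#!/bin/sh
# Print the bound driver and IOMMU group of a PCI device.
# $1: sysfs device directory (defaults to the passed-through GPU).
pci_binding_info() {
    dev_dir="${1:-/sys/bus/pci/devices/0000:06:00.0}"
    # Both entries are symlinks; the last path component is the name.
    if [ -L "$dev_dir/driver" ]; then
        driver=$(basename "$(readlink "$dev_dir/driver")")
    else
        driver=none
    fi
    if [ -L "$dev_dir/iommu_group" ]; then
        group=$(basename "$(readlink "$dev_dir/iommu_group")")
    else
        group=none
    fi
    echo "driver=$driver iommu_group=$group"
}
```

On the host, `pci_binding_info` with no argument should report `driver=vfio-pci` if the configuration above took effect. An `iommu_group` of `none` would suggest the IOMMU is not active.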
 
journalctl -p err -f
Code:
Dec 24 05:22:43 yoyoyaya kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.SBRG.EC0], AE_NOT_FOUND (20230628/dswload2-162)
Dec 24 05:22:43 yoyoyaya kernel: ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20230628/psobject-220)
Dec 24 05:22:43 yoyoyaya kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.SBRG.EC0.OKEC], AE_NOT_FOUND (20230628/psargs-330)
Dec 24 05:22:43 yoyoyaya kernel: ACPI Error: Aborting method \_SB.GPIO._EVT due to previous error (AE_NOT_FOUND) (20230628/psparse-529)
Dec 24 05:22:43 yoyoyaya kernel: hub 6-0:1.0: config failed, hub doesn't have any ports! (err -19)
Dec 24 05:22:45 yoyoyaya smartd[817]: Device: /dev/sda [SAT], no ATA CHECK POWER STATUS support, ignoring -n Directive
Dec 24 05:22:45 yoyoyaya smartd[817]: Device: /dev/sdb [SAT], no ATA CHECK POWER STATUS support, ignoring -n Directive
Dec 24 05:42:07 yoyoyaya kernel: Fixing recursive fault but reboot is needed!
Dec 24 05:42:07 yoyoyaya kernel: BUG: scheduling while atomic: pveproxy worker/1174/0x00000000
Dec 24 05:43:07 yoyoyaya kernel: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
Dec 24 05:43:07 yoyoyaya kernel: rcu:         Tasks blocked on level-0 rcu_node (CPUs 0-15): P1174/1:b..l
Dec 24 05:43:07 yoyoyaya kernel: rcu:         (detected by 12, t=60002 jiffies, g=57529, q=4663 ncpus=16)
Dec 24 05:46:07 yoyoyaya kernel: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
Dec 24 05:46:07 yoyoyaya kernel: rcu:         Tasks blocked on level-0 rcu_node (CPUs 0-15): P1174/1:b..l
Dec 24 05:46:07 yoyoyaya kernel: rcu:         (detected by 12, t=240007 jiffies, g=57529, q=10476 ncpus=16)