Hi all, hopefully someone can point me in the right direction.
This is the error:
error writing '1' to '/sys/bus/pci/devices/0000:03:00.0/reset': Inappropriate ioctl for device
failed to reset PCI device '0000:03:00.0', but trying to continue as not all devices need a reset
swtpm_setup: Not overwriting existing state file.
stopping swtpm instance (pid 3392) due to QEMU startup error
TASK ERROR: start failed: QEMU exited with code 1
This happens after the following steps.
I managed to get vendor-reset added as shown:
cat /etc/modules
vfio
vfio_iommu_type1
vfio_pci
vendor-reset
This is confirmed by the following (although the kernel is tainted, presumably due to adding vendor-reset?):
dmesg | grep vendor_reset
[ 8.081690] vendor_reset: loading out-of-tree module taints kernel.
[ 8.081693] vendor_reset: module verification failed: signature and/or required key missing - tainting kernel
[ 8.097532] vendor_reset_hook: installed
lsmod | grep vendor
vendor_reset 110592 0
However, the following shows that the device_specific reset method from vendor-reset is not working:
journalctl -b 0 | grep vfio-pci
vfio-pci 0000:03:00.0: vgaarb: deactivate vga console
vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
Unsupported reset method 'device_specific'
journalctl -b 0 | grep reset
kernel: vendor_reset: loading out-of-tree module taints kernel.
kernel: vendor_reset: module verification failed: signature and/or required key missing - tainting kernel
kernel: vendor_reset_hook: installed
systemd[1]: Started vreset.service - AMD GPU reset method to 'device_specific'.
systemd[1]: vreset.service: Main process exited, code=exited, status=1/FAILURE
systemd[1]: vreset.service: Failed with result 'exit-code'.
kernel: vfio-pci 0000:03:00.0: Unsupported reset method 'device_specific'
The following is the code for setting the device_specific method:
cat << EOF >> /etc/systemd/system/vreset.service
[Unit]
Description=AMD GPU reset method to 'device_specific'
After=multi-user.target
[Service]
ExecStart=/usr/bin/bash -c 'echo device_specific > /sys/bus/pci/devices/0000:03:00.0/reset_method'
[Install]
WantedBy=multi-user.target
EOF
systemctl enable vreset.service && systemctl start vreset.service
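One thing I noticed afterwards: the heredoc above appends with `>>`, so running it twice would duplicate the unit contents, and the service has no explicit type. Below is a minimal sketch of a tidier variant, assuming `Type=oneshot` is appropriate for a single write. The function name `write_vreset_unit` and its path argument are my own, for illustration only. This would not change whether the kernel accepts `device_specific`; it only tidies the unit file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write the unit file to the path given as $1.
# '>' overwrites instead of appending, and the quoted 'EOF' stops the
# shell from expanding anything inside the heredoc.
write_vreset_unit() {
    local unit_path="$1"
    cat > "$unit_path" << 'EOF'
[Unit]
Description=AMD GPU reset method to 'device_specific'
After=multi-user.target

[Service]
Type=oneshot
ExecStart=/usr/bin/bash -c 'echo device_specific > /sys/bus/pci/devices/0000:03:00.0/reset_method'

[Install]
WantedBy=multi-user.target
EOF
}

# Example (as root):
# write_vreset_unit /etc/systemd/system/vreset.service
# systemctl daemon-reload && systemctl enable --now vreset.service
```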
The device address (0000:03:00.0) was derived from the following information:
lspci -nnks 03:00
03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:7550] (rev c0)
Subsystem: ASUSTeK Computer Inc. Device [1043:061a]
Kernel driver in use: vfio-pci
Kernel modules: amdgpu
03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:ab40]
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:ab40]
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel
The following also shows that reset_method has not been written to:
cat /sys/bus/pci/devices/0000:03:00.0/reset_method
bus
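As far as I understand, reading `reset_method` returns the reset methods currently enabled for the device as a space-separated list. A minimal sketch for checking whether a given method appears in that list is below. The function name `has_reset_method` and its arguments are my own, for illustration, and not part of any tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a reset method is listed in a
# reset_method file (assumed to hold a space-separated list).
# $1 = method name, $2 = path to the reset_method file.
# Returns 0 if listed, 1 if not listed, 2 if the file is unreadable.
has_reset_method() {
    local method="$1" file="$2" m
    [ -r "$file" ] || return 2
    for m in $(cat "$file"); do
        if [ "$m" = "$method" ]; then
            return 0
        fi
    done
    return 1
}

# Example, using the device address from this post:
# if has_reset_method device_specific /sys/bus/pci/devices/0000:03:00.0/reset_method; then
#     echo "device_specific is listed"
# else
#     echo "device_specific is not listed"
# fi
```

With the output shown above (`bus` only), this check would report that `device_specific` is not listed.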
So I figured I would try to manually overwrite bus with device_specific:
systemctl stop vreset.service
nano /sys/bus/pci/devices/0000:03:00.0/reset_method
This is where I ran into the locked status:
[ Error writing lock file /sys/bus/pci/devices/0000:03:00.0/.reset_method.swp: Permission denied ]
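The error itself suggests that nano tried to create a lock file next to the attribute, which sysfs does not permit. Attributes under /sys are normally written with a shell redirect instead of an editor. A minimal sketch of that, which also reports whether the write was accepted, is below. The function name `write_attr` is my own, for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a value to an attribute file with a single
# redirect and report whether the write was accepted.
# $1 = value, $2 = path to the attribute file.
write_attr() {
    local value="$1" file="$2"
    if printf '%s\n' "$value" > "$file"; then
        # Read the file back to show what it now contains.
        echo "accepted: $(cat "$file")"
    else
        echo "rejected: $value" >&2
        return 1
    fi
}

# Example (as root), using the device address from this post:
# write_attr device_specific /sys/bus/pci/devices/0000:03:00.0/reset_method
```

Given the "Unsupported reset method" kernel message earlier in this post, I would expect this write to be rejected on my system as well, but it at least avoids the lock file problem.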
Maybe I am overlooking something? There is so much conflicting information that I am not sure whether what I am doing is correct. It seems I am so close and yet so far.
Additional info:
cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-6.8.12-10-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt hugepagesz=2M hugepages=24576 nomodeset pcie_acs_override=downstream
cat /etc/modprobe.d/vfio.conf
options vfio-pci ids=1002:7550,1002:ab40 disable_vga=1
dmesg | grep -e DMAR -e IOMMU
[ 0.000000] Warning: PCIe ACS overrides enabled; This may allow non-IOMMU protected peer-to-peer DMA
[ 0.019803] ACPI: DMAR 0x0000000042767000 000050 (v02 INTEL EDK2 00000002 01000013)
[ 0.019833] ACPI: Reserving DMAR table memory at [mem 0x42767000-0x4276704f]
[ 0.112687] DMAR: IOMMU enabled
[ 0.240798] DMAR: Host address width 39
[ 0.240798] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[ 0.240804] DMAR: dmar0: reg_base_addr fed91000 ver 5:0 cap d2008c40660462 ecap f050da
[ 0.240806] DMAR-IR: IOAPIC id 2 under DRHD base 0xfed91000 IOMMU 0
[ 0.240807] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[ 0.240807] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 0.242233] DMAR-IR: Enabled IRQ remapping in x2apic mode
[ 7.176231] DMAR: No RMRR found
[ 7.176232] DMAR: No ATSR found
[ 7.176232] DMAR: No SATC found
[ 7.176239] DMAR: dmar0: Using Queued invalidation
[ 7.176669] DMAR: Intel(R) Virtualization Technology for Directed I/O
cat /etc/pve/qemu-server/800.conf
bios: ovmf
boot: order=virtio0
cores: 20
cpu: host,flags=+pcid
efidisk0: zfspool:vm-800-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M
hostpci0: 0000:03:00,pcie=1
machine: pc-q35-9.2+pve1
memory: 49152
meta: creation-qemu=9.2.0,ctime=1747375148
name: win11
net0: virtio=BC:24:11:7E:72:86,bridge=vmbr0,firewall=1
numa: 0
ostype: win11
scsihw: virtio-scsi-single
smbios1: uuid=c0a45d16-e5a1-411c-87c4-c168cd0f6a0a
sockets: 1
tpmstate0: zfspool:vm-800-disk-1,size=4M,version=v2.0
vga: none
virtio0: zfspool:vm-800-disk-2,cache=writethrough,iothread=1,size=128G
vmgenid: 5acfade8-67fa-4319-9535-ca32db66e29c