AM5, X670E, IOMMU, and ACS(?) - running in circles and could use advice

sicklyboy

Hey folks, I've recently done some hardware upgrades to one of my systems and have been having quite a hard time with what I think is an IOMMU grouping issue affecting the devices I need to pass through for my use case (full disclosure: this is a homelab, not a prod stack).


Some details on my system:
Mobo: ASRock X670E Steel Legend (BIOS 3.50 from 2025-10-2, the latest release on the ASRock site at the time of this post)
CPU: AMD Ryzen 9 9900X3D (microcode: Current revision: 0x0b404032)
RAM: 2x 48GB Teamgroup 6400 DDR5 CL32
GPU: Gigabyte RTX 2080 Super (in addition to the 9900X3D onboard graphics)
Additional devices: 4x various nvme disks, 2x sata ssds for boot, 2x spinners for future bulk storage, 1x HP NC552SFP dual-port 10G SFP+ NIC

PVE info:
Code:
root@deadbit-vm-103:~# pveversion -v
proxmox-ve: 9.0.0 (running kernel: 6.14.11-4-pve)
pve-manager: 9.0.11 (running version: 9.0.11/3bf5476b8a4699e2)
proxmox-kernel-helper: 9.0.4
proxmox-kernel-6.14.11-4-pve-signed: 6.14.11-4
[truncated for brevity]
root@deadbit-vm-103:~# cat /etc/kernel/cmdline
root=ZFS=rpool/ROOT/pve-1 boot=zfs iommu=pt

Use case for my system: This is my desktop PC, joined to my 3-node Proxmox cluster. Day to day I run an Arch Linux VM on it and use PCIe passthrough to pass through my RTX 2080 Super, the onboard wireless chipset, an NVMe disk, and a USB root hub from the mobo (with a number of downstream physical multi-port USB hubs connected). This VM is what I actually use to interact with the PC day to day: gaming, etc., with mouse/keyboard, monitors, and various other connected peripherals. I also run other guest VMs with no hardware passthrough on this node as needed, which is particularly useful when running maintenance on other nodes.

Problem: Since upgrading to the X670E Steel Legend + 9900X3D (previously a Gigabyte X570 board + Ryzen 9 3950X, on which this setup worked perfectly), the Arch VM has been crashing and automatically restarting every 1.5-2 days on average. I finally caught it right when it happened last night and saw this in the logs on the host:

Code:
Nov 01 02:42:12 deadbit-vm-103 pveproxy[2368]: worker 1699680 finished
Nov 01 02:42:12 deadbit-vm-103 pveproxy[2368]: starting 1 worker(s)
Nov 01 02:42:12 deadbit-vm-103 pveproxy[2368]: worker 1720958 started
Nov 01 02:42:13 deadbit-vm-103 pveproxy[1720957]: worker exit
Nov 01 02:42:15 deadbit-vm-103 kernel: vfio-pci 0000:01:00.0: resetting
Nov 01 02:42:15 deadbit-vm-103 kernel: vfio-pci 0000:01:00.1: resetting
Nov 01 02:42:15 deadbit-vm-103 kernel: vfio-pci 0000:01:00.2: resetting
Nov 01 02:42:15 deadbit-vm-103 kernel: vfio-pci 0000:01:00.3: resetting
Nov 01 02:42:15 deadbit-vm-103 kernel: vfio-pci 0000:01:00.0: reset done
Nov 01 02:42:15 deadbit-vm-103 kernel: vfio-pci 0000:01:00.1: reset done
Nov 01 02:42:15 deadbit-vm-103 kernel: vfio-pci 0000:01:00.2: reset done
Nov 01 02:42:15 deadbit-vm-103 kernel: vfio-pci 0000:01:00.3: reset done
Nov 01 02:42:15 deadbit-vm-103 kernel: vfio-pci 0000:05:00.0: resetting
Nov 01 02:42:15 deadbit-vm-103 kernel: vfio-pci 0000:05:00.0: reset done
Nov 01 02:42:15 deadbit-vm-103 kernel: vfio-pci 0000:08:00.0: resetting
Nov 01 02:42:15 deadbit-vm-103 kernel: vfio-pci 0000:08:00.0: reset done
Nov 01 02:42:15 deadbit-vm-103 kernel: vfio-pci 0000:14:00.0: resetting
Nov 01 02:42:15 deadbit-vm-103 kernel: vfio-pci 0000:14:00.0: reset done
Nov 01 02:42:15 deadbit-vm-103 kernel: vfio-pci 0000:01:00.0: resetting
Nov 01 02:42:16 deadbit-vm-103 kernel: vfio-pci 0000:01:00.0: reset done
Nov 01 02:42:16 deadbit-vm-103 kernel: vfio-pci 0000:01:00.0: resetting
Nov 01 02:42:16 deadbit-vm-103 kernel: vfio-pci 0000:01:00.1: resetting
Nov 01 02:42:16 deadbit-vm-103 kernel: vfio-pci 0000:01:00.2: resetting
Nov 01 02:42:16 deadbit-vm-103 kernel: vfio-pci 0000:01:00.3: resetting
Nov 01 02:42:16 deadbit-vm-103 kernel: vfio-pci 0000:01:00.0: reset done
Nov 01 02:42:16 deadbit-vm-103 kernel: vfio-pci 0000:01:00.1: reset done
Nov 01 02:42:16 deadbit-vm-103 kernel: vfio-pci 0000:01:00.2: reset done
Nov 01 02:42:16 deadbit-vm-103 kernel: vfio-pci 0000:01:00.3: reset done
Nov 01 02:42:16 deadbit-vm-103 kernel: vfio-pci 0000:05:00.0: resetting
Nov 01 02:42:16 deadbit-vm-103 kernel: vfio-pci 0000:05:00.0: reset done
Nov 01 02:42:16 deadbit-vm-103 kernel: vfio-pci 0000:08:00.0: resetting
Nov 01 02:42:16 deadbit-vm-103 kernel: vfio-pci 0000:08:00.0: reset done
Nov 01 02:42:16 deadbit-vm-103 kernel: vfio-pci 0000:14:00.0: resetting
Nov 01 02:42:16 deadbit-vm-103 kernel: vfio-pci 0000:14:00.0: reset done
Nov 01 02:42:16 deadbit-vm-103 kernel: vfio-pci 0000:01:00.0: resetting
Nov 01 02:42:16 deadbit-vm-103 kernel: vfio-pci 0000:01:00.0: reset done
Nov 01 02:42:20 deadbit-vm-103 kernel: vfio-pci 0000:14:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0023 address=0xe7a00 flags=0x0000]
Nov 01 02:42:27 deadbit-vm-103 kernel: kvm_do_msr_access: 29 callbacks suppressed
Nov 01 02:42:27 deadbit-vm-103 kernel: kvm: kvm [2569]: ignored rdmsr: 0xc0000410 data 0x0
Nov 01 02:42:27 deadbit-vm-103 kernel: kvm: kvm [2569]: ignored wrmsr: 0xc0000410 data 0x22
Nov 01 02:42:27 deadbit-vm-103 kernel: kvm: kvm [2569]: ignored rdmsr: 0xc0000410 data 0x0
Nov 01 02:42:27 deadbit-vm-103 kernel: kvm: kvm [2569]: ignored wrmsr: 0xc0000410 data 0x22
Nov 01 02:42:27 deadbit-vm-103 kernel: kvm: kvm [2569]: ignored rdmsr: 0xc0000410 data 0x0
Nov 01 02:42:27 deadbit-vm-103 kernel: kvm: kvm [2569]: ignored wrmsr: 0xc0000410 data 0x22
Nov 01 02:42:27 deadbit-vm-103 kernel: kvm: kvm [2569]: ignored rdmsr: 0xc0000410 data 0x0
Nov 01 02:42:27 deadbit-vm-103 kernel: kvm: kvm [2569]: ignored wrmsr: 0xc0000410 data 0x22
Nov 01 02:42:28 deadbit-vm-103 kernel: kvm: kvm [2569]: ignored rdmsr: 0xc0000410 data 0x0
Nov 01 02:42:28 deadbit-vm-103 kernel: kvm: kvm [2569]: ignored wrmsr: 0xc0000410 data 0x22
Nov 01 02:42:34 deadbit-vm-103 kernel: kvm_do_msr_access: 30 callbacks suppressed
Nov 01 02:42:34 deadbit-vm-103 kernel: kvm: kvm [2569]: ignored rdmsr: 0x3a data 0x0
Nov 01 02:42:34 deadbit-vm-103 kernel: kvm: kvm [2569]: ignored rdmsr: 0xd90 data 0x0
Nov 01 02:42:34 deadbit-vm-103 kernel: kvm: kvm [2569]: ignored rdmsr: 0x122 data 0x0
Nov 01 02:42:34 deadbit-vm-103 kernel: kvm: kvm [2569]: ignored rdmsr: 0x570 data 0x0
Nov 01 02:42:34 deadbit-vm-103 kernel: kvm: kvm [2569]: ignored rdmsr: 0x571 data 0x0
Nov 01 02:42:34 deadbit-vm-103 kernel: kvm: kvm [2569]: ignored rdmsr: 0x572 data 0x0
Nov 01 02:42:34 deadbit-vm-103 kernel: kvm: kvm [2569]: ignored rdmsr: 0x560 data 0x0
Nov 01 02:42:34 deadbit-vm-103 kernel: kvm: kvm [2569]: ignored rdmsr: 0x561 data 0x0
Nov 01 02:42:34 deadbit-vm-103 kernel: kvm: kvm [2569]: ignored rdmsr: 0x580 data 0x0
Nov 01 02:42:34 deadbit-vm-103 kernel: kvm: kvm [2569]: ignored rdmsr: 0x581 data 0x0
Nov 01 02:43:00 deadbit-vm-103 pmxcfs[1959]: [status] notice: received log
Nov 01 02:43:04 deadbit-vm-103 pmxcfs[1959]: [status] notice: received log
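For anyone wanting to check whether the page faults always come from the same device, here is a minimal sketch that tallies IO_PAGE_FAULT events per PCI address. It assumes the log line format shown above; `count_page_faults` is just an illustrative helper name, not part of any tool.

```shell
#!/usr/bin/env bash
# Tally AMD-Vi IO_PAGE_FAULT events per PCI address from log text on stdin.
# Assumption: lines look like the one in the log above, e.g.
#   "... kernel: vfio-pci 0000:14:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT ..."
count_page_faults() {
    # Extract the PCI address that precedes "AMD-Vi: Event logged [IO_PAGE_FAULT".
    sed -nE 's/.* ([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): AMD-Vi: Event logged \[IO_PAGE_FAULT.*/\1/p' \
        | sort | uniq -c | awk '{ print $2, $1 }'
}

# Example usage on the host (prints "<address> <count>" per device):
#   journalctl -k --since "2 days ago" | count_page_faults
```

If only one address ever shows up (here that would be 0000:14:00.0, the passed-through USB controller), that may help narrow down which device to look at first.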

After some googling, the "AMD-Vi: Event logged [IO_PAGE_FAULT" bit has led me to suspect that this is related to IOMMU group conflicts, and this is where I'm starting to get lost in the weeds. I can see via the Datacenter -> Resource Mappings workflow that each passed-through PCIe device other than the NVIDIA GPU is in its own IOMMU group along with a corresponding 600 Series Chipset PCIe Switch Downstream Port. However, those "downstream port" devices do not show up in the guest-level PCI passthrough selection, and if I create a resource mapping for the Downstream Port device, the VM fails to start with:

Code:
kvm: -device vfio-pci,host=0000:04:00.0,id=hostpci4,bus=pci.2,addr=0xd: vfio 0000:04:00.0: error getting device from group 17: No such device
Verify all devices in group 17 are bound to vfio-<bus> or pci-stub and not already in use
TASK ERROR: start failed: QEMU exited with code 1
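The error refers to the other members of group 17 and the drivers they are bound to. A rough sketch for listing that from sysfs follows; it assumes the standard Linux sysfs layout (`/sys/bus/pci/devices/<addr>/iommu_group/devices`), and `group_members` is a made-up helper name. The optional second argument only exists so the function can be pointed at a different root for testing.

```shell
#!/usr/bin/env bash
# Print every device in the same IOMMU group as the given PCI address,
# together with the kernel driver it is currently bound to.
# Usage: group_members 0000:05:00.0 [sysfs_root]
group_members() {
    local addr=$1 root=${2:-/sys}
    local group_dir="$root/bus/pci/devices/$addr/iommu_group"
    if [ ! -d "$group_dir/devices" ]; then
        echo "no IOMMU group found for $addr" >&2
        return 1
    fi
    local dev name driver
    for dev in "$group_dir"/devices/*; do
        [ -e "$dev" ] || continue
        name=${dev##*/}
        if [ -e "$root/bus/pci/devices/$name/driver" ]; then
            # The "driver" entry is a symlink whose last component is the driver name.
            driver=$(basename "$(readlink "$root/bus/pci/devices/$name/driver")")
        else
            driver="(none)"
        fi
        printf '%s %s\n' "$name" "$driver"
    done
}

# Example usage on the host:
#   group_members 0000:05:00.0
```

This only reports what shares the group and which driver each member uses; it does not by itself say whether the grouping is the cause of the crashes.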

And at this point, I'm not sure where to go. The BIOS doesn't expose an ACS option as far as I can see. I understand ACS is needed to break IOMMU groups down further, but I'm not sure whether I need that in my case. I'm not even sure whether this issue is caused by the "downstream port" sharing an IOMMU group with the device connected under it, or whether I'm chasing the wrong thing. If it is ACS, I suppose I need a different motherboard, and it seems that consumer-grade AM5 mobos with ACS support aren't very common? And, especially as I host services that are exposed externally, I'm reluctant to enable the ACS override patch due to the security implications.
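Whether a given port advertises ACS can be checked from the OS regardless of what the BIOS menu shows, since `lspci -vvv` (run as root) lists an "Access Control Services" capability with `ACSCap`/`ACSCtl` lines when one is present. A small sketch that summarizes this for one device, with `acs_status` being a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Read `lspci -vvv` output for ONE device on stdin and summarize its ACS state.
# Relies on lspci printing "Access Control Services" and an "ACSCtl:" line
# for devices that expose the capability.
acs_status() {
    awk '
        /Access Control Services/ { found = 1 }
        found && /ACSCtl:/        { sub(/^[ \t]+/, ""); ctl = $0 }
        END {
            if (!found)         print "ACS: not advertised"
            else if (ctl != "") print "ACS: advertised; " ctl
            else                print "ACS: advertised; ACSCtl not shown"
        }'
}

# Example usage on the host, for one of the chipset downstream ports:
#   lspci -vvv -s 04:00.0 | acs_status
```

Running it without root may hide extended capabilities, in which case "not advertised" would not be conclusive.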

I've run into so many snags along the way with this hardware upgrade and I'm wishing I never did it. Unfortunately I'm beyond the return period for this CPU now and with how expensive hardware has gotten recently I'm deep enough in the hole that what's a few hundred more for another mobo, right :rolleyes:? I'm lost and feel like I'm going in circles with trying to figure this out, and I would be overwhelmingly grateful for any insight that anyone has to offer.
 
Some additional information that was too long to fit in the original post (I apologize if double posting is a faux pas, and doubly apologize for the absolute deluge of information I've likely over-shared)

IOMMU groupings w/ PCIe devices on the host:
Code:
root@deadbit-vm-103:~# ./iommucheck.sh
IOMMU Group 0 00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Root Complex [1022:14d8]
IOMMU Group 10 00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Internal GPP Bridge to Bus [C:A] [1022:14dd]
IOMMU Group 11 00:08.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Internal GPP Bridge to Bus [C:A] [1022:14dd]
IOMMU Group 12 00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 71)
IOMMU Group 12 00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU Group 13 00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Data Fabric; Function 0 [1022:14e0]
IOMMU Group 13 00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Data Fabric; Function 1 [1022:14e1]
IOMMU Group 13 00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Data Fabric; Function 2 [1022:14e2]
IOMMU Group 13 00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Data Fabric; Function 3 [1022:14e3]
IOMMU Group 13 00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Data Fabric; Function 4 [1022:14e4]
IOMMU Group 13 00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Data Fabric; Function 5 [1022:14e5]
IOMMU Group 13 00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Data Fabric; Function 6 [1022:14e6]
IOMMU Group 13 00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Data Fabric; Function 7 [1022:14e7]
IOMMU Group 14 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU104 [GeForce RTX 2080 SUPER] [10de:1e81] (rev a1)
IOMMU Group 14 01:00.1 Audio device [0403]: NVIDIA Corporation TU104 HD Audio Controller [10de:10f8] (rev a1)
IOMMU Group 14 01:00.2 USB controller [0c03]: NVIDIA Corporation TU104 USB 3.1 Host Controller [10de:1ad8] (rev a1)
IOMMU Group 14 01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU104 USB Type-C UCSI Controller [10de:1ad9] (rev a1)
IOMMU Group 15 02:00.0 Ethernet controller [0200]: NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter [4040:0100] (rev 42)
IOMMU Group 15 02:00.1 Ethernet controller [0200]: NetXen Incorporated NX3031 Multifunction 1/10-Gigabit Server Adapter [4040:0100] (rev 42)
IOMMU Group 16 03:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Upstream Port [1022:43f4] (rev 01)
IOMMU Group 17 04:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port [1022:43f5] (rev 01)
IOMMU Group 17 05:00.0 Non-Volatile memory controller [0108]: Phison Electronics Corporation E16 PCIe4 NVMe Controller [1987:5016] (rev 01)
IOMMU Group 18 04:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port [1022:43f5] (rev 01)
IOMMU Group 18 06:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 15)
IOMMU Group 19 04:05.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port [1022:43f5] (rev 01)
IOMMU Group 1 00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Dummy Host Bridge [1022:14da]
IOMMU Group 20 04:06.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port [1022:43f5] (rev 01)
IOMMU Group 20 08:00.0 Network controller [0280]: MEDIATEK Corp. MT7922 802.11ax PCI Express Wireless Network Adapter [14c3:0616]
IOMMU Group 21 04:07.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port [1022:43f5] (rev 01)
IOMMU Group 21 09:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller [10ec:8125] (rev 05)
IOMMU Group 22 04:08.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port [1022:43f5] (rev 01)
IOMMU Group 22 0a:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Upstream Port [1022:43f4] (rev 01)
IOMMU Group 22 0b:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port [1022:43f5] (rev 01)
IOMMU Group 22 0b:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port [1022:43f5] (rev 01)
IOMMU Group 22 0b:05.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port [1022:43f5] (rev 01)
IOMMU Group 22 0b:06.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port [1022:43f5] (rev 01)
IOMMU Group 22 0b:07.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port [1022:43f5] (rev 01)
IOMMU Group 22 0b:08.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port [1022:43f5] (rev 01)
IOMMU Group 22 0b:0c.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port [1022:43f5] (rev 01)
IOMMU Group 22 0b:0d.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port [1022:43f5] (rev 01)
IOMMU Group 22 0c:00.0 Non-Volatile memory controller [0108]: Sandisk Corp SanDisk Ultra 3D / WD Blue SN570 NVMe SSD (DRAM-less) [15b7:501a]
IOMMU Group 22 11:00.0 Non-Volatile memory controller [0108]: Phison Electronics Corporation E12 NVMe Controller [1987:5012] (rev 01)
IOMMU Group 22 12:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset USB 3.2 Controller [1022:43f7] (rev 01)
IOMMU Group 22 13:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset SATA Controller [1022:43f6] (rev 01)
IOMMU Group 23 04:0c.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port [1022:43f5] (rev 01)
IOMMU Group 23 14:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset USB 3.2 Controller [1022:43f7] (rev 01)
IOMMU Group 24 04:0d.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset PCIe Switch Downstream Port [1022:43f5] (rev 01)
IOMMU Group 24 15:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 600 Series Chipset SATA Controller [1022:43f6] (rev 01)
IOMMU Group 25 16:00.0 Non-Volatile memory controller [0108]: Sandisk Corp SanDisk Ultra 3D / WD Blue SN570 NVMe SSD (DRAM-less) [15b7:501a]
IOMMU Group 26 17:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Granite Ridge [Radeon Graphics] [1002:13c0] (rev ca)
IOMMU Group 27 17:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Radeon High Definition Audio Controller [Rembrandt/Strix] [1002:1640]
IOMMU Group 28 17:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 19h PSP/CCP [1022:1649]
IOMMU Group 29 17:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge USB 3.1 xHCI [1022:15b6]
IOMMU Group 2 00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge GPP Bridge [1022:14db]
IOMMU Group 30 17:00.4 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge USB 3.1 xHCI [1022:15b7]
IOMMU Group 31 17:00.6 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h/19h/1ah HD Audio Controller [1022:15e3]
IOMMU Group 32 18:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge USB 2.0 xHCI [1022:15b8]
IOMMU Group 3 00:01.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge GPP Bridge [1022:14db]
IOMMU Group 4 00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Dummy Host Bridge [1022:14da]
IOMMU Group 5 00:02.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge GPP Bridge [1022:14db]
IOMMU Group 6 00:02.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge GPP Bridge [1022:14db]
IOMMU Group 7 00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Dummy Host Bridge [1022:14da]
IOMMU Group 8 00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Dummy Host Bridge [1022:14da]
IOMMU Group 9 00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge Dummy Host Bridge [1022:14da]
root@deadbit-vm-103:~#
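The contents of iommucheck.sh are not shown above, so the following is only a guess at an equivalent: a minimal sketch that walks `/sys/kernel/iommu_groups` and prints each group number with its device addresses, sorted numerically by group (the listing above is sorted lexically, which puts group 1 after group 19). `list_iommu_groups` is an illustrative name, and the optional argument is there only so a different directory can be used for testing.

```shell
#!/usr/bin/env bash
# List "<group> <pci-address>" pairs, sorted numerically by group number.
# Usage: list_iommu_groups [iommu_groups_dir]
list_iommu_groups() {
    local root=${1:-/sys/kernel/iommu_groups}
    local dev group
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue          # skip the unexpanded glob if nothing matches
        group=${dev#"$root"/}              # strip the leading directory
        group=${group%%/*}                 # keep only the group number
        printf '%s %s\n' "$group" "${dev##*/}"
    done | sort -k1,1n -k2,2
}

# Example usage on the host, adding lspci descriptions to each line:
#   list_iommu_groups | while read -r group addr; do
#       printf 'IOMMU Group %s %s\n' "$group" "$(lspci -nns "$addr")"
#   done
```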

My vmid.conf:
Code:
root@deadbit-vm-103:~# cat /etc/pve/qemu-server/102.conf
agent: 1
args: -cpu host,-hypervisor,kvm=off, -smbios type=0,vendor="American Megatrends International",version=F40b,date="07/11/2024"
balloon: 0
boot: order=scsi0
cores: 20
cpu: host,hidden=1
hostpci0: 0000:01:00,pcie=1,x-vga=1
hostpci1: mapping=VM-103-USB4
hostpci2: mapping=VM-103-WIFI
hostpci3: mapping=VM103-GAME-SSD
machine: q35
memory: 64000
meta: creation-qemu=9.0.0,ctime=1722472554
name: desktop-arch
net0: virtio=BC:24:11:82:17:C8,bridge=vmbr0,firewall=1,tag=750
numa: 0
ostype: l26
protection: 1
scsi0: /dev/disk/by-id/nvme-WD_Blue_SN570_1TB_232361804339,replicate=0,size=976762584K
scsihw: virtio-scsi-pci
smbios1: base64=1,family=WDU3MCBNQg==,manufacturer=R2lnYWJ5dGUgVGVjaG5vbG9neSBDby4sIEx0ZC4=,product=WDU3MCBBT1JVUyBFTElURSBXSUZJ,uuid=3ea613c3-a807-4bef-bc48-7cf3e84233e1
sockets: 1
tablet: 0
vga: virtio
vmgenid: a4b5c1d9-e9bf-45c7-823f-cf8830ecfc77
root@deadbit-vm-103:~#

The 3 mappings referenced in it are as follows:
0000:14:00.0 -> VM-103-USB4
0000:08:00.0 -> VM-103-WIFI
0000:05:00.0 -> VM103-GAME-SSD

Finally, some BIOS options that I believe may be pertinent. Where I changed them from the defaults, it hasn't led to any improvement so far:
Above 4G Decoding = enabled
Resize BAR = disabled
Above 4GB MMIO Limit = 40bit (1TB)
SR-IOV support = Enabled
IOMMU = Enabled
PCIe ARI Support = Enabled (default is auto, still crashes while Enabled)
Advanced Error Reporting (AER) = Supported (default is auto, changed as that would allegedly expose the ACS options, still crashes when set to Supported)
PCIe ARI enumeration = Enable (default is auto, changed earlier today, no crash yet but this "issue" seems to only happen every 1.5-2 days and I have not had host or VM up for that long yet after a reboot a few hours ago, so I believe it's too soon to make a determination)

Again, I would be astronomically appreciative of any suggestions, and apologize for the two walls of text
 