M.2 wireless card passthrough failing with "vfio: error disconnecting group"

Lemure

Member
Mar 7, 2021
Hi,

I am trying to add a new wireless card and pass it through to an OpenWrt VM. I have done this successfully in the past with a USB Wi-Fi card. Now I am trying to use an M.2 Wi-Fi card, but the passthrough is failing. The system is a Zen 1 embedded device, and IOMMU is enabled in the BIOS.

Initially the passthrough failed because the card's driver was loaded on the host, so I blacklisted the driver by adding 'blacklist ath12k' to /etc/modprobe.d/pve-blacklist.conf (although I have read on the forums that blacklisting is not recommended, so I am open to alternatives). This led to a different error that I cannot figure out:

Code:
kvm: -device vfio-pci,host=0000:03:00.0,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0: vfio: error disconnecting group 11 from container
kvm: -device vfio-pci,host=0000:03:00.0,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0: vfio 0000:03:00.0: error getting device from group 11: No such device
Verify all devices in group 11 are bound to vfio-<bus> or pci-stub and not already in use
TASK ERROR: start failed: QEMU exited with code 1
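
The last error line asks to verify that every device in group 11 is bound to vfio. A minimal sketch of that check, reading the driver symlinks from sysfs, could look like the following. The function name group_drivers and the overridable sysfs root are assumptions made for illustration, not part of any Proxmox tooling.

```shell
# List every device in an IOMMU group together with the driver it is bound to.
# $1 = IOMMU group number, $2 = sysfs root (defaults to /sys; overridable for testing).
group_drivers() {
    group=$1
    sysfs=${2:-/sys}
    for dev in "$sysfs/kernel/iommu_groups/$group/devices/"*; do
        # An unmatched glob stays literal, so skip anything that does not exist.
        [ -e "$dev" ] || continue
        if [ -L "$dev/driver" ]; then
            # "driver" is a symlink; the last component of its target is the driver name.
            drv=$(basename "$(readlink "$dev/driver")")
        else
            drv="(none)"
        fi
        printf '%s %s\n' "$(basename "$dev")" "$drv"
    done
}
```

Running `group_drivers 11` on the host prints one line per device in the group; a device with no bound driver is shown as "(none)".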

Some relevant info:

The M.2 Wi-Fi card is alone in IOMMU group 11, so this looks fine:
Code:
cat /proc/cmdline; for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU group %s ' "$n"; lspci -nns "${d##*/}"; done
BOOT_IMAGE=/boot/vmlinuz-6.14.11-1-pve root=/dev/mapper/pve-root ro quiet
IOMMU group 0 00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Root Complex [1022:15d0]
IOMMU group 10 01:00.0 Non-Volatile memory controller [0108]: ADATA Technology Co., Ltd. XPG SX8200 Pro PCIe Gen3x4 M.2 2280 Solid State Drive [1cc1:8201] (rev 03)
IOMMU group 11 03:00.0 Network controller [0280]: Qualcomm Technologies, Inc WCN785x Wi-Fi 7(802.11be) 320MHz 2x2 [FastConnect 7800] [17cb:1107] (rev 01)
IOMMU group 12 04:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 0c)
IOMMU group 13 05:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series] [1002:15dd] (rev 83)
IOMMU group 14 05:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Raven/Raven2/Fenghuang HDMI/DP Audio Controller [1002:15de]
IOMMU group 14 05:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor [1022:15df]
IOMMU group 14 05:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Raven USB 3.1 [1022:15e0]
IOMMU group 14 05:00.4 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Raven USB 3.1 [1022:15e1]
IOMMU group 14 05:00.5 Multimedia controller [0480]: Advanced Micro Devices, Inc. [AMD] Audio Coprocessor [1022:15e2]
IOMMU group 14 05:00.6 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h/19h/1ah HD Audio Controller [1022:15e3]
IOMMU group 14 05:00.7 Non-VGA unclassified device [0000]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2/Renoir Non-Sensor Fusion Hub KMDF driver [1022:15e6]
IOMMU group 1 00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 2 00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 PCIe GPP Bridge [6:0] [1022:15d3]
IOMMU group 3 00:01.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 PCIe GPP Bridge [6:0] [1022:15d3]
IOMMU group 4 00:01.6 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 PCIe GPP Bridge [6:0] [1022:15d3]
IOMMU group 5 00:01.7 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 PCIe GPP Bridge [6:0] [1022:15d3]
IOMMU group 6 00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 6 00:08.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Internal PCIe GPP Bridge 0 to Bus B [1022:15dc]
IOMMU group 6 06:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 61)
IOMMU group 7 00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Internal PCIe GPP Bridge 0 to Bus A [1022:15db]
IOMMU group 8 00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 61)
IOMMU group 8 00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU group 9 00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 0 [1022:15e8]
IOMMU group 9 00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 1 [1022:15e9]
IOMMU group 9 00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 2 [1022:15ea]
IOMMU group 9 00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 3 [1022:15eb]
IOMMU group 9 00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 4 [1022:15ec]
IOMMU group 9 00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 5 [1022:15ed]
IOMMU group 9 00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 6 [1022:15ee]
IOMMU group 9 00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 Device 24: Function 7 [1022:15ef]

Code:
root@pve:~# dmesg | grep -i -e DMAR -e IOMMU -e AMD-Vi
[    0.138720] AMD-Vi: Using global IVHD EFR:0x4f77ef22294ada, EFR2:0x0
[    0.399548] iommu: Default domain type: Translated
[    0.399548] iommu: DMA domain TLB invalidation policy: lazy mode
[    0.447926] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.448031] pci 0000:00:00.0: Adding to iommu group 0
[    0.448085] pci 0000:00:01.0: Adding to iommu group 1
[    0.448117] pci 0000:00:01.1: Adding to iommu group 2
[    0.448145] pci 0000:00:01.2: Adding to iommu group 3
[    0.448172] pci 0000:00:01.6: Adding to iommu group 4
[    0.448199] pci 0000:00:01.7: Adding to iommu group 5
[    0.448249] pci 0000:00:08.0: Adding to iommu group 6
[    0.448276] pci 0000:00:08.1: Adding to iommu group 7
[    0.448299] pci 0000:00:08.2: Adding to iommu group 6
[    0.448338] pci 0000:00:14.0: Adding to iommu group 8
[    0.448358] pci 0000:00:14.3: Adding to iommu group 8
[    0.448468] pci 0000:00:18.0: Adding to iommu group 9
[    0.448489] pci 0000:00:18.1: Adding to iommu group 9
[    0.448511] pci 0000:00:18.2: Adding to iommu group 9
[    0.448532] pci 0000:00:18.3: Adding to iommu group 9
[    0.448553] pci 0000:00:18.4: Adding to iommu group 9
[    0.448575] pci 0000:00:18.5: Adding to iommu group 9
[    0.448596] pci 0000:00:18.6: Adding to iommu group 9
[    0.448617] pci 0000:00:18.7: Adding to iommu group 9
[    0.448646] pci 0000:01:00.0: Adding to iommu group 10
[    0.448673] pci 0000:03:00.0: Adding to iommu group 11
[    0.448700] pci 0000:04:00.0: Adding to iommu group 12
[    0.448747] pci 0000:05:00.0: Adding to iommu group 13
[    0.448866] pci 0000:05:00.1: Adding to iommu group 14
[    0.448913] pci 0000:05:00.2: Adding to iommu group 14
[    0.448955] pci 0000:05:00.3: Adding to iommu group 14
[    0.448997] pci 0000:05:00.4: Adding to iommu group 14
[    0.449038] pci 0000:05:00.5: Adding to iommu group 14
[    0.449080] pci 0000:05:00.6: Adding to iommu group 14
[    0.449132] pci 0000:05:00.7: Adding to iommu group 14
[    0.449144] pci 0000:06:00.0: Adding to iommu group 6
[    0.451652] AMD-Vi: Extended features (0x4f77ef22294ada, 0x0): PPR NX GT IA GA PC GA_vAPIC
[    0.451673] AMD-Vi: Interrupt remapping enabled
[    0.451992] AMD-Vi: Virtual APIC enabled
[    0.452108] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[    1.675719] AMD-Vi: Event logged [IO_PAGE_FAULT device=0000:00:13.1 domain=0x0000 address=0x9e51d200 flags=0x0050]
[    1.761818] AMD-Vi: Event logged [IO_PAGE_FAULT device=0000:00:13.1 domain=0x0000 address=0x9e51d200 flags=0x0050]
[    1.854046] AMD-Vi: Event logged [IO_PAGE_FAULT device=0000:00:13.1 domain=0x0000 address=0x9e51d200 flags=0x0050]
[    1.937752] nvme 0000:01:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xfedfc000 flags=0x0000]
[    1.937770] nvme 0000:01:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000c address=0xfedfc080 flags=0x0000]
[    1.961211] AMD-Vi: Event logged [IO_PAGE_FAULT device=0000:00:13.1 domain=0x0000 address=0x9e51d200 flags=0x0050]

This is the relevant part of lspci -knn just after reboot, before trying to start the VM:
Code:
03:00.0 Network controller [0280]: Qualcomm Technologies, Inc WCN785x Wi-Fi 7(802.11be) 320MHz 2x2 [FastConnect 7800] [17cb:1107] (rev 01)
        Subsystem: Quectel Wireless Solutions Co., Ltd. Device [1eac:8000]
        Kernel modules: ath12k
And this is the relevant part of the output of the same command after trying and failing to start the VM:
Code:
03:00.0 Network controller [0280]: Qualcomm Technologies, Inc WCN785x Wi-Fi 7(802.11be) 320MHz 2x2 [FastConnect 7800] [17cb:1107] (rev 01)
        Subsystem: Quectel Wireless Solutions Co., Ltd. Device [1eac:8000]
        Kernel driver in use: vfio-pci
        Kernel modules: ath12k
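
Regarding alternatives to blacklisting: one commonly described approach is a modprobe.d fragment that gives vfio-pci the device ID and declares a soft dependency, so that vfio-pci is loaded before ath12k. The sketch below only writes such a file. The helper name write_vfio_conf and the file name vfio-passthrough.conf are hypothetical, and it assumes vfio-pci is built as a module on this kernel. Since vfio-pci already ends up bound after the failed start, this would not by itself explain the "No such device" error.

```shell
# Write a modprobe.d fragment that lets vfio-pci claim a device ahead of its regular driver.
# $1 = PCI vendor:device ID, $2 = host driver module, $3 = target directory (default /etc/modprobe.d).
# The file name vfio-passthrough.conf is a hypothetical choice.
write_vfio_conf() {
    id=$1
    module=$2
    dir=${3:-/etc/modprobe.d}
    cat > "$dir/vfio-passthrough.conf" <<EOF
# Let vfio-pci claim the device with this vendor:device ID.
options vfio-pci ids=$id
# Load vfio-pci before $module so that it can bind first.
softdep $module pre: vfio-pci
EOF
}
```

For this card the call would be `write_vfio_conf 17cb:1107 ath12k`, using the ID from the lspci output. On Proxmox VE the initramfs normally has to be regenerated with `update-initramfs -u -k all`, followed by a reboot, before such a change takes effect.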

Any idea on how to make the passthrough work?
 
I have tried to make vfio-pci take control of the M.2 Wi-Fi card before the host driver does by adding vfio-pci.ids=xxxx:xxxx to the kernel boot parameters, but that does not work. In fact, my PCI Ethernet card has now stopped working in passthrough as well: for some reason it is now being claimed by the host instead of vfio-pci, whereas its passthrough was working fine before I tried any of this. Reverting to the original configuration has not fixed the PCI Ethernet card.
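
One thing worth ruling out is whether the parameter actually reaches the running kernel. On Proxmox VE the command line is taken from /etc/default/grub (applied with update-grub) or from /etc/kernel/cmdline (applied with proxmox-boot-tool refresh), depending on the bootloader, so an edit in the wrong place has no effect. A small sketch for checking /proc/cmdline follows; the function name vfio_ids_from_cmdline is made up for this example.

```shell
# Print the value of vfio-pci.ids from a kernel command line, or nothing if it is absent.
# $1 = path to the command-line file (defaults to /proc/cmdline).
vfio_ids_from_cmdline() {
    file=${1:-/proc/cmdline}
    # Put each parameter on its own line, then strip the key from the matching one.
    # The kernel accepts both "vfio-pci" and "vfio_pci" spellings of the module name.
    tr ' ' '\n' < "$file" | sed -n 's/^vfio[-_]pci\.ids=//p'
}
```

Calling `vfio_ids_from_cmdline` with no argument after a reboot prints the ID list the kernel was booted with; empty output means the parameter was not applied.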

I understand this is more of a Linux problem than a Proxmox problem, so can anyone point me to how to debug this issue, or to another forum where this kind of thing is looked into in more depth? Thanks.