Passthrough not working after upgrading the Linux kernel to 6.x

a632079

New Member
Feb 2, 2023
After I upgraded the kernel to 6.1, the passthrough feature appears to be broken, and the VM can't start correctly.

The Full log: https://pastebin.ubuntu.com/p/h8NmBzFPKj/

According to the log, `vfio-pci` reports `vfio-pci 0000:02:00.0: Unable to change power state from D3cold to D0, device inaccessible`, and the VM start process then waits for a long time until `not ready 65535ms after FLR; giving up`.

root@Router-7505:~# dmesg | grep -i -e DMAR -e IOMMU
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.1.6-1-pve root=/dev/mapper/pve-root ro debug ignore_loglevel intel_iommu=on iommu=qe
[ 0.040433] ACPI: DMAR 0x000000004341A000 000088 (v02 INTEL EDK2 00000002 01000013)
[ 0.040466] ACPI: Reserving DMAR table memory at [mem 0x4341a000-0x4341a087]
[ 0.061995] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.1.6-1-pve root=/dev/mapper/pve-root ro debug ignore_loglevel intel_iommu=on iommu=qe
[ 0.062056] DMAR: IOMMU enabled
[ 0.106891] DMAR: Host address width 39
[ 0.106893] DMAR: DRHD base: 0x000000fed90000 flags: 0x0
[ 0.106899] DMAR: dmar0: reg_base_addr fed90000 ver 4:0 cap 1c0000c40660462 ecap 29a00f0505e
[ 0.106903] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[ 0.106909] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008c40660462 ecap f050da
[ 0.106912] DMAR: RMRR base: 0x0000004c000000 end: 0x000000503fffff
[ 0.106916] DMAR-IR: IOAPIC id 2 under DRHD base 0xfed91000 IOMMU 1
[ 0.106918] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[ 0.106920] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 0.108530] DMAR-IR: Enabled IRQ remapping in x2apic mode
[ 0.703905] pci 0000:00:02.0: DMAR: Skip IOMMU disabling for graphics
[ 0.742879] iommu: Default domain type: Translated
[ 0.742879] iommu: DMA domain TLB invalidation policy: lazy mode
[ 0.783570] DMAR: No ATSR found
[ 0.783571] DMAR: No SATC found
[ 0.783573] DMAR: IOMMU feature fl1gp_support inconsistent
[ 0.783574] DMAR: IOMMU feature pgsel_inv inconsistent
[ 0.783576] DMAR: IOMMU feature nwfs inconsistent
[ 0.783578] DMAR: IOMMU feature dit inconsistent
[ 0.783579] DMAR: IOMMU feature sc_support inconsistent
[ 0.783581] DMAR: IOMMU feature dev_iotlb_support inconsistent
[ 0.783583] DMAR: dmar0: Using Queued invalidation
[ 0.783588] DMAR: dmar1: Using Queued invalidation
[ 0.783715] pci 0000:00:02.0: Adding to iommu group 0
[ 0.784159] pci 0000:00:00.0: Adding to iommu group 1
[ 0.784170] pci 0000:00:04.0: Adding to iommu group 2
[ 0.784183] pci 0000:00:06.0: Adding to iommu group 3
[ 0.784196] pci 0000:00:0d.0: Adding to iommu group 4
[ 0.784211] pci 0000:00:14.0: Adding to iommu group 5
[ 0.784220] pci 0000:00:14.2: Adding to iommu group 5
[ 0.784233] pci 0000:00:16.0: Adding to iommu group 6
[ 0.784243] pci 0000:00:17.0: Adding to iommu group 7
[ 0.784258] pci 0000:00:1c.0: Adding to iommu group 8
[ 0.784274] pci 0000:00:1c.5: Adding to iommu group 9
[ 0.784288] pci 0000:00:1c.6: Adding to iommu group 10
[ 0.784304] pci 0000:00:1c.7: Adding to iommu group 11
[ 0.784327] pci 0000:00:1f.0: Adding to iommu group 12
[ 0.784336] pci 0000:00:1f.3: Adding to iommu group 12
[ 0.784346] pci 0000:00:1f.4: Adding to iommu group 12
[ 0.784356] pci 0000:00:1f.5: Adding to iommu group 12
[ 0.784368] pci 0000:01:00.0: Adding to iommu group 13
[ 0.784383] pci 0000:02:00.0: Adding to iommu group 14
[ 0.784398] pci 0000:03:00.0: Adding to iommu group 15
[ 0.784411] pci 0000:04:00.0: Adding to iommu group 16
[ 0.784427] pci 0000:05:00.0: Adding to iommu group 17
[ 0.785218] DMAR: Intel(R) Virtualization Technology for Directed I/O


Env:
PVE: pve-manager/7.3-4/d69b70d4
Kernel: 6.1-pve (opt-in kernel)
CPU: Intel(R) Pentium(R) Gold 7505
Ethernet: 4 x Intel I225-V
Drive: SanDisk Ultra 3D NVMe 1TB


Full log on kernel 5.15: https://pastebin.ubuntu.com/p/s8qQ3jNtqv/
 
Check your IOMMU groups with `for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU group %s ' "$n"; lspci -nns "${d##*/}"; done` (without using pcie_acs_override) to see if those devices might interfere with each other.
 
Thanks for your instruction!

Executed immediately after system boot
Code:
root@7505-Router:~# for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU group %s ' "$n"; lspci -nns "${d##*/}"; done
IOMMU group 0 00:02.0 VGA compatible controller [0300]: Intel Corporation Device [8086:9a78] (rev 01)
IOMMU group 10 00:1c.6 PCI bridge [0604]: Intel Corporation Device [8086:a0be] (rev 20)
IOMMU group 11 00:1c.7 PCI bridge [0604]: Intel Corporation Tiger Lake-LP PCI Express Root Port #8 [8086:a0bf] (rev 20)
IOMMU group 12 00:1f.0 ISA bridge [0601]: Intel Corporation Tiger Lake-LP LPC Controller [8086:a082] (rev 20)
IOMMU group 12 00:1f.3 Audio device [0403]: Intel Corporation Tiger Lake-LP Smart Sound Technology Audio Controller [8086:a0c8] (rev 20)
IOMMU group 12 00:1f.4 SMBus [0c05]: Intel Corporation Tiger Lake-LP SMBus Controller [8086:a0a3] (rev 20)
IOMMU group 12 00:1f.5 Serial bus controller [0c80]: Intel Corporation Tiger Lake-LP SPI Controller [8086:a0a4] (rev 20)
IOMMU group 13 01:00.0 Non-Volatile memory controller [0108]: Sandisk Corp Device [15b7:501a]
IOMMU group 14 02:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller I225-V [8086:15f3] (rev 03)
IOMMU group 15 03:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller I225-V [8086:15f3] (rev 03)
IOMMU group 16 04:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller I225-V [8086:15f3] (rev 03)
IOMMU group 17 05:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller I225-V [8086:15f3] (rev 03)
IOMMU group 1 00:00.0 Host bridge [0600]: Intel Corporation Device [8086:9a04] (rev 01)
IOMMU group 2 00:04.0 Signal processing controller [1180]: Intel Corporation Device [8086:9a03] (rev 01)
IOMMU group 3 00:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core Processor PCIe Controller [8086:9a09] (rev 01)
IOMMU group 4 00:0d.0 USB controller [0c03]: Intel Corporation Tiger Lake-LP Thunderbolt 4 USB Controller [8086:9a13] (rev 01)
IOMMU group 5 00:14.0 USB controller [0c03]: Intel Corporation Tiger Lake-LP USB 3.2 Gen 2x1 xHCI Host Controller [8086:a0ed] (rev 20)
IOMMU group 5 00:14.2 RAM memory [0500]: Intel Corporation Tiger Lake-LP Shared SRAM [8086:a0ef] (rev 20)
IOMMU group 6 00:16.0 Communication controller [0780]: Intel Corporation Tiger Lake-LP Management Engine Interface [8086:a0e0] (rev 20)
IOMMU group 7 00:17.0 SATA controller [0106]: Intel Corporation Device [8086:a0d3] (rev 20)
IOMMU group 8 00:1c.0 PCI bridge [0604]: Intel Corporation Device [8086:a0bc] (rev 20)
IOMMU group 9 00:1c.5 PCI bridge [0604]: Intel Corporation Tigerlake PCH-LP PCI Express Root Port #6 [8086:a0bd] (rev 20)
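
The listing above comes out in shell glob order (0, 10, 11, ..., 1, 2), which makes it harder to scan. A minimal sketch of the same loop that sorts by group number might look like this; the function name `list_iommu_groups` and its optional root argument are my own additions for illustration, not part of any tool.

```shell
# List "group address" pairs in numeric group order.
# $1 is the IOMMU groups root; it defaults to the real sysfs path and is
# only a parameter so the function can be tried against a test directory.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for d in "$root"/*/devices/*; do
        [ -e "$d" ] || continue          # glob matched nothing
        group="${d#"$root"/}"            # strip the root prefix
        group="${group%%/*}"             # keep only the group number
        printf '%s %s\n' "$group" "${d##*/}"
    done | sort -k1,1n -k2,2
}
```

To get the same descriptions as before, the pairs can be fed to `lspci`, for example `list_iommu_groups | while read -r g a; do printf 'IOMMU group %s ' "$g"; lspci -nns "$a"; done`.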

After trying to start the VM:
Code:
root@7505-Router:~# for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU group %s ' "$n"; lspci -nns "${d##*/}"; done
IOMMU group 0 00:02.0 VGA compatible controller [0300]: Intel Corporation Device [8086:9a78] (rev 01)
IOMMU group 10 00:1c.6 PCI bridge [0604]: Intel Corporation Device [8086:a0be] (rev 20)
IOMMU group 11 00:1c.7 PCI bridge [0604]: Intel Corporation Tiger Lake-LP PCI Express Root Port #8 [8086:a0bf] (rev 20)
IOMMU group 12 00:1f.0 ISA bridge [0601]: Intel Corporation Tiger Lake-LP LPC Controller [8086:a082] (rev 20)
IOMMU group 12 00:1f.3 Audio device [0403]: Intel Corporation Tiger Lake-LP Smart Sound Technology Audio Controller [8086:a0c8] (rev 20)
IOMMU group 12 00:1f.4 SMBus [0c05]: Intel Corporation Tiger Lake-LP SMBus Controller [8086:a0a3] (rev 20)
IOMMU group 12 00:1f.5 Serial bus controller [0c80]: Intel Corporation Tiger Lake-LP SPI Controller [8086:a0a4] (rev 20)
IOMMU group 13 01:00.0 Non-Volatile memory controller [0108]: Sandisk Corp Device [15b7:501a]
IOMMU group 14 02:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller I225-V [8086:15f3] (rev ff)
IOMMU group 15 03:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller I225-V [8086:15f3] (rev 03)
IOMMU group 16 04:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller I225-V [8086:15f3] (rev 03)
IOMMU group 17 05:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller I225-V [8086:15f3] (rev 03)
IOMMU group 1 00:00.0 Host bridge [0600]: Intel Corporation Device [8086:9a04] (rev 01)
IOMMU group 2 00:04.0 Signal processing controller [1180]: Intel Corporation Device [8086:9a03] (rev 01)
IOMMU group 3 00:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core Processor PCIe Controller [8086:9a09] (rev 01)
IOMMU group 4 00:0d.0 USB controller [0c03]: Intel Corporation Tiger Lake-LP Thunderbolt 4 USB Controller [8086:9a13] (rev 01)
IOMMU group 5 00:14.0 USB controller [0c03]: Intel Corporation Tiger Lake-LP USB 3.2 Gen 2x1 xHCI Host Controller [8086:a0ed] (rev 20)
IOMMU group 5 00:14.2 RAM memory [0500]: Intel Corporation Tiger Lake-LP Shared SRAM [8086:a0ef] (rev 20)
IOMMU group 6 00:16.0 Communication controller [0780]: Intel Corporation Tiger Lake-LP Management Engine Interface [8086:a0e0] (rev 20)
IOMMU group 7 00:17.0 SATA controller [0106]: Intel Corporation Device [8086:a0d3] (rev 20)
IOMMU group 8 00:1c.0 PCI bridge [0604]: Intel Corporation Device [8086:a0bc] (rev 20)
IOMMU group 9 00:1c.5 PCI bridge [0604]: Intel Corporation Tigerlake PCH-LP PCI Express Root Port #6 [8086:a0bd] (rev 20)
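
Comparing two listings of this length by eye is error-prone. Assuming each listing has been saved to a file first (the file names and the function name `rev_changes` are hypothetical), a rough sketch that prints only the devices whose `(rev xx)` field changed could be:

```shell
# Print devices whose "(rev xx)" value differs between two saved listings.
# $1 = listing taken before the VM start, $2 = listing taken after.
rev_changes() {
    awk '
        # Return the hex value inside "(rev xx)", or "-" if the line has none.
        function rev(line) {
            if (match(line, /\(rev [0-9a-f]+\)/))
                return substr(line, RSTART + 5, RLENGTH - 6)
            return "-"
        }
        # First file: remember the revision per device address (4th field).
        NR == FNR { old[$4] = rev($0); next }
        # Second file: report addresses whose revision is now different.
        ($4 in old) && old[$4] != rev($0) {
            printf "%s: rev %s -> %s\n", $4, old[$4], rev($0)
        }
    ' "$1" "$2"
}
```

Usage would be along the lines of `rev_changes before.txt after.txt`. A revision of `ff` generally means config space reads are returning all ones, i.e. the device is no longer responding.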

I noticed this line: `IOMMU group 14 02:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller I225-V [8086:15f3] (rev ff)`
The revision changed from `03` to `ff`, which matches what the syslog shows:
```
[ 130.633121] pcieport 0000:00:1c.0: Data Link Layer Link Active not set in 1000 msec
[ 130.633142] vfio-pci 0000:02:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ 130.694621] vfio-pci 0000:02:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ 131.417204] vfio-pci 0000:02:00.0: timed out waiting for pending transaction; performing function level reset anyway
[ 132.665330] vfio-pci 0000:02:00.0: not ready 1023ms after FLR; waiting
[ 133.721446] vfio-pci 0000:02:00.0: not ready 2047ms after FLR; waiting
[ 135.801609] vfio-pci 0000:02:00.0: not ready 4095ms after FLR; waiting
[ 140.153541] vfio-pci 0000:02:00.0: not ready 8191ms after FLR; waiting
[ 148.601429] vfio-pci 0000:02:00.0: not ready 16383ms after FLR; waiting
[ 167.033177] vfio-pci 0000:02:00.0: not ready 32767ms after FLR; waiting
[ 201.848664] vfio-pci 0000:02:00.0: not ready 65535ms after FLR; giving up
[ 202.874609] vfio-pci 0000:02:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ 202.874912] igc 0000:03:00.0 enp3s0: PHC removed
[ 203.137251] igc 0000:04:00.0 enp4s0: PHC removed
```
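
For anyone comparing logs across kernels, a small filter that counts the D3cold failures per device may help. This is only a sketch: the function name `d3cold_failures` is made up, and it assumes the message format shown in the excerpt above.

```shell
# Read kernel log lines on stdin and print "<pci address> <count>" for every
# device that logged "Unable to change power state from D3cold to D0".
d3cold_failures() {
    awk '
        /Unable to change power state from D3cold to D0/ {
            for (i = 1; i <= NF; i++) {
                # The address field looks like "0000:02:00.0:" in the log.
                if ($i ~ /^[0-9a-f]+:[0-9a-f]+:[0-9a-f]+\.[0-7]:$/) {
                    dev = $i
                    sub(/:$/, "", dev)   # drop the trailing colon
                    count[dev]++
                    break
                }
            }
        }
        END { for (dev in count) printf "%s %d\n", dev, count[dev] }
    ' | sort
}
```

It can be fed from `dmesg | d3cold_failures` or `journalctl -k -b | d3cold_failures`.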
 
What confuses me the most is that as long as I don't use eth0, passthrough of the other Ethernet ports works fine and OpenWRT boots.
[Attachment: 1675349023328.png]

Current VM hardware profile (this one boots):
[Attachment: 1675349327440.png]

PS:
This is a clean build: I just updated the kernel to v6.1, set `iommu=on iommu=pt` in /etc/default/grub and /etc/kernel/cmdline, and set up the /etc/modules file correctly.
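
Since two files are involved, it can be worth confirming which parameters the running kernel actually received. A minimal sketch, assuming the parameters of interest are `intel_iommu=on` (as seen in the boot log) and `iommu=pt` (the function names are hypothetical):

```shell
# Succeed if the exact parameter $2 appears as a whole word in command line $1.
has_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Report whether each expected IOMMU parameter is present in command line $1.
report_iommu_params() {
    for p in intel_iommu=on iommu=pt; do
        if has_param "$1" "$p"; then
            echo "$p: present"
        else
            echo "$p: missing"
        fi
    done
}
```

On a live system this would be called as `report_iommu_params "$(cat /proc/cmdline)"`. Given the command line from the dmesg output earlier in this thread, it would report `iommu=pt` as missing, because that log shows `iommu=qe`.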

Current network info:
[Attachment: 1675349092424.png]

System info:
[Attachment: 1675349255716.png]
 
Must be some sort of kernel bug. I'm having the exact same issue on an i5-1145G7. Maybe it's an 11th-gen CPU bug.
The NIC at 0000:02 shows rev ff. Remove it from the VM and it boots fine.

My solution for now is to stay on 5.15, where everything works. A more permanent solution would be nice, though.

I've tried all current 6.2 kernel versions with no success. 5.15.104-1 works as expected.

One thing to try is swapping interfaces: put Proxmox management on eth0 and WAN on eth3. However, that would probably mess up my pfSense VM.
 
Hi. Any success?

I'm also trying to pass through my Intel I225 to my OpenWRT VM. IOMMU is set up and I'm able to pass through a GPU, for example. However, if I do the same with any of the Ethernet ports, the VM doesn't start and I get `Unable to change power state from D3cold to D0, device inaccessible` in `journalctl -b`.
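
Nobody in this thread has confirmed a fix, but since the error is about PCI power states, it may be useful to post the device's power-management settings alongside the logs. The sketch below only reads standard sysfs attributes (`d3cold_allowed`, `power/control`, `power/runtime_status`); the function name `show_pm_state` and its second argument are assumptions made for illustration.

```shell
# Print the power-management attributes of one PCI device.
# $1 = PCI address, e.g. 0000:02:00.0
# $2 = sysfs devices directory (optional; only meant for dry runs)
show_pm_state() {
    base="${2:-/sys/bus/pci/devices}/$1"
    for f in d3cold_allowed power/control power/runtime_status; do
        if [ -r "$base/$f" ]; then
            printf '%s=%s\n' "$f" "$(cat "$base/$f")"
        else
            printf '%s=unavailable\n' "$f"
        fi
    done
}
```

For example, `show_pm_state 0000:02:00.0` before and after a failed VM start would show whether the values differ between the port that fails and the ports that work.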
 
