After updating my PVE host to a kernel newer than 6.8.12-13-pve, PCIe passthrough of an M.2 Google Coral TPU fails: the VM no longer starts and reports the following error message:
Reverting to kernel 6.8.12-13-pve solves the issue, but since this kernel is not part of the PVE Debian Trixie sources and had to be installed manually, I do not think this is a long-term solution.
When the start of the VM fails, I get the following errors in dmesg:
I suspect that the "Unable to change power state from D0 to D3hot, device inaccessible" error is the root cause, but I have not been able to find a solution to the problem.
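To narrow down the power-state suspicion, the device's power-management attributes can be read from sysfs on both the working and the failing kernel and then compared. The following is only a minimal diagnostic sketch, not a fix: `show_pci_power` and the `SYSFS_ROOT` override are hypothetical names introduced for illustration, and it assumes the running kernel exposes the `power_state`, `d3cold_allowed`, `power/control` and `power/runtime_status` attributes for the device (missing ones are reported as unavailable).

```shell
#!/bin/sh
# Print the power-management attributes of one PCI device from sysfs.
# SYSFS_ROOT is a hypothetical override (default: /sys) so the function
# can be tried against a fake directory tree.
show_pci_power() {
    dev="$1"
    base="${SYSFS_ROOT:-/sys}/bus/pci/devices/$dev"
    for attr in power_state d3cold_allowed power/control power/runtime_status; do
        if [ -r "$base/$attr" ]; then
            # Attribute exists on this kernel: print its current value.
            printf '%s=%s\n' "$attr" "$(cat "$base/$attr")"
        else
            # Attribute missing or unreadable on this kernel.
            printf '%s=(unavailable)\n' "$attr"
        fi
    done
}
```

Calling `show_pci_power 0000:0b:00.0` before and after a failed VM start on each kernel would show whether the reported state differs between them.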
Any other passed through PCIe devices (HBA, NIC and GPU) continue to work as expected.
Code:
kvm: -device vfio-pci,host=0000:00:02.0,id=hostpci1,bus=pci.0,addr=0x11:
info: OpRegion detected on Intel display 4680.
kvm: -device vfio-pci,host=0000:0b:00.0,id=hostpci2,bus=pci.0,addr=0x1b:
vfio 0000:0b:00.0: error getting device from group 27: No such device
Verify all devices in group 27 are bound to vfio-<bus> or pci-stub and not already in use
TASK ERROR: start failed: QEMU exited with code 1
The device is bound to the vfio-pci driver and is the only device in its IOMMU group.
Code:
root@pve:~# lspci -k -s 0000:0b:00.0
0b:00.0 System peripheral: Global Unichip Corp. Coral Edge TPU
Subsystem: Global Unichip Corp. Coral Edge TPU
Kernel driver in use: vfio-pci
root@pve:~# ls /sys/kernel/iommu_groups/27/devices/
0000:0b:00.0
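The same check can be done in one step by listing every device in the IOMMU group together with the driver it is bound to. This is a minimal sketch; `list_iommu_group` and the `SYSFS_ROOT` override are hypothetical names used for illustration, and it assumes the usual sysfs layout where each device directory has a `driver` symlink when a driver is bound.

```shell
#!/bin/sh
# List each device in an IOMMU group with its bound kernel driver.
# SYSFS_ROOT is a hypothetical override (default: /sys) so the function
# can be tried against a fake directory tree.
list_iommu_group() {
    group="$1"
    root="${SYSFS_ROOT:-/sys}"
    for dev in "$root/kernel/iommu_groups/$group/devices/"*; do
        # Skip the unexpanded glob if the group is empty or missing.
        [ -e "$dev" ] || continue
        addr=$(basename "$dev")
        if [ -e "$dev/driver" ]; then
            # The driver symlink points at the driver's directory;
            # its last path component is the driver name.
            drv=$(basename "$(readlink "$dev/driver")")
        else
            drv="(none)"
        fi
        printf '%s %s\n' "$addr" "$drv"
    done
}
```

For the setup described here, `list_iommu_group 27` should print a single line for `0000:0b:00.0` if the device really is alone in its group.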
Code:
root@pve:~# dmesg | grep -i vfio | tail -20 && dmesg | grep -i "0b:00" | tail -20
[ 292.193327] vfio-pci 0000:00:02.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0x0000
[ 292.217930] vfio-pci 0000:0b:00.0: enabling device (0000 -> 0002)
[ 292.217976] vfio-pci 0000:0b:00.0: resetting
[ 292.359924] vfio-pci 0000:0b:00.0: reset done
[ 385.470037] vfio-pci 0000:01:00.0: resetting
[ 385.575009] vfio-pci 0000:01:00.0: reset done
[ 385.587580] vfio-pci 0000:00:02.0: resetting
[ 385.694988] vfio-pci 0000:00:02.0: reset done
[ 385.719237] vfio-pci 0000:0b:00.0: resetting
[ 385.923005] vfio-pci 0000:0b:00.0: reset done
[ 385.923063] vfio-pci 0000:0b:00.0: Unable to change power state from D0 to D3hot, device inaccessible
[ 387.193323] vfio-pci 0000:01:00.0: resetting
[ 387.294999] vfio-pci 0000:01:00.0: reset done
[ 387.295042] vfio-pci 0000:01:00.0: Masking broken INTx support
[ 387.383824] vfio-pci 0000:00:02.0: resetting
[ 387.486950] vfio-pci 0000:00:02.0: reset done
[ 387.487459] vfio-pci 0000:00:02.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0x0000
[ 387.512402] vfio-pci 0000:0b:00.0: enabling device (0000 -> 0002)
[ 387.512453] vfio-pci 0000:0b:00.0: resetting
[ 387.655064] vfio-pci 0000:0b:00.0: reset done
[ 292.489814] pci 0000:0b:00.0: BAR 2 [mem 0x6003a00000-0x6003afffff 64bit pref]: assigned
[ 292.489841] pci 0000:0b:00.0: BAR 0 [mem 0x6003b00000-0x6003b03fff 64bit pref]: assigned
[ 385.719237] vfio-pci 0000:0b:00.0: resetting
[ 385.923005] vfio-pci 0000:0b:00.0: reset done
[ 385.923063] vfio-pci 0000:0b:00.0: Unable to change power state from D0 to D3hot, device inaccessible
[ 386.047944] pci 0000:0b:00.0: [1ac1:089a] type 00 class 0x0000ff PCIe Endpoint
[ 386.048028] pci 0000:0b:00.0: BAR 0 [mem 0x00000000-0x00003fff 64bit pref]
[ 386.048034] pci 0000:0b:00.0: BAR 2 [mem 0x00000000-0x000fffff 64bit pref]
[ 386.048472] pci 0000:0b:00.0: Adding to iommu group 27
[ 386.049898] pci 0000:0b:00.0: BAR 2 [mem 0x6003a00000-0x6003afffff 64bit pref]: assigned
[ 386.049927] pci 0000:0b:00.0: BAR 0 [mem 0x6003b00000-0x6003b03fff 64bit pref]: assigned
[ 387.512402] vfio-pci 0000:0b:00.0: enabling device (0000 -> 0002)
[ 387.512453] vfio-pci 0000:0b:00.0: resetting
[ 387.655064] vfio-pci 0000:0b:00.0: reset done
[ 387.782980] pci 0000:0b:00.0: [1ac1:089a] type 00 class 0x0000ff PCIe Endpoint
[ 387.783049] pci 0000:0b:00.0: BAR 0 [mem 0x6003b00000-0x6003b03fff 64bit pref]
[ 387.783055] pci 0000:0b:00.0: BAR 2 [mem 0x6003a00000-0x6003afffff 64bit pref]
[ 387.783442] pci 0000:0b:00.0: Adding to iommu group 27
[ 387.784910] pci 0000:0b:00.0: BAR 2 [mem 0x6003a00000-0x6003afffff 64bit pref]: assigned
[ 387.784943] pci 0000:0b:00.0: BAR 0 [mem 0x6003b00000-0x6003b03fff 64bit pref]: assigned
root@pve:~#