PCI passthrough failure - kvm: vfio: Cannot reset device 0000:00:1f.6, no available reset mechanism.

cmonty14

Hello,
in my PVE host there are 2 NICs:
1 onboard Intel I219-LM
1 PCI Intel I350 quad port

I want to pass through the Intel I219-LM NIC, but when I start the relevant VM I get this error and the host reboots:
$ sudo qm start 100
kvm: vfio: Cannot reset device 0000:00:1f.6, no available reset mechanism.

My assumption is that this is related to the IOMMU groups: several devices, including the I219-LM NIC, belong to the same IOMMU group 10.
Code:
locadmin@maggie:~
$ find /sys/kernel/iommu_groups -type l | sort -t '/' -n -k 5
/sys/kernel/iommu_groups/0/devices/0000:00:00.0
/sys/kernel/iommu_groups/1/devices/0000:00:01.0
/sys/kernel/iommu_groups/2/devices/0000:00:02.0
/sys/kernel/iommu_groups/3/devices/0000:00:12.0
/sys/kernel/iommu_groups/4/devices/0000:00:14.0
/sys/kernel/iommu_groups/4/devices/0000:00:14.2
/sys/kernel/iommu_groups/5/devices/0000:00:16.0
/sys/kernel/iommu_groups/6/devices/0000:00:17.0
/sys/kernel/iommu_groups/7/devices/0000:00:1c.0
/sys/kernel/iommu_groups/8/devices/0000:00:1c.5
/sys/kernel/iommu_groups/9/devices/0000:00:1d.0
/sys/kernel/iommu_groups/10/devices/0000:00:1f.0
/sys/kernel/iommu_groups/10/devices/0000:00:1f.4
/sys/kernel/iommu_groups/10/devices/0000:00:1f.5
/sys/kernel/iommu_groups/10/devices/0000:00:1f.6
/sys/kernel/iommu_groups/11/devices/0000:03:00.0
/sys/kernel/iommu_groups/12/devices/0000:04:00.0
/sys/kernel/iommu_groups/13/devices/0000:04:00.1
/sys/kernel/iommu_groups/14/devices/0000:04:00.2
/sys/kernel/iommu_groups/15/devices/0000:04:00.3

locadmin@maggie:~
$ lspci
00:00.0 Host bridge: Intel Corporation 8th Gen Core 4-core Desktop Processor Host Bridge/DRAM Registers [Coffee Lake S] (rev 08)
00:01.0 PCI bridge: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) (rev 08)
00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 [UHD Graphics 630]
00:12.0 Signal processing controller: Intel Corporation Cannon Lake PCH Thermal Controller (rev 10)
00:14.0 USB controller: Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller (rev 10)
00:14.2 RAM memory: Intel Corporation Cannon Lake PCH Shared SRAM (rev 10)
00:16.0 Communication controller: Intel Corporation Cannon Lake PCH HECI Controller (rev 10)
00:17.0 SATA controller: Intel Corporation Cannon Lake PCH SATA AHCI Controller (rev 10)
00:1c.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #1 (rev f0)
00:1c.5 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #6 (rev f0)
00:1d.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #9 (rev f0)
00:1f.0 ISA bridge: Intel Corporation Cannon Point-LP LPC Controller (rev 10)
00:1f.4 SMBus: Intel Corporation Cannon Lake PCH SMBus Controller (rev 10)
00:1f.5 Serial bus controller [0c80]: Intel Corporation Cannon Lake PCH SPI Controller (rev 10)
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (7) I219-LM (rev 10)
03:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9215 PCIe 2.0 x1 4-port SATA 6 Gb/s Controller (rev 11)
04:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
04:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
04:00.2 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
04:00.3 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)

If I try to pass through a port of the other NIC, the Intel I350, there are no issues when starting the VM.

Can you please advise how to set up the Intel I219-LM NIC with a dedicated IOMMU group?

THX
 
It is not because of the group. You can use lspci -vv to see that it does not support Function Level Reset because it reports FLReset- (instead of FLReset+).
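A minimal sketch of that check, assuming lspci is run as root so the capability lines are visible. The function name flr_status is made up for illustration; it only looks for the first FLReset flag, which lspci prints on the DevCap line.

```shell
#!/bin/sh
# Reads `lspci -vv` output for a single device on stdin and reports
# whether the first FLReset flag (the DevCap one) is set.
# flr_status is a hypothetical helper name, not part of pciutils.
flr_status() {
    flag=$(grep -o 'FLReset[+-]' | head -n 1)
    case "$flag" in
        FLReset+) echo "FLR supported" ;;
        FLReset-) echo "FLR not supported" ;;
        *)        echo "FLReset flag not found" ;;
    esac
}

# Typical use on the host, as root:
#   lspci -vv -s 00:1f.6 | flr_status
```

If the flag is not found at all, the device may not expose a PCIe capability, or lspci was not allowed to read it.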

But maybe it still works fine with passthrough? I have seen USB controllers without reset work fine in a VM. Have you tried it, and were there any (other) issues or errors in the logs? Maybe make sure the host does not touch the device by adding vfio-pci.ids=MANU:DEVI to your kernel parameters (or to the configuration files in /etc/modprobe.d/), where MANU:DEVI is the ID of the device that is shown when you use lspci -n.
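As a rough sketch of the modprobe.d variant: the helper below (vfio_ids_line, a name invented here) turns the lspci -n line of one device into an options line for vfio-pci. It assumes the usual lspci -n layout of slot, class, then vendor:device, and prints nothing if the third field does not look like an ID.

```shell
#!/bin/sh
# Converts one line of `lspci -n` output into a modprobe.d options line.
# Assumed input layout: <slot> <class>: <vendor>:<device> (rev ..)
# vfio_ids_line is a hypothetical helper name.
vfio_ids_line() {
    awk '$3 ~ /^[0-9a-f]+:[0-9a-f]+$/ { print "options vfio-pci ids=" $3 }'
}

# Typical use on the host, as root (review the file before rebooting):
#   lspci -n -s 00:1f.6 | vfio_ids_line > /etc/modprobe.d/vfio.conf
#   update-initramfs -u
```

The kernel parameter form carries the same ID, written as vfio-pci.ids=MANU:DEVI on the kernel command line.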

If you need to move the device to another group, you need to put it in another PCIe slot, but that is not possible with an onboard device. The motherboard BIOS determines the IOMMU groups and which devices are securely isolated. Two of the other devices in the group are bridges, so those are not a problem. The host will lose access to the SMBus, but that might not be a problem.

Passthrough of one of the four ports of your I350 network device should work fine, as each port is in a separate IOMMU group. Any errors reported in journalctl? Did you use the pcie_acs_override to break groups (which invalidates the groups that you showed us)?
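Both questions can be checked from a shell. The sketch below looks for a pcie_acs_override setting on the kernel command line; acs_override_setting is a made-up helper name, and the optional file argument exists only so the function can be tried against a saved copy of /proc/cmdline.

```shell
#!/bin/sh
# Prints the pcie_acs_override setting found in a kernel command line
# file (default: /proc/cmdline), or "none" if it is not set.
# acs_override_setting is a hypothetical helper name.
acs_override_setting() {
    cmdline_file=${1:-/proc/cmdline}
    setting=$(tr ' ' '\n' < "$cmdline_file" | grep '^pcie_acs_override=' | head -n 1)
    echo "${setting:-none}"
}

# Typical use on the host:
#   acs_override_setting
# Errors from the boot before the unexpected reboot, assuming the
# journal is kept across reboots:
#   journalctl -b -1 -p err
```

If the function prints anything other than "none", the groups listed under /sys/kernel/iommu_groups no longer reflect the real hardware isolation.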
 
There's no issue with the Intel I350 NIC, as stated in my initial posting.

Can you please advise how to proceed after creating the file /etc/modprobe.d/vfio.conf and rebuilding the kernel?
If I try to start the VM, the server behaves as before: error message, then reboot.
I have no chance to check why the server reboots, but I assume there must be a severe issue with passing through this NIC.
 
You don't need to rebuild the kernel; an update-initramfs -u should be enough.
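To see whether the change took effect after update-initramfs -u and a reboot, one option is to check which driver owns the device. This is only a sketch: bound_driver is an invented helper name, and it relies on the "Kernel driver in use:" line that lspci -k prints.

```shell
#!/bin/sh
# Reads `lspci -nnk` output for a single device on stdin and prints
# the driver named on its "Kernel driver in use:" line, if any.
# bound_driver is a hypothetical helper name.
bound_driver() {
    sed -n 's/^[[:space:]]*Kernel driver in use: //p' | head -n 1
}

# Typical use on the host:
#   lspci -nnk -s 00:1f.6 | bound_driver
```

If this prints vfio-pci, the host driver no longer claims the NIC. Empty output means no driver is bound at all.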

The message about no reset is not the cause of the reboot. But since your system responds by violently rebooting, I don't think passing through the onboard NIC will lead to success, even with overrides or binding to the vfio-pci driver early during boot. Maybe just use that NIC for management via the web GUI?
It is better to pass through one of the 4 sub-devices/ports of the I350, which are already in isolated IOMMU groups. Although I'm a bit unclear now on whether you do or don't have issues with passthrough of (parts of) the I350.
Or add a virtual network bridge to connect a host network device to a VirtIO NIC of the VM; that will always work.
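For the bridge option, a small dry-run sketch: it only prints the qm command so it can be reviewed first. The helper name virtio_nic_cmd is invented, and vmbr0 is an assumption (the usual default bridge name on a PVE host); the bridge must already exist and be attached to the host NIC.

```shell
#!/bin/sh
# Prints (does not run) a qm command that gives a VM a VirtIO NIC on a
# host bridge. virtio_nic_cmd is a hypothetical helper name; vmbr0 is
# an assumed default bridge name.
virtio_nic_cmd() {
    vmid=$1
    bridge=${2:-vmbr0}
    echo "qm set $vmid --net0 virtio,bridge=$bridge"
}

virtio_nic_cmd 100    # prints: qm set 100 --net0 virtio,bridge=vmbr0
```

Note that running the printed command replaces whatever net0 is currently configured for that VM.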
 