[SOLVED] Intel X710 PCIe passthrough issues - success stories (Linux/BSD)?

Jul 11, 2023
Hi,

I'm running the latest Proxmox with all updates in a cluster of 3 machines based on Supermicro/AMD Epyc.

Regarding my problem:
I'm passing an X710 (10 Gbit) PCIe NIC together with an Intel i350 (1 Gbit) through to an OpenBSD system acting
as the main router. I don't have any issues with the i350, but the X710 always stops working properly after a few
hours. This ranges from 1.5 h to 9 h, never more. The X710 is running the latest firmware from August 2023.

Without going further into the details of OpenBSD, I'd like to know if anyone is successfully using this card in a
passthrough environment on Linux/BSD systems.

I'm also asking because I see the following "vfio-pci" messages when the OpenBSD VM is started, and I don't
know whether they point directly at the issue:

Code:
[  156.248518] vfio-pci 0000:01:00.0: Masking broken INTx support
[  156.368076] vfio-pci 0000:01:00.1: Masking broken INTx support
[  158.176749] vfio-pci 0000:01:00.0: vfio_bar_restore: reset recovery - restoring BARs
[  158.432234] vfio-pci 0000:01:00.1: vfio_bar_restore: reset recovery - restoring BARs

"Masking broken INTx support" should be harmless according to kernel.org. For "vfio_bar_restore" you find
almost nothing, and what there is mostly concerns graphics cards rather than NICs.

I've already tried all kinds of BIOS options (e.g. Above 4G Decoding on/off, IOMMU, ACS, AER...), swapped the
cards' slots on the mainboard, and bound the adapter early to vfio-pci instead of i40e via initramfs. None of this
resolved the problem, so I'm looking for opinions and success stories from people using this card.

Perhaps there are also special kernel options needed on Proxmox to prevent the card from going into S3 mode, or
something like that.

Thanks, Mark
 
Since I got a private mail asking how (and whether) the issue was fixed, I'd like to write it up and close the thread as "solved".

Researching the issue showed that the OpenBSD kernel had a wrong setting for the number of segments it can use in a single transaction, more precisely in src/sys/dev/pci/if_ixl.c.

Long story short, defining "IXL_TX_PKT_DESCS 8" (instead of 32) in commit r1.94 solved the problems. Since then (Dec 2023) everything has been running fine.
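
For anyone wondering why a per-packet segment limit can matter, here is a minimal sketch. It is not the actual if_ixl.c code. It assumes, as the fix implies, that the card copes with at most 8 segments (descriptors) per packet. Only the macro name and the values 8 and 32 come from the driver; "struct seg", "count_segs" and "needs_defrag" are hypothetical names made up for illustration.

```c
#include <stddef.h>

/* Value from the fix described above; the driver previously used 32. */
#define IXL_TX_PKT_DESCS 8

/* Hypothetical stand-in for one buffer in a packet's buffer chain. */
struct seg {
	size_t len;
	struct seg *next;
};

/* Count the buffers (one TX descriptor each) a packet would occupy. */
static size_t
count_segs(const struct seg *s)
{
	size_t n = 0;

	for (; s != NULL; s = s->next)
		n++;
	return n;
}

/*
 * Return 1 if the chain exceeds the per-packet limit and would have to
 * be coalesced into fewer buffers before being handed to the NIC,
 * 0 otherwise.
 */
static int
needs_defrag(const struct seg *s, size_t max_descs)
{
	return count_segs(s) > max_descs;
}
```

In this sketch, a packet spread over 12 buffers passes unchanged with a limit of 32 but gets flagged for coalescing with a limit of 8. That would fit the symptom of the card only failing after some hours, once an unlucky packet comes along, but that part is my assumption.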
 