[SOLVED] Mellanox NIC passthrough doesn't work regardless of guest OS (interrupts problem?)

cmrt

New Member
Feb 11, 2022
Hi all,

you're my last hope: I have a few HP-branded Mellanox NICs (two ConnectX-2 and this ConnectX-3). For the past two weeks I've been trying to make passthrough work, to no avail. I mainly want this for pfSense, but it doesn't work at all.

The NICs are all recognized by the host OS, but as soon as I try to pass them through to a guest (any guest) I end up with kernel panics.

The system is a Ryzen 5700G, 64GB RAM running on an AsRock "Fatal1ty B450 Gaming-ITX/ac" (this one, not the newer K4).

I made sure to go through https://pve.proxmox.com/wiki/Pci_passthrough#PCI_EXPRESS_PASSTHROUGH and check:
- that IOMMU is enabled, both in BIOS and GRUB (amd_iommu=on and both with and without iommu=pt)
- that all the correct modules are loaded (they're all in /etc/modules and verified loaded via lsmod; see the snippet after this list)
- that IOMMU Interrupt Remapping is present
Bash:
# dmesg | grep -i remapping
[    0.445373] AMD-Vi: Interrupt remapping enabled
- that IOMMU Isolation is on (see bottom of the post)
- that the card is detected properly (it works on the host, so...)
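
For reference, here's roughly what the relevant configuration looks like, as a sketch following the wiki page above (the rest of your GRUB_CMDLINE_LINUX_DEFAULT will of course differ per system):
Code:
# /etc/default/grub - kernel command line (run update-grub after editing)
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"

# /etc/modules - VFIO modules loaded at boot
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd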

pfSense panics identically with both ConnectX-2 and -3, saying
Code:
mlx4_core0: command 0x23 timed out (go bit not cleared)
mlx4_core0: device is going to be reset
mlx4_core0: device was reset successfully
mlx4_core0: Failed to initialize queue pair table, aborting

Fatal trap 12: page fault while in kernel mode
[rest of the stack trace]

The same happens on a live FreeBSD as soon as I load the Mellanox mlx4en kernel module. Debian VMs give similar errors that scroll by too fast for me to screenshot, and in Windows 10, when I try to install the driver, the setup hangs and Event Viewer > Security is full of mlx4_bus errors saying
Code:
Native_6_16_0: Lost interrupt was detected, inserting DPC to process EQE.
 EQE found on EQ index: 4
 Number of ETH EQs: 4

I've done an ungodly amount of Googling and trial and error for this (it doesn't help that this machine is my main router, meaning I have no network and have to either tether or use my phone). I initially thought it was a FreeBSD/pfSense error, which led me down the rabbit hole of updating and reconfiguring the card firmware (spoiler: that didn't work; also, I only did this on one card, the others are still "normal"), but it doesn't seem to be a card problem.
In the meantime I also checked the cards' firmware versions, and they're all as recent as they can be.

I've tried all combinations of settings in the passthrough menu (PCIe yes/no, all functions and not, ROM BAR on/off) but I still get the same error every time.
I tried clearing CMOS, disabling various settings on the cards themselves (via the system BIOS and the card's boot-time configurator), changing machine type.
In the BIOS I turned Resizable BAR on and off, SR-IOV on and off, PCIE ARI on and off, disabled XMP and probably other things I forgot.
I disabled "HP Shared Memory" in the BIOS for the NIC (didn't even know that this was a thing)

This email on kernel.org's mailing list suggested this could be due to Message Signaled Interrupts (MSI), but they're enabled in my guests (AFAICT) and I believe they are in the pve-kernel too, or it wouldn't detect the card... Right?
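
(For anyone who wants to double-check the same thing on the host: the MSI/MSI-X state is visible in lspci, e.g.:)
Bash:
# check the MSI/MSI-X capability state of the card on the host
lspci -s 01:00.0 -vv | grep -i msi
# "MSI-X: Enable+ Count=128 Masked-" means MSI-X is enabled and in use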

This does seem to be some sort of problem with interrupts though (per Windows' error message), one that I've seen reported in these forums by another person (here), but with no solution.

Is there anyone with a similar setup that has had more luck in getting these cards to passthrough and can offer a solution? I'm happy to try new things that I haven't tried yet (but what I'm really hoping for is someone that says "oh yes I had this issue, it's this stupid flag here that you need to change").

Thanks!


IOMMU groups:
Bash:
# find /sys/kernel/iommu_groups/ -type l
/sys/kernel/iommu_groups/3/devices/0000:00:14.3
/sys/kernel/iommu_groups/3/devices/0000:00:14.0
/sys/kernel/iommu_groups/1/devices/0000:03:00.0
/sys/kernel/iommu_groups/1/devices/0000:09:00.0
/sys/kernel/iommu_groups/1/devices/0000:02:00.2
/sys/kernel/iommu_groups/1/devices/0000:02:00.0
/sys/kernel/iommu_groups/1/devices/0000:03:06.0
/sys/kernel/iommu_groups/1/devices/0000:08:00.0
/sys/kernel/iommu_groups/1/devices/0000:00:02.1
/sys/kernel/iommu_groups/1/devices/0000:0a:00.0
/sys/kernel/iommu_groups/1/devices/0000:03:05.0
/sys/kernel/iommu_groups/1/devices/0000:02:00.1
/sys/kernel/iommu_groups/1/devices/0000:03:01.0
/sys/kernel/iommu_groups/1/devices/0000:00:02.2
/sys/kernel/iommu_groups/1/devices/0000:03:04.0
/sys/kernel/iommu_groups/1/devices/0000:00:02.0
/sys/kernel/iommu_groups/1/devices/0000:03:07.0
/sys/kernel/iommu_groups/4/devices/0000:00:18.3
/sys/kernel/iommu_groups/4/devices/0000:00:18.1
/sys/kernel/iommu_groups/4/devices/0000:00:18.6
/sys/kernel/iommu_groups/4/devices/0000:00:18.4
/sys/kernel/iommu_groups/4/devices/0000:00:18.2
/sys/kernel/iommu_groups/4/devices/0000:00:18.0
/sys/kernel/iommu_groups/4/devices/0000:00:18.7
/sys/kernel/iommu_groups/4/devices/0000:00:18.5
/sys/kernel/iommu_groups/2/devices/0000:0c:00.0
/sys/kernel/iommu_groups/2/devices/0000:00:08.0
/sys/kernel/iommu_groups/2/devices/0000:0b:00.2
/sys/kernel/iommu_groups/2/devices/0000:0b:00.0
/sys/kernel/iommu_groups/2/devices/0000:0c:00.1
/sys/kernel/iommu_groups/2/devices/0000:00:08.1
/sys/kernel/iommu_groups/2/devices/0000:0b:00.3
/sys/kernel/iommu_groups/2/devices/0000:0b:00.1
/sys/kernel/iommu_groups/2/devices/0000:0b:00.6
/sys/kernel/iommu_groups/2/devices/0000:00:08.2
/sys/kernel/iommu_groups/2/devices/0000:0b:00.4
/sys/kernel/iommu_groups/0/devices/0000:00:01.0
/sys/kernel/iommu_groups/0/devices/0000:01:00.0
/sys/kernel/iommu_groups/0/devices/0000:00:01.1
 
Can you provide the PCI ID of the NIC as well?
Please also provide the output of pveversion -v.
 
Of course :)
Code:
# lspci -s 01:00.0
01:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
# lspci -s 01:00.0 -n
01:00.0 0200: 15b3:1007
# pveversion -v
proxmox-ve: 7.2-1 (running kernel: 5.15.53-1-pve)
pve-manager: 7.2-7 (running version: 7.2-7/d0dd0e85)
pve-kernel-helper: 7.2-12
pve-kernel-5.15: 7.2-10
pve-kernel-5.13: 7.1-9
pve-kernel-5.15.53-1-pve: 5.15.53-1
pve-kernel-5.15.39-4-pve: 5.15.39-4
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-2-pve: 5.13.19-4
ceph-fuse: 15.2.15-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-2
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-3
libpve-storage-perl: 7.2-8
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.5-1
proxmox-backup-file-restore: 2.2.5-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-2
pve-container: 4.2-2
pve-docs: 7.2-2
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-6
pve-firmware: 3.5-1
pve-ha-manager: 3.4.0
pve-i18n: 2.7-2
pve-qemu-kvm: 7.0.0-3
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.5-pve1

I don't have the ID for the ConnectX-2 cards, even though the problem is the same with them.

Also (might as well) here's an lspci -vv of the card
Code:
# lspci -s 01:00.0  -vv
01:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
        Subsystem: Hewlett-Packard Company Ethernet 10G 2-port 546SFP+ Adapter
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 43
        IOMMU group: 0
        Region 0: Memory at fcd00000 (64-bit, non-prefetchable) [size=1M]
        Region 2: Memory at 7ff2000000 (64-bit, prefetchable) [size=32M]
        Expansion ROM at fcc00000 [disabled] [size=1M]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [48] Vital Product Data
                Product Name: HP Ethernet 10G 2-port 546SFP+ Adapter
                Read-only fields:
                        [PN] Part number: 779793-B21
                        [EC] Engineering changes: B-5703
                        [SN] Serial number: [REDACTED]
                        [V0] Vendor specific: PCIe 10GbE x8 6W
                        [V2] Vendor specific: 5703
                        [V4] Vendor specific: [REDACTED]
                        [V5] Vendor specific: 0B
                        [VA] Vendor specific: HP:V2=MFG:V3=FW_VER:V4=MAC:V5=PCAR
                        [VB] Vendor specific: HP ConnectX-3Pro SFP+
                        [RV] Reserved: checksum good, 0 byte(s) reserved
                Read/write fields:
                        [V1] Vendor specific:
                        [YA] Asset tag: N/A
                        [V3] Vendor specific:
                        [RW] Read-write area: 241 byte(s) free
                        [RW] Read-write area: 255 byte(s) free
                        [RW] Read-write area: 255 byte(s) free
                        [RW] Read-write area: 255 byte(s) free
                        [RW] Read-write area: 255 byte(s) free
                        [RW] Read-write area: 255 byte(s) free
                        [RW] Read-write area: 255 byte(s) free
                        [RW] Read-write area: 255 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 252 byte(s) free
                End
        Capabilities: [9c] MSI-X: Enable+ Count=128 Masked-
                Vector table: BAR=0 offset=0007c000
                PBA: BAR=0 offset=0007d000
        Capabilities: [60] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <64ns, L1 unlimited
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 116.000W
                DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-
                        MaxPayload 512 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend+
                LnkCap: Port #8, Speed 8GT/s, Width x8, ASPM L0s, Exit Latency L0s unlimited
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 8GT/s (ok), Width x8 (ok)
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR-
                         10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS- TPHComp- ExtTPHComp-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
                         AtomicOpsCtl: ReqEn-
                LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
                LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
                         EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: unsupported
        Capabilities: [c0] Vendor Specific Information: Len=18 <?>
        Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI)
                ARICap: MFVC- ACS-, Next Function: 0
                ARICtl: MFVC- ACS-, Function Group: 0
        Capabilities: [148 v1] Device Serial Number [REDACTED]
        Capabilities: [108 v1] Single Root I/O Virtualization (SR-IOV)
                IOVCap: Migration-, Interrupt Message Number: 000
                IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
                IOVSta: Migration-
                Initial VFs: 16, Total VFs: 16, Number of VFs: 0, Function Dependency Link: 00
                VF offset: 1, stride: 1, Device ID: 1004
                Supported Page Size: 000007ff, System Page Size: 00000001
                Region 2: Memory at 0000007fd2000000 (64-bit, prefetchable)
                VF Migration: offset: 00000000, BIR: 0
        Capabilities: [154 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Capabilities: [18c v1] Secondary PCI Express
                LnkCtl3: LnkEquIntrruptEn- PerformEqu-
                LaneErrStat: 0
        Kernel driver in use: mlx4_core
        Kernel modules: mlx4_core

Thank you!
 
So it's not alone in its IOMMU group:
Code:
/sys/kernel/iommu_groups/0/devices/0000:00:01.0
/sys/kernel/iommu_groups/0/devices/0000:01:00.0
/sys/kernel/iommu_groups/0/devices/0000:00:01.1
From our docs [0]:
Code:
It is also important that the device(s) you want to pass through are in a separate IOMMU group.

This might be the reason for your issues. It seems your mainboard doesn't separate all PCI devices into separate groups.

You could try updating the BIOS of the mainboard, perhaps the situation improves. We had cases in the past where BIOS updates enabled or improved passthrough.
Sadly PCI(e) Passthrough is very dependent on the hardware, and not every consumer mainboard vendor provides the means for it.

Are you using machine type 'q35' for your VM?


[0] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#qm_pci_passthrough
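
(A quick way to see everything that shares the NIC's group, using the group number from the lspci output above:)
Bash:
# the NIC reports "IOMMU group: 0", so list that group's members
ls /sys/kernel/iommu_groups/0/devices/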
 
Are you using machine type 'q35' for your VM?

I am; I also created a new q35 machine for it, and tried Windows (which has always been on q35), to no avail.

So it's not alone in its IOMMU group:
I'm not 100% sure this is strictly true, see:

Code:
# lspci | grep -E "00:01.0|01:00.0|00:01.1"
00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
00:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP Bridge
01:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]

The "Dummy Host Bridge" is (obviously) a dummy, and there's a GPP Bridge in some other IOMMU groups too (see below)?

Maybe this is just a "display fluke"?

Also, would this explain why the card hangs when the driver loads?

And, finally, what if I pass all devices in the same IOMMU group to the same VM? Would that work?

You could try updating the BIOS of the mainboard, perhaps the situation improves. We had cases in the past where BIOS updates enabled or improved passthrough.

I'll check: I think there's one version I haven't updated to, but I'm not 100% sure - I may have already done that update.

Thanks!


Devices by IOMMU group
Code:
# for i in $(ls /sys/kernel/iommu_groups); do echo "IOMMU group ${i}"; for d in $(ls /sys/kernel/iommu_groups/$i/devices); do lspci | grep "$(echo $d | sed 's/^0000://')"; done; echo ""; done;
IOMMU group 0
00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
00:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP Bridge
01:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]

IOMMU group 1
00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
00:02.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP Bridge
00:02.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP Bridge
02:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller (rev 01)
02:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller (rev 01)
02:00.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Bridge (rev 01)
03:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port (rev 01)
03:01.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port (rev 01)
03:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port (rev 01)
03:05.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port (rev 01)
03:06.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port (rev 01)
03:07.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port (rev 01)
08:00.0 Network controller: Intel Corporation Dual Band Wireless-AC 3168NGW [Stone Peak] (rev 10)
09:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)
0a:00.0 Non-Volatile memory controller: Micron/Crucial Technology Device 540a (rev 01)

IOMMU group 2
00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir Internal PCIe GPP Bridge to Bus
00:08.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir Internal PCIe GPP Bridge to Bus
0b:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cezanne (rev c8)
0b:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Device 1637
0b:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor
0b:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Renoir USB 3.1
0b:00.4 USB controller: Advanced Micro Devices, Inc. [AMD] Renoir USB 3.1
0b:00.6 Audio device: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) HD Audio Controller
0c:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 81)
0c:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 81)

IOMMU group 3
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 51)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)

IOMMU group 4
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 166a
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 166b
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 166c
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 166d
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 166e
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 166f
00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1670
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1671
 
I talked to a colleague, and you're right: those bridges are not important and can be in the same IOMMU group without issues.

Based on your previous output, it seems a kernel driver is loaded for the NIC:
Code:
        Kernel driver in use: mlx4_core
        Kernel modules: mlx4_core
You could try blacklisting that one [0].
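
A minimal sketch of what that could look like (the file name is arbitrary; remember to rebuild the initramfs afterwards):
Code:
# /etc/modprobe.d/blacklist-mlx4.conf
blacklist mlx4_core
blacklist mlx4_en

# apply to all installed kernels, then reboot
update-initramfs -u -k all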

Additionally there are a few more things you could try:
* Update the firmware of the NIC to the latest version
* Try kernel 5.19 [1]

And one more thing my colleague hinted at: your NIC seems to support SR-IOV, which means you could pass virtual functions (VFs) to the VMs instead of the whole NIC. This might simplify everything, and it allows you to use the NIC for more than one VM if needed.
Code:
Capabilities: [108 v1] Single Root I/O Virtualization (SR-IOV)
                IOVCap: Migration-, Interrupt Message Number: 000
                IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
                IOVSta: Migration-
                Initial VFs: 16, Total VFs: 16, Number of VFs: 0, Function Dependency Link: 00
                VF offset: 1, stride: 1, Device ID: 1004
                Supported Page Size: 000007ff, System Page Size: 00000001
                Region 2: Memory at 0000007fd2000000 (64-bit, prefetchable)
                VF Migration: offset: 00000000, BIR: 0
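
As a rough sketch of how the VFs could be created for this driver (the mlx4_core module parameters here are an assumption - verify them with modinfo mlx4_core before copying):
Code:
# /etc/modprobe.d/mlx4-sriov.conf
# num_vfs:  how many virtual functions to create
# probe_vf: how many of those the host should claim for itself (0 = none)
options mlx4_core num_vfs=4 probe_vf=0
After a reboot, the VFs should show up as additional PCI functions (Device ID 1004, per the capability above) that can be passed through individually.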



[0] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#qm_pci_passthrough (10.9.2)
[1] https://forum.proxmox.com/threads/opt-in-linux-5-19-kernel-for-proxmox-ve-7-x-available.115090/
 
Hi Mira,

first of all THANK YOU and your colleague. Turns out the root cause was the module still being loaded.

I tried a lot: "blacklist mlx4_core" (and all related modules), depmod -ae, update-initramfs -k all -u, all with no success (lsmod still showed them loaded). In the end it turned out I had to "blacklist with fake install" as described in https://wiki.debian.org/KernelModuleBlacklisting.

This connection right now is going through a passed-through card :D

I now have to figure out a couple things in pfSense that are driving me up the wall, but that's out of scope for this.

I'm off to figure out SR-IOV now (following 10.9.3 in the admin guide), or to reconfigure my entire internal network so all the VMs are on a different subnet - it turns out that using bridges in pfSense is a Bad Idea (tm) for performance reasons, so I need to come up with something :/

Thanks again!
 
Great that you got it to work!
And thanks for the link, haven't seen that in a while.

You could probably also bind the devices to vfio-pci instead.
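
(For reference, a minimal sketch of that approach, using the vendor:device ID 15b3:1007 from the lspci output above; the softdep line makes sure vfio-pci claims the card before mlx4_core can:)
Code:
# /etc/modprobe.d/vfio.conf
options vfio-pci ids=15b3:1007
softdep mlx4_core pre: vfio-pci

# apply and reboot
update-initramfs -u -k all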
 
For anyone who couldn't get their head around this: create a file called /etc/modprobe.d/mlx4_core.conf

and add

Code:
install mlx4_core /bin/true

Reboot Proxmox, then try to pass through your Mellanox NIC via the GUI. If everything works as intended, the VM should boot with the card passed through correctly.
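
(A quick way to verify the blacklist took effect after the reboot - this should print nothing:)
Bash:
lsmod | grep mlx4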
 
Hi, I have been trying to pass through a Mellanox ConnectX-3 Pro to a VM all day.
I have read (I believe) every hint and tried many things, but no matter what I did, the following message kept coming up:
Code:
kvm: -device vfio-pci,host=0000:01:00.0,id=hostpci0,bus=pci.0,addr=0x10: vfio 0000:01:00.0: Failed to set up TRIGGER eventfd signaling for interrupt INTX-0: VFIO_DEVICE_SET_IRQS failure: Device or resource busy
TASK ERROR: start failed: QEMU exited with code 1
In case anyone else gets as far as this post, here's my solution.
It's very specific to my setup and certainly down to very cheap hardware. In this case it's a homelab server that should mainly run OPNsense, which is why the incredibly cheap ConnectX-3 Pro sits in a Fujitsu Esprimo D738 E94+ (i3-8100). And that is part of the problem: the PCIe x16 slot shares an IRQ with the i801_smbus.
Code:
Aug 23 21:48:57 pve-esp1 kernel: genirq: Flags mismatch irq 16. 00000000 (vfio-intx(0000:01:00.0)) vs. 00000080 (i801_smbus)
So it was impossible to pass the card through, because the "Device or resource busy" error (see above) always occurred.
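(To see what else is sitting on that IRQ line - 16 here, taken from the genirq message above:)
Bash:
# which devices share IRQ 16?
grep -E '^ *16:' /proc/interrupts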
I have now deactivated the following drivers:
Code:
cat /etc/modprobe.d/pve-blacklist.conf
# This file contains a list of modules which are not supported by Proxmox VE
# nvidiafb see bugreport https://bugzilla.proxmox.com/show_bug.cgi?id=701
blacklist nvidiafb
#blacklist mlx4_core
blacklist i2c_i801
blacklist i2c_smbus
And right after that, what hadn't worked for hours and hours before suddenly worked.
 
