ConnectX-3 with (Nvidia-)PCIe-Passthrough

RockNLol

Member
Sep 13, 2020
hi,
my PVE 6.3-6 server, based on a Supermicro X11SCA-F and an Intel Xeon E-2246G, passes its SATA controller through to one VM and an NVIDIA GTX 1060 to another. I'm trying to install a Mellanox ConnectX-3 without passthrough and create a 10 GbE bridge for my VMs in Proxmox. Unfortunately, it seems that I'm somehow blocking the Mellanox driver from loading; at least that's what the output of lspci -nnv suggests:
Code:
[...]

00:17.0 SATA controller [0106]: Intel Corporation Cannon Lake PCH SATA AHCI Controller [8086:a352] (rev 10) (prog-if 01 [AHCI 1.0])
        Subsystem: Super Micro Computer Inc Cannon Lake PCH SATA AHCI Controller [15d9:1a1d]
        Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 156
        Memory at 91234000 (32-bit, non-prefetchable) [size=8K]
        Memory at 9123b000 (32-bit, non-prefetchable) [size=256]
        I/O ports at 6050 [size=8]
        I/O ports at 6040 [size=4]
        I/O ports at 6020 [size=32]
        Memory at 9123a000 (32-bit, non-prefetchable) [size=2K]
        Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
        Capabilities: [70] Power Management version 3
        Capabilities: [a8] SATA HBA v1.0
        Kernel driver in use: vfio-pci
        Kernel modules: ahci

[...]

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] [10de:1c03] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] [10de:11d7]
        Flags: bus master, fast devsel, latency 0, IRQ 164
        Memory at 92000000 (32-bit, non-prefetchable) [size=16M]
        Memory at b0000000 (64-bit, prefetchable) [size=256M]
        Memory at c0000000 (64-bit, prefetchable) [size=32M]
        I/O ports at 5000 [size=128]
        Expansion ROM at 93000000 [disabled] [size=512K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Legacy Endpoint, MSI 00
        Capabilities: [100] Virtual Channel
        Capabilities: [250] Latency Tolerance Reporting
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [420] Advanced Error Reporting
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Capabilities: [900] #19
        Kernel driver in use: vfio-pci
        Kernel modules: nvidiafb, nouveau, nvidia_current_drm, nvidia_current

01:00.1 Audio device [0403]: NVIDIA Corporation GP106 High Definition Audio Controller [10de:10f1] (rev a1)
        Subsystem: NVIDIA Corporation GP106 High Definition Audio Controller [10de:11d7]
        Flags: bus master, fast devsel, latency 0, IRQ 17
        Memory at 93080000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel

02:00.0 Ethernet controller [0200]: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:1003]
        Subsystem: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:0055]
        Flags: fast devsel, IRQ 17
        Memory at 93300000 (64-bit, non-prefetchable) [disabled] [size=1M]
        Memory at c2800000 (64-bit, prefetchable) [disabled] [size=8M]
        Expansion ROM at 93200000 [disabled] [size=1M]
        Capabilities: [40] Power Management version 3
        Capabilities: [48] Vital Product Data
        Capabilities: [9c] MSI-X: Enable- Count=128 Masked-
        Capabilities: [60] Express Endpoint, MSI 00
        Capabilities: [c0] Vendor Specific Information: Len=18 <?>
        Capabilities: [100] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [148] Device Serial Number f4-52-14-03-00-84-14-20
        Capabilities: [154] Advanced Error Reporting
        Capabilities: [18c] #19
        Kernel driver in use: vfio-pci
        Kernel modules: mlx4_core
      
[...]

As you can see, the "Kernel driver in use" for the ConnectX-3 is vfio-pci.

My /etc/modprobe.d/pve-blacklist.conf looks like this:
Code:
# This file contains a list of modules which are not supported by Proxmox VE

# nidiafb see bugreport https://bugzilla.proxmox.com/show_bug.cgi?id=701

## NVIDIA
blacklist nvidia
blacklist nouveau

## INTEL
blacklist snd_hda_intel
#blacklist snd_hda_codec_hdmi
#blacklist i915

Other files in /etc/modprobe.d/ do not have any blacklist entries.

So it seems that blacklisting the NVIDIA modules blocks the Mellanox driver from loading? Can somebody confirm this, or do I have to install different drivers?
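
For anyone debugging the same symptom, it's worth checking whether vfio-pci is claiming the NIC explicitly by vendor/device ID. A minimal sketch (the 02:00.0 address and the 15b3:1003 ID are taken from the lspci output above; your setup may differ):
Code:
# Show the driver currently bound to the NIC and the modules that could drive it
lspci -nnk -s 02:00.0

# Look for an explicit vfio-pci ID binding; a line like
# "options vfio-pci ids=15b3:1003" would pin the ConnectX-3
# to vfio-pci regardless of any blacklist
grep -rn "vfio" /etc/modprobe.d/ /etc/modules

If no such ID binding exists, vfio-pci is grabbing the card for another reason (see the edit below).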

*edit: I found at least part of the issue. Since the NVIDIA graphics card and the Mellanox NIC sit in two neighbouring PCIe x16 slots that split their bandwidth into two x8 links, they end up in the same IOMMU group (group 1) — and passing a device through requires every device in its group to be handed to vfio-pci.
Right after boot, "Kernel driver in use:" is mlx4_core, but as soon as the VM with the NVIDIA card starts, it changes to vfio-pci.
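
For reference, the grouping can be listed straight from sysfs; a generic sketch, nothing Proxmox-specific:
Code:
# List every IOMMU group and the devices it contains
for g in /sys/kernel/iommu_groups/*; do
    echo "IOMMU group ${g##*/}:"
    for d in "$g"/devices/*; do
        lspci -nns "${d##*/}"
    done
done

If the GPU and the NIC show up under the same group number, one of them has to move (or the group has to be split) before they can use different drivers.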

That's gonna be tricky...
 
Yes, I did get it to work, but I don't remember what did the trick in the end. I do remember updating the BIOS.
By now I no longer have the NVIDIA GPU installed, so unfortunately I'm not much help.
 
Okay, thank you though. You helped me narrow down my issue.
 
Thinking about it a bit more, I think I also played around with the PCIe bifurcation settings in the BIOS.
 
I don't have any of those settings in my BIOS. I got it working by putting one of the passed-through cards into a different PCIe slot, which put it into a different IOMMU group.
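
For anyone landing here with the same goal: once the NIC is back on mlx4_core, the 10 GbE bridge itself is plain /etc/network/interfaces configuration. A minimal sketch, assuming the ConnectX-3 port shows up as enp2s0 (the interface name on your system will differ):
Code:
auto enp2s0
iface enp2s0 inet manual

auto vmbr1
iface vmbr1 inet manual
        bridge-ports enp2s0
        bridge-stp off
        bridge-fd 0

Attach the VMs' virtual NICs to vmbr1 and they share the 10 GbE link.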
 
