Hi,
I'm trying to get SR-IOV working on Proxmox. I followed the guide for enabling SR-IOV (PCI(e) Passthrough - Proxmox VE), but all virtual functions (configured both in the firmware of the ConnectX-3 card and in /etc/modprobe.d/) always end up in the same IOMMU group. Switching PCIe slots didn't help either.
Code:
dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
[ 3.460271] pci 0000:c0:00.2: AMD-Vi: IOMMU performance counters supported
[ 3.460326] pci 0000:80:00.2: AMD-Vi: IOMMU performance counters supported
[ 3.460361] pci 0000:40:00.2: AMD-Vi: IOMMU performance counters supported
[ 3.460384] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[ 3.555920] pci 0000:c0:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 3.555922] pci 0000:c0:00.2: AMD-Vi: Extended features (0x58f77ef22294ade):
[ 3.555927] pci 0000:80:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 3.555929] pci 0000:80:00.2: AMD-Vi: Extended features (0x58f77ef22294ade):
[ 3.555933] pci 0000:40:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 3.555934] pci 0000:40:00.2: AMD-Vi: Extended features (0x58f77ef22294ade):
[ 3.555938] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[ 3.555939] pci 0000:00:00.2: AMD-Vi: Extended features (0x58f77ef22294ade):
[ 3.555943] AMD-Vi: Interrupt remapping enabled
[ 3.555944] AMD-Vi: Virtual APIC enabled
[ 3.555945] AMD-Vi: X2APIC enabled
[ 3.556605] AMD-Vi: Lazy IO/TLB flushing enabled
[ 3.561685] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[ 3.561765] perf/amd_iommu: Detected AMD IOMMU #1 (2 banks, 4 counters/bank).
[ 3.561850] perf/amd_iommu: Detected AMD IOMMU #2 (2 banks, 4 counters/bank).
[ 3.561932] perf/amd_iommu: Detected AMD IOMMU #3 (2 banks, 4 counters/bank).
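The output above only confirms that the IOMMU is active; it doesn't show how devices are grouped. A minimal sketch for listing the groups, assuming the standard sysfs layout under /sys/kernel/iommu_groups (the function name and the optional root argument are my own, not from the guide):

```shell
# Hypothetical helper: print every PCI device with its IOMMU group.
# $1 optionally overrides the sysfs root (defaults to the usual location).
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # glob matched nothing
        group="${dev#"$root"/}"        # strip the root prefix
        group="${group%%/*}"           # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done
}
```

Running `list_iommu_groups | grep 0000:41:` should show whether the PF and the VFs of the card really share one group.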
Code:
mlxconfig -d /dev/mst/mt4099_pci_cr0 q
Device #1:
----------
Device type: ConnectX3
Device: /dev/mst/mt4099_pci_cr0
Configurations: Next Boot
SRIOV_EN True(1)
NUM_OF_VFS 24
LINK_TYPE_P1 ETH(2)
LINK_TYPE_P2 ETH(2)
LOG_BAR_SIZE 3
BOOT_PKEY_P1 0
BOOT_PKEY_P2 0
BOOT_OPTION_ROM_EN_P1 False(0)
BOOT_VLAN_EN_P1 False(0)
BOOT_RETRY_CNT_P1 0
LEGACY_BOOT_PROTOCOL_P1 None(0)
BOOT_VLAN_P1 1
BOOT_OPTION_ROM_EN_P2 False(0)
BOOT_VLAN_EN_P2 False(0)
BOOT_RETRY_CNT_P2 0
LEGACY_BOOT_PROTOCOL_P2 None(0)
BOOT_VLAN_P2 1
IP_VER_P1 IPv4(0)
IP_VER_P2 IPv4(0)
CQ_TIMESTAMP True(1)
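The /etc/modprobe.d/ side of the configuration isn't pasted here, so the following is only a sketch of what a matching entry could look like, using the mlx4_core module parameters num_vfs, probe_vf and port_type_array. The values are assumptions taken from the mlxconfig output above (24 VFs, both ports ETH), and the helper name is hypothetical:

```shell
# Hypothetical helper: print an "options" line for the mlx4_core module.
#   $1 = number of VFs to create
#   $2 = number of VFs the host itself should probe (default 0)
#   $3 = port types per port, 1 = IB, 2 = ETH (default 2,2)
mlx4_options_line() {
    printf 'options mlx4_core num_vfs=%s probe_vf=%s port_type_array=%s\n' \
        "$1" "${2:-0}" "${3:-2,2}"
}
```

For example, `mlx4_options_line 24 > /etc/modprobe.d/mlx4_core.conf` (file name chosen for illustration), followed by `update-initramfs -u` and a reboot so the module picks up the options.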
Code:
41:00.0 Ethernet controller [0200]: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:1003]
Subsystem: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:0050]
Flags: bus master, fast devsel, latency 0, IRQ 122, NUMA node 0
Memory at b0400000 (64-bit, non-prefetchable) [size=1M]
Memory at 2807f800000 (64-bit, prefetchable) [size=8M]
Expansion ROM at b0300000 [disabled] [size=1M]
Capabilities: [40] Power Management version 3
Capabilities: [48] Vital Product Data
Capabilities: [9c] MSI-X: Enable+ Count=128 Masked-
Capabilities: [60] Express Endpoint, MSI 00
Capabilities: [c0] Vendor Specific Information: Len=18 <?>
Capabilities: [100] Alternative Routing-ID Interpretation (ARI)
Capabilities: [148] Device Serial Number 00-02-c9-03-00-40-c8-f0
Capabilities: [154] Advanced Error Reporting
Capabilities: [18c] #19
Capabilities: [108] Single Root I/O Virtualization (SR-IOV)
Kernel driver in use: mlx4_core
Kernel modules: mlx4_core
41:00.1 Ethernet controller [0200]: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function] [15b3:1004]
Subsystem: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function] [15b3:61b0]
Flags: fast devsel, NUMA node 0
[virtual] Memory at 28073800000 (64-bit, prefetchable) [size=8M]
Capabilities: [60] Express Endpoint, MSI 00
Capabilities: [9c] MSI-X: Enable- Count=36 Masked-
Capabilities: [40] Power Management version 0
Kernel modules: mlx4_core
Trying to start a VM with one of the VFs attached fails, resulting in these entries in dmesg:
Code:
mlx4_en 0000:41:00.0: removed PHC
mlx4_core 0000:41:00.0: Disabling SR-IOV
pci 0000:41:00.1: Removing from iommu group 48
pci 0000:41:00.2: Removing from iommu group 48
pci 0000:41:00.3: Removing from iommu group 48
pci 0000:41:00.4: Removing from iommu group 48
pci 0000:41:00.5: Removing from iommu group 48
pci 0000:41:00.6: Removing from iommu group 48
pci 0000:41:00.7: Removing from iommu group 48
pci 0000:41:01.0: Removing from iommu group 48
pci 0000:41:01.1: Removing from iommu group 48
pci 0000:41:01.2: Removing from iommu group 48
pci 0000:41:01.3: Removing from iommu group 48
pci 0000:41:01.4: Removing from iommu group 48
pci 0000:41:01.5: Removing from iommu group 48
pci 0000:41:01.6: Removing from iommu group 48
pci 0000:41:01.7: Removing from iommu group 48
pci 0000:41:02.0: Removing from iommu group 48
pci 0000:41:02.1: Removing from iommu group 48
pci 0000:41:02.2: Removing from iommu group 48
pci 0000:41:02.3: Removing from iommu group 48
pci 0000:41:02.4: Removing from iommu group 48
pci 0000:41:02.5: Removing from iommu group 48
pci 0000:41:02.6: Removing from iommu group 48
pci 0000:41:02.7: Removing from iommu group 48
pci 0000:41:03.0: Removing from iommu group 48
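To make the log above easier to read, here is a small sketch that counts the "Removing from iommu group" lines per group. It assumes the message format shown above, with the group number as the last field; the function name is my own:

```shell
# Hypothetical helper: read kernel log lines on stdin and count how many
# devices were removed from each IOMMU group.
summarize_iommu_removals() {
    awk '/Removing from iommu group/ { count[$NF]++ }
         END {
             for (g in count)
                 printf "group %s: %d devices removed\n", g, count[g]
         }'
}
```

Used as `dmesg | summarize_iommu_removals`. For the excerpt above it reports 24 devices for group 48, which matches NUM_OF_VFS and confirms that every VF was in that one group.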
Overall it's quite similar to: https://forum.proxmox.com/threads/pve-5-0-beta-2-mellanox-connectx-3-and-sr-iov.35103/
My setup
- Proxmox 6.1-5 (Kernel: 5.3.13-2)
- Supermicro H11SSL
- AMD EPYC 7502
- Mellanox ConnectX-3