[SOLVED] Not working: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]

nidomiro

New Member
Sep 23, 2023
Hi, I bought a "Hewlett-Packard Company Ethernet 10G 2-port 546SFP+ Adapter" and expected to use it for 10GbE.
The card is visible in `lspci` but the interfaces are not there.
During startup the card seems to work, since the uplink lights on the switch light up.
But then the message `mlx4_en 0000:02:00.0: removed PHC` appears on the screen, and the link is down.

I already found this post, but the suggested fixes did not work: https://forum.proxmox.com/threads/mellanox-connectx-3-pro-mt27520-not-showing-under-pve-8.133505/

What I tried so far:
  • adding `pci=realloc=off` to `GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub` (applied as sketched below)
  • looking for an `SR-IOV` option in the BIOS, but it is not there
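For reference, I applied the kernel parameter with the usual GRUB workflow, roughly like this (example line; the other existing options stay in place):

Bash:
nano /etc/default/grub
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet pci=realloc=off"
update-grub
reboot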
The host system is a Fujitsu Primergy TX1320 M3 with an Intel Xeon E3-1230 v6 (and 32 GB ECC RAM).

Is there anything more I can try?

Bash:
02:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
        Subsystem: Hewlett-Packard Company Ethernet 10G 2-port 546SFP+ Adapter
        Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Interrupt: pin A routed to IRQ 17
        IOMMU group: 1
        Region 0: Memory at dfa00000 (64-bit, non-prefetchable) [size=1M]
        Region 2: Memory at 90000000 (64-bit, prefetchable) [size=32M]
        Expansion ROM at df900000 [disabled] [size=1M]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D3 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [48] Vital Product Data
                Product Name: HP Ethernet 10G 2-port 546SFP+ Adapter
                Read-only fields:
                        [PN] Part number: 779793-B21
                        [EC] Engineering changes: B-5710
                        [SN] Serial number: IL270901PM
                        [V0] Vendor specific: PCIe 10GbE x8 6W
                        [V2] Vendor specific: 5710
                        [V4] Vendor specific: 9CDC71516680
                        [V5] Vendor specific: 0B
                        [VA] Vendor specific: HP:V2=MFG:V3=FW_VER:V4=MAC:V5=PCAR
                        [VB] Vendor specific: HP ConnectX-3Pro SFP+
                        [RV] Reserved: checksum good, 0 byte(s) reserved
                Read/write fields:
                        [V1] Vendor specific:         
                        [YA] Asset tag: N/A                   
                        [V3] Vendor specific:                       
                        [RW] Read-write area: 241 byte(s) free
                        [RW] Read-write area: 255 byte(s) free
                        [RW] Read-write area: 255 byte(s) free
                        [RW] Read-write area: 255 byte(s) free
                        [RW] Read-write area: 255 byte(s) free
                        [RW] Read-write area: 255 byte(s) free
                        [RW] Read-write area: 255 byte(s) free
                        [RW] Read-write area: 255 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 252 byte(s) free
                End
        Capabilities: [9c] MSI-X: Enable- Count=128 Masked-
                Vector table: BAR=0 offset=0007c000
                PBA: BAR=0 offset=0007d000
        Capabilities: [60] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <64ns, L1 unlimited
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 116W
                DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-
                        MaxPayload 256 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
                LnkCap: Port #8, Speed 8GT/s, Width x8, ASPM L0s, Exit Latency L0s unlimited
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 8GT/s, Width x8
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR-
                         10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS- TPHComp- ExtTPHComp-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 10BitTagReq- OBFF Disabled,
                         AtomicOpsCtl: ReqEn-
                LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
                LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
                         EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: unsupported
        Capabilities: [c0] Vendor Specific Information: Len=18 <?>
        Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI)
                ARICap: MFVC- ACS-, Next Function: 0
                ARICtl: MFVC- ACS-, Function Group: 0
        Capabilities: [148 v1] Device Serial Number 9c-dc-71-03-00-51-66-80
        Capabilities: [108 v1] Single Root I/O Virtualization (SR-IOV)
                IOVCap: Migration- 10BitTagReq- Interrupt Message Number: 000
                IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy- 10BitTagReq-
                IOVSta: Migration-
                Initial VFs: 16, Total VFs: 16, Number of VFs: 0, Function Dependency Link: 00
                VF offset: 1, stride: 1, Device ID: 1004
                Supported Page Size: 000007ff, System Page Size: 00000001
                Region 2: Memory at 0000000092000000 (64-bit, prefetchable)
                VF Migration: offset: 00000000, BIR: 0
        Capabilities: [154 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Capabilities: [18c v1] Secondary PCI Express
                LnkCtl3: LnkEquIntrruptEn- PerformEqu-
                LaneErrStat: 0
        Kernel driver in use: vfio-pci
        Kernel modules: mlx4_core

Bash:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmbr0 state UP mode DEFAULT group default qlen 1000
    link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
    altname enp4s0
3: eno2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
    altname enp5s0
6: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
7: tap92111i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master fwbr92111i0 state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
8: fwbr92111i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
9: fwpr92111p0@fwln92111i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP mode DEFAULT group default qlen 1000
    link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
10: fwln92111i0@fwpr92111p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr92111i0 state UP mode DEFAULT group default qlen 1000
    link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff

Bash:
[    1.069435] mlx4_core: Mellanox ConnectX core driver v4.0-0
[    1.070077] mlx4_core: Initializing 0000:02:00.0
[    1.070726] mlx4_core 0000:02:00.0: enabling device (0140 -> 0142)
[    7.286052] mlx4_core 0000:02:00.0: DMFS high rate steer mode is: disabled performance optimized steering
[    7.287190] mlx4_core 0000:02:00.0: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[    7.358428] mlx4_en: Mellanox ConnectX HCA Ethernet driver v4.0-0
[    7.360123] mlx4_en 0000:02:00.0: Activating port:1
[    7.362979] mlx4_en: 0000:02:00.0: Port 1: Using 8 TX rings
[    7.363702] mlx4_en: 0000:02:00.0: Port 1: Using 8 RX rings
[    7.364698] mlx4_en: 0000:02:00.0: Port 1: Initializing port
[    7.366853] mlx4_en 0000:02:00.0: registered PHC clock
[    7.367791] mlx4_en 0000:02:00.0: Activating port:2
[    7.368994] mlx4_en: 0000:02:00.0: Port 2: Using 8 TX rings
[    7.369591] mlx4_en: 0000:02:00.0: Port 2: Using 8 RX rings
[    7.386320] <mlx4_ib> mlx4_ib_probe: mlx4_ib: Mellanox ConnectX InfiniBand driver v4.0-0
[    7.391173] mlx4_en: 0000:02:00.0: Port 2: Initializing port
[    7.391996] <mlx4_ib> mlx4_ib_probe: counter index 2 for port 1 allocated 1
[    7.392447] <mlx4_ib> mlx4_ib_probe: counter index 3 for port 2 allocated 1
[    7.412872] mlx4_core 0000:02:00.0 enp2s0d1: renamed from eth1
[    7.424802] mlx4_core 0000:02:00.0 enp2s0: renamed from eth0
[    9.386278] mlx4_en: enp2s0: Link Up
[   10.125613] mlx4_en: enp2s0: Link Up
[   26.616736] mlx4_en 0000:02:00.0: removed PHC
 
You can download the driver from Mellanox, but you have to use an older PVE version.
 
Kernel driver in use: vfio-pci
Kernel modules: mlx4_core
The vfio-pci reference there makes it look like you're trying to pass the card to a VM for it to use directly, rather than use the card on the host.

Is that what you're wanting to do?
 
I just put the card in. I want to use it on the host.
But I have a TrueNAS VM with an HBA passed through to it.
Can this explain the vfio-pci?
 
Hmmm, would you be ok to run lspci -nnk and paste the complete output here?

That should help show if the HBA card and this network card are somehow attached to the same PCIe bus.



Also, a list of the files in the /etc/modprobe.d/ directory would be helpful too, just to see if there's something in there which might be passing the network card details to vfio. :)
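For reference, a manual vfio binding in that directory usually looks something like the following (purely illustrative, the file name and contents are just an example of what to look for; 15b3:1007 is the ConnectX-3 Pro's PCI ID):

Bash:
# hypothetical /etc/modprobe.d/vfio.conf
options vfio-pci ids=15b3:1007
softdep mlx4_core pre: vfio-pci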



How are you passing the HBA card to your TrueNAS VM? I'm looking through the Proxmox PCIe passthrough page, and it shouldn't be too hard to figure this problem out and get the network card operating properly. :)
 
Thanks for your help so far :)

Here is the info:
Bash:
00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers [8086:5918] (rev 05)
        Subsystem: Fujitsu Technology Solutions Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers [1734:1221]
        Kernel driver in use: skl_uncore
        Kernel modules: ie31200_edac
00:01.0 PCI bridge [0604]: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 05)
        DeviceName: Slot 4
        Subsystem: Fujitsu Technology Solutions 6th-10th Gen Core Processor PCIe Controller (x16) [1734:1221]
        Kernel driver in use: pcieport
00:01.1 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8) [8086:1905] (rev 05)
        DeviceName: Slot 3
        Subsystem: Fujitsu Technology Solutions Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8) [1734:1221]
        Kernel driver in use: pcieport
00:14.0 USB controller [0c03]: Intel Corporation 100 Series/C230 Series Chipset Family USB 3.0 xHCI Controller [8086:a12f] (rev 31)
        Subsystem: Fujitsu Technology Solutions 100 Series/C230 Series Chipset Family USB 3.0 xHCI Controller [1734:1222]
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
00:14.2 Signal processing controller [1180]: Intel Corporation 100 Series/C230 Series Chipset Family Thermal Subsystem [8086:a131] (rev 31)
        Subsystem: Fujitsu Technology Solutions 100 Series/C230 Series Chipset Family Thermal Subsystem [1734:1222]
        Kernel driver in use: intel_pch_thermal
        Kernel modules: intel_pch_thermal
00:16.0 Communication controller [0780]: Intel Corporation 100 Series/C230 Series Chipset Family MEI Controller #1 [8086:a13a] (rev 31)
        Subsystem: Fujitsu Technology Solutions 100 Series/C230 Series Chipset Family MEI Controller [1734:1222]
        Kernel modules: mei_me
00:16.1 Communication controller [0780]: Intel Corporation 100 Series/C230 Series Chipset Family MEI Controller #2 [8086:a13b] (rev 31)
        Subsystem: Fujitsu Technology Solutions 100 Series/C230 Series Chipset Family MEI Controller [1734:1222]
        Kernel modules: mei_me
00:17.0 SATA controller [0106]: Intel Corporation Q170/Q150/B150/H170/H110/Z170/CM236 Chipset SATA Controller [AHCI Mode] [8086:a102] (rev 31)
        Subsystem: Fujitsu Technology Solutions Q170/Q150/B150/H170/H110/Z170/CM236 Chipset SATA Controller [AHCI Mode] [1734:1222]
        Kernel driver in use: ahci
        Kernel modules: ahci
00:1c.0 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #5 [8086:a114] (rev f1)
        DeviceName: Onboard PCH Unknown
        Subsystem: Fujitsu Technology Solutions 100 Series/C230 Series Chipset Family PCI Express Root Port [1734:1222]
        Kernel driver in use: pcieport
00:1c.5 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #6 [8086:a115] (rev f1)
        DeviceName: Onboard LAN 1
        Subsystem: Fujitsu Technology Solutions 100 Series/C230 Series Chipset Family PCI Express Root Port [1734:1222]
        Kernel driver in use: pcieport
00:1c.6 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #7 [8086:a116] (rev f1)
        DeviceName: Onboard LAN 2
        Subsystem: Fujitsu Technology Solutions 100 Series/C230 Series Chipset Family PCI Express Root Port [1734:1222]
        Kernel driver in use: pcieport
00:1f.0 ISA bridge [0601]: Intel Corporation C236 Chipset LPC/eSPI Controller [8086:a149] (rev 31)
        Subsystem: Fujitsu Technology Solutions C236 Chipset LPC/eSPI Controller [1734:1222]
00:1f.2 Memory controller [0580]: Intel Corporation 100 Series/C230 Series Chipset Family Power Management Controller [8086:a121] (rev 31)
        Subsystem: Fujitsu Technology Solutions 100 Series/C230 Series Chipset Family Power Management Controller [1734:1222]
00:1f.4 SMBus [0c05]: Intel Corporation 100 Series/C230 Series Chipset Family SMBus [8086:a123] (rev 31)
        Subsystem: Fujitsu Technology Solutions 100 Series/C230 Series Chipset Family SMBus [1734:1222]
        Kernel driver in use: i801_smbus
        Kernel modules: i2c_i801
01:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS3008 PCI-Express Fusion-MPT SAS-3 [1000:0097] (rev 02)
        Subsystem: Broadcom / LSI SAS3008 PCI-Express Fusion-MPT SAS-3 [1000:000c]
        Kernel driver in use: vfio-pci
        Kernel modules: mpt3sas
02:00.0 Ethernet controller [0200]: Mellanox Technologies MT27520 Family [ConnectX-3 Pro] [15b3:1007]
        Subsystem: Hewlett-Packard Company Ethernet 10G 2-port 546SFP+ Adapter [103c:801f]
        Kernel driver in use: vfio-pci
        Kernel modules: mlx4_core
03:00.0 VGA compatible controller [0300]: Matrox Electronics Systems Ltd. MGA G200e [Pilot] ServerEngines (SEP1) [102b:0522] (rev 05)
        DeviceName: VGA iRMC4
        Subsystem: Fujitsu Technology Solutions MGA G200e [Pilot] ServerEngines (SEP1) [1734:11cc]
        Kernel driver in use: mgag200
        Kernel modules: mgag200
03:00.1 Co-processor [0b40]: Emulex Corporation ServerView iRMC HTI [19a2:0800]
        Subsystem: Fujitsu Technology Solutions ServerView iRMC HTI [1734:11cc]
04:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
        DeviceName: LAN i210-AT
        Subsystem: Fujitsu Technology Solutions I210 Gigabit Network Connection [1734:11f1]
        Kernel driver in use: igb
        Kernel modules: igb
05:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
        DeviceName: LAN i210-AT
        Subsystem: Fujitsu Technology Solutions I210 Gigabit Network Connection [1734:11f1]
        Kernel driver in use: igb
        Kernel modules: igb

Bash:
$ ll /etc/modprobe.d/
total 4.5K
-rw-r--r-- 1 root root 172 Jun 21  2023 pve-blacklist.conf

$ cat /etc/modprobe.d/*
# This file contains a list of modules which are not supported by Proxmox VE

# nvidiafb see bugreport https://bugzilla.proxmox.com/show_bug.cgi?id=701
blacklist nvidiafb

What I did to pass the HBA through to the VM:

1. Inside `/etc/default/grub` edit to: `GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"`
2. Inside `/etc/modules` add the lines:
Code:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
3. `update-initramfs -u -k all` and reboot
4. In the Proxmox UI, add the Raw Device `0000:01:00.0` to the VM with `All Functions` and `ROM-Bar` ticked. Does it also have a benefit to tick the `PCI-Express` checkbox?
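As far as I understand, step 4 corresponds roughly to this CLI call (VM ID 100 is just an example):

Bash:
# rough CLI equivalent of the UI step; "All Functions" passes the whole device, not just function .0
qm set 100 --hostpci0 0000:01:00,rombar=1
# ticking "PCI-Express" would add pcie=1, which as far as I know needs the q35 machine type:
# qm set 100 --hostpci0 0000:01:00,rombar=1,pcie=1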


What I want to achieve: the host should have a 10GbE connection, and the virtualized TrueNAS should have the HBA with the physical HDDs plus a virtual network connection (Proxmox default).

Edit:
I found out (via `pvesh get /nodes/$(hostname)/hardware/pci --pci-class-blacklist ""`) that the HBA and the 10GbE card are in the same IOMMU group. That is really sad, since these are the only 8x slots. All other slots are smaller.
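The grouping can also be read straight from sysfs:

Bash:
for d in /sys/kernel/iommu_groups/*/devices/*; do
    g=${d#/sys/kernel/iommu_groups/}; g=${g%%/*}
    echo "IOMMU group $g: $(lspci -nns ${d##*/})"
done | sort -V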

While searching for a solution I found this: https://www.reddit.com/r/Proxmox/co...=web3xcss&utm_term=1&utm_content=share_button
Adding `pcie_acs_override=downstream` or `pcie_acs_override=downstream,multifunction` to `GRUB_CMDLINE_LINUX_DEFAULT` results in a broken system (it has network, but cannot reach the cluster). It also does not solve the initial issue: as soon as I start the VM, the 10GbE card goes offline.
 
Looks like we're on the right track here. The HBA is this one:
01:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS3008 PCI-Express Fusion-MPT SAS-3 [1000:0097] (rev 02)
Subsystem: Broadcom / LSI SAS3008 PCI-Express Fusion-MPT SAS-3 [1000:000c]
Kernel driver in use: vfio-pci
Kernel modules: mpt3sas

With the Mellanox card right below it in the list:
02:00.0 Ethernet controller [0200]: Mellanox Technologies MT27520 Family [ConnectX-3 Pro] [15b3:1007]
Subsystem: Hewlett-Packard Company Ethernet 10G 2-port 546SFP+ Adapter [103c:801f]
Kernel driver in use: vfio-pci
Kernel modules: mlx4_core
So yeah, the problem seems to be that both of those cards are in the same IOMMU group, so you either get both of them passed through or neither. :eek:

That is really sad, since these are the only 8x slots. All other slots are smaller.

That might not be the end of the story though. :)

10GbE isn't actually much in the way of PCIe bandwidth. A single lane of PCIe version 2 can push about 500MB/s:

https://en.wikipedia.org/wiki/PCI_Express#Comparison_table

So, pushing about 1000MB/s one way only needs double that (2 PCIe version 2 lanes). 10GbE is a full-duplex protocol though, so it can push ~1000MB/s in both directions at the same time (1000MB/s receive and 1000MB/s send), which means it'd need 4 lanes of PCIe version 2 to fully utilise the card.
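In rough numbers:

Code:
10GbE line rate:                   10 Gbit/s / 8 = 1250 MB/s raw (call it ~1000 MB/s usable)
PCIe 2.0 per lane, per direction:  ~500 MB/s
=> 2 lanes cover one direction, 4 lanes cover full duplex (send + receive at once)
PCIe 3.0 is roughly double per lane, so an x4 v3 slot has plenty of headroom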

For your specific motherboard, are any of the other PCIe slots an x4? Because that'd probably be completely fine for that card anyway. In theory. :)
 
Yes, I already swapped it to an x4 slot and now it is kind of working.
The link is up, and adding it to my bond (active-backup) did work.
However, I get errors when reloading the config with `ifreload -a`.

Code:
error: bond0: failed to set vid `{127, 128, [...], 4093, 4094}` (cmd '/sbin/bridge -force -batch - [vlan add vid 127-4094 dev bond0 ]' failed: returned 1 (RTNETLINK answers: No space left on device
Command failed -:1
))

There is plenty of disk space available.
 
Ahhh, that new error is a different thing. I actually came across the info for that earlier today by complete coincidence.

It turns out Mellanox cards of the ConnectX-3 generation have a maximum number of VLANs they can support (in hardware, I think). 128 maybe?

I'll have to go find the info again. Shouldn't be too hard to find though, it was here in the Proxmox forums I think.
 
Thank you very much. It works now :)

As I'm using far fewer than 128 VLANs, I just listed them explicitly.
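A trimmed-down sketch of the relevant part of `/etc/network/interfaces` (bridge/bond names and VLAN IDs are placeholders, not my exact config):

Code:
auto vmbr0
iface vmbr0 inet manual
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 10 20 30

Applied with `ifreload -a` afterwards.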

Without your help I would have been lost (and very frustrated ;) )

Now I have to think about whether to switch the positions of the HBA and the 10GbE card. I just checked, and both slots and cards are PCIe v3.
 
As far as I remember, all HPE-branded Mellanox ConnectX-3 and ConnectX-4 cards have a VLAN bug.
So a vmbr with a ConnectX-3/4 slave port or bond will never be VLAN-aware, or at least it won't work, unless you put the slave ports into promiscuous mode.

I ran into this on a lot of HPE Mellanox cards; that's why I've been avoiding HPE-branded cards ever since. xD
Same firmware version as the original Mellanox/NVIDIA cards, just that the original doesn't need promisc and the HPE one does. xD
(Not literally the same firmware, but the same FW version, just built by HPE instead of Mellanox.)

Check if your VLANs work; if they do without promisc, you are in luck.
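If they don't, putting the slave ports into promisc mode would look roughly like this (interface names taken from your output earlier in the thread):

Code:
# quick test:
ip link set dev enp2s0 promisc on
ip link set dev enp2s0d1 promisc on

# to keep it across ifreload/reboot, e.g. in /etc/network/interfaces on the slave port:
iface enp2s0 inet manual
        post-up ip link set dev enp2s0 promisc on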
Cheers
 
