[SOLVED] No more connection to Proxmox with IOMMU enabled

kekzmobile

Member
Mar 30, 2020
Hello,

I am running two Proxmox servers in a cluster.
One of the two got a new NIC today (HP NC552 SFP+). The other server has been using this same NIC in Proxmox from the start.
So far it works fine in the second server as well, but as soon as I enable IOMMU in order to pass an LSI SAS HBA through to a VM, I lose all connectivity to Proxmox and the VMs. No ping, no web interface. I can only access the machine locally.

IOMMU enabled/disabled:
Code:
root@hvrep:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 00:1e:67:59:b0:75 brd ff:ff:ff:ff:ff:ff
3: enp4s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 00:1e:67:59:b0:74 brd ff:ff:ff:ff:ff:ff
4: rename4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether c4:34:6b:fd:91:e0 brd ff:ff:ff:ff:ff:ff
5: enp2s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmbr0 state UP group default qlen 1000
    link/ether c4:34:6b:fd:91:e4 brd ff:ff:ff:ff:ff:ff
6: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether c4:34:6b:fd:91:e4 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.29/24 brd 192.168.1.255 scope global vmbr0
       valid_lft forever preferred_lft forever
    inet6 fe80::c634:6bff:fefd:91e4/64 scope link
       valid_lft forever preferred_lft forever
7: tap104i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master fwbr104i0 state UNKNOWN group default qlen 1000
    link/ether 9a:7b:1e:59:04:87 brd ff:ff:ff:ff:ff:ff
8: fwbr104i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 8e:9a:1c:48:91:8c brd ff:ff:ff:ff:ff:ff
9: fwpr104p0@fwln104i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP group default qlen 1000
    link/ether ca:e9:74:c3:34:07 brd ff:ff:ff:ff:ff:ff
10: fwln104i0@fwpr104p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr104i0 state UP group default qlen 1000
    link/ether 8e:9a:1c:48:91:8c brd ff:ff:ff:ff:ff:ff
root@hvrep:~#

Code:
02:00.0 Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 01)
        Subsystem: Hewlett-Packard Company NC552SFP 2-port 10Gb Server Adapter
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 18
        Region 0: Memory at c2a10000 (64-bit, non-prefetchable) [size=16K]
        Region 2: Memory at c29e0000 (64-bit, non-prefetchable) [size=128K]
        Region 4: Memory at c29c0000 (64-bit, non-prefetchable) [size=128K]
        Expansion ROM at c2940000 [disabled] [size=256K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [48] MSI-X: Enable+ Count=32 Masked-
                Vector table: BAR=0 offset=00002000
                PBA: BAR=0 offset=00003000
        Capabilities: [c0] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 <16us
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75.000W
                DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr+ UnsuppReq+ AuxPwr+ TransPend-
                LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Exit Latency L0s <2us, L1 <16us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [b8] Vital Product Data
                Product Name: 614203-B21, NIC PF
                Read-only fields:
                        [PN] Part number: 614201-001
                        [SN] Serial number: THT4229H05
                        [V0] Vendor specific: BK42205255
                        [MN] Manufacture ID: 36 31 34 32 30 31 2d 30 30 31
                        [EC] Engineering changes: A-5120
                        [FN] Unknown: 36 31 35 34 30 36 2d 30 30 31
                        [VA] Vendor specific: 5422
                        [VB] Vendor specific: PW=11W; PCIE X8 GEN 2
                        [V1] Vendor specific: HP NC552SFP 10GbE 2P Flex-10 Adapter
                        [V2] Vendor specific: NC552SFP
                        [V4] Vendor specific: 1
                        [V5] Vendor specific: OCe11102-NX-HP
                        [V6] Vendor specific: A0:1,D0:1
                        [RV] Reserved: checksum good, 39 byte(s) reserved
                End
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP- ECRC- UnsupReq+ ACSViol-
                UESvrt: DLP+ SDES+ TLP+ FCP+ CmpltTO+ CmpltAbrt+ UnxCmplt+ RxOF+ MalfTLP+ ECRC+ UnsupReq+ ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Capabilities: [180 v1] Single Root I/O Virtualization (SR-IOV)
                IOVCap: Migration-, Interrupt Message Number: 000
                IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy-
                IOVSta: Migration-
                Initial VFs: 0, Total VFs: 0, Number of VFs: 0, Function Dependency Link: 00
                VF offset: 0, stride: 1, Device ID: 0710
                Supported Page Size: 00000557, System Page Size: 00000001
                Region 0: Memory at 0000000000000000 (64-bit, non-prefetchable)
                VF Migration: offset: 00000000, BIR: 0
        Capabilities: [160 v1] Alternative Routing-ID Interpretation (ARI)
                ARICap: MFVC- ACS-, Next Function: 1
                ARICtl: MFVC- ACS-, Function Group: 0
        Capabilities: [168 v1] Device Serial Number c4-34-6b-ff-fe-fd-91-e0
        Capabilities: [12c v1] Transaction Processing Hints
                No steering table available
        Kernel driver in use: be2net
        Kernel modules: be2net

If I disable IOMMU in GRUB, the network connection works as usual.

The phenomenon is even easier to reproduce: if I remove the VM with the LSI SAS from autostart, Proxmox boots as usual. As soon as I start the VM with the passed-through LSI SAS, the network connection collapses.
It is almost as if not only the LSI SAS but also the HP NIC were being passed through.

Hardware of the affected machine:
Intel S1200BTL
Xeon E3-1240v2
16GB DDR4 ECC
LSI SAS 9212-4i4e
Dell PV124T LTO4 Library

Could this be related to the IOMMU groups?
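To check that yourself, a small shell loop (a generic sketch, not Proxmox-specific) lists which PCI devices share an IOMMU group. If the HBA and the NIC land in the same group, starting the passthrough VM will detach both:

```shell
#!/bin/sh
# List every PCI device together with its IOMMU group number.
# Assumes the IOMMU is enabled (intel_iommu=on); otherwise
# /sys/kernel/iommu_groups is empty and nothing is printed.
for dev in /sys/kernel/iommu_groups/*/devices/*; do
    [ -e "$dev" ] || continue                         # no groups present
    group=$(basename "$(dirname "$(dirname "$dev")")")
    printf 'IOMMU group %s: %s\n' "$group" "$(basename "$dev")"
done | sort -V
```

Devices in one group can only be passed through together, which would explain the NIC disappearing along with the HBA.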

Regards
 

Attachments

  • 01.png (37.8 KB)
It was indeed the IOMMU groups.
With the addition of "quiet intel_iommu=on pcie_acs_override=downstream,multifunction" in GRUB it now works.
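For reference, the change amounts to editing the kernel command line in /etc/default/grub and regenerating the GRUB config (paths as on a standard Proxmox/Debian install; a sketch, adapt to your setup). Note that pcie_acs_override relies on a non-upstream kernel patch (shipped in the Proxmox kernel) and weakens DMA isolation between the split-out devices:

```shell
# /etc/default/grub -- add the ACS override to the kernel command line:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on pcie_acs_override=downstream,multifunction"

# then regenerate the config and reboot:
#   update-grub
#   reboot
```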
 
Great, glad it works now! :) Please mark your thread as solved. Thanks.
 
