[SOLVED] Network card does not start at boot

MLOrion

Hi,

we have a new Supermicro server with

2x 1GBase-T LAN ports via Broadcom BCM5720L

When booting Proxmox 8.2.2, the cards do not come up; they only start working once the networking service is restarted.

service networking restart

Code:
[  111.785752] bnxt_en 0000:05:00.0: QPLIB: bnxt_re_is_fw_stalled: FW STALL Detected. cmdq[0xe]=0x3 waited (101936 > 100000) msec active 1
[  111.785788] bnxt_en 0000:05:00.0 bnxt_re0: Failed to modify HW QP
[  111.785804] infiniband bnxt_re0: Couldn't change QP1 state to INIT: -110
[  111.785818] infiniband bnxt_re0: Couldn't start port
[  111.785888] bnxt_en 0000:05:00.0 bnxt_re0: Failed to destroy HW QP

Code:
[  112.037741] bnxt_en 0000:05:00.0 bnxt_re0: Free MW failed: 0xffffff92
[  112.040857] infiniband bnxt_re0: Couldn't open port 1
[  112.045744] infiniband bnxt_re0: Device registered with IB successfully
[  130.142150] audit: type=1400 audit(1716376119.946:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="pve-container-mounthotplug" pid=1671 comm="apparmor_parser"
[  130.143290] audit: type=1400 audit(1716376119.946:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/lxc-start" pid=1675 comm="apparmor_parser"
[  130.143295] audit: type=1400 audit(1716376119.946:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/lxc-copy" pid=1673 comm="apparmor_parser"
[  130.143299] audit: type=1400 audit(1716376119.947:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="swtpm" pid=1677 comm="apparmor_parser"
[  130.143452] audit: type=1400 audit(1716376119.947:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=1669 comm="apparmor_parser"
[  130.145678] audit: type=1400 audit(1716376119.947:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=1669 comm="apparmor_parser"
[  130.146697] audit: type=1400 audit(1716376119.948:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/chronyd" pid=1679 comm="apparmor_parser"
[  130.146702] audit: type=1400 audit(1716376119.949:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="tcpdump" pid=1678 comm="apparmor_parser"
[  130.146705] audit: type=1400 audit(1716376119.950:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxc-container-default" pid=1668 comm="apparmor_parser"
[  130.146709] audit: type=1400 audit(1716376119.950:11): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxc-container-default-cgns" pid=1668 comm="apparmor_parser"
[  130.254771] softdog: initialized. soft_noboot=0 soft_margin=60 sec soft_panic=0 (nowayout=0)
[  130.255552] softdog:              soft_reboot_cmd=<not set> soft_active_on_boot=0
[  130.307509] RPC: Registered named UNIX socket transport module.
[  130.307936] RPC: Registered udp transport module.
[  130.308338] RPC: Registered tcp transport module.
[  130.308740] RPC: Registered tcp-with-tls transport module.
[  130.309132] RPC: Registered tcp NFSv4.1 backchannel transport module.
[  214.185841] bnxt_en 0000:05:00.1: QPLIB: bnxt_re_is_fw_stalled: FW STALL Detected. cmdq[0xe]=0x3 waited (102055 > 100000) msec active 1
[  214.185875] bnxt_en 0000:05:00.1 bnxt_re1: Failed to modify HW QP
[  214.185890] infiniband bnxt_re1: Couldn't change QP1 state to INIT: -110
[  214.185904] infiniband bnxt_re1: Couldn't start port
[  214.186034] bnxt_en 0000:05:00.1 bnxt_re1: Failed to destroy HW QP
[  214.186072] bnxt_en 0000:05:00.1 bnxt_re1: Free MW failed: 0xffffff92
[  214.186416] infiniband bnxt_re1: Couldn't open port 1
[  214.187863] infiniband bnxt_re1: Device registered with IB successfully
[  395.176066] vmbr0: port 1(enp5s0f0np0) entered blocking state
[  395.176089] vmbr0: port 1(enp5s0f0np0) entered disabled state
[  395.176123] bnxt_en 0000:05:00.0 enp5s0f0np0: entered allmulticast mode
[  395.176198] bnxt_en 0000:05:00.0 enp5s0f0np0: entered promiscuous mode
[  395.235556] bnxt_en 0000:05:00.0 enp5s0f0np0: NIC Link is Up, 1000 Mbps full duplex, Flow control: none
[  395.235587] bnxt_en 0000:05:00.0 enp5s0f0np0: EEE is not active
[  395.235600] bnxt_en 0000:05:00.0 enp5s0f0np0: FEC autoneg off encoding: None
[  395.248869] bnxt_en 0000:05:00.0 bnxt_re0: Failed to add GID: 0xffffff92
[  395.248898] infiniband bnxt_re0: add_roce_gid GID add failed port=1 index=2
[  395.248913] __ib_cache_gid_add: unable to add gid 0000:0000:0000:0000:0000:ffff:c0a8:c8fb error=-110
[  395.248932] bnxt_en 0000:05:00.0 bnxt_re0: Failed to add GID: 0xffffff92
[  395.248946] infiniband bnxt_re0: add_roce_gid GID add failed port=1 index=2
[  395.248959] __ib_cache_gid_add: unable to add gid 0000:0000:0000:0000:0000:ffff:c0a8:c8fb error=-110
[  395.249229] vmbr0: port 1(enp5s0f0np0) entered blocking state
[  395.249268] vmbr0: port 1(enp5s0f0np0) entered forwarding state
[  395.249997] bnxt_en 0000:05:00.0 bnxt_re0: Failed to add GID: 0xffffff92
[  395.250044] infiniband bnxt_re0: add_roce_gid GID add failed port=1 index=2
[  395.250061] __ib_cache_gid_add: unable to add gid 0000:0000:0000:0000:0000:ffff:c0a8:c8fb error=-110
[  395.250079] bnxt_en 0000:05:00.0 bnxt_re0: Failed to add GID: 0xffffff92
[  395.250247] infiniband bnxt_re0: add_roce_gid GID add failed port=1 index=2
[  395.250260] __ib_cache_gid_add: unable to add gid 0000:0000:0000:0000:0000:ffff:c0a8:c8fb error=-110
[  545.131939] vmbr0: the hash_elasticity option has been deprecated and is always 16
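
To make the relevant entries easier to spot among the AppArmor and RPC noise, the kernel log can be filtered for the Broadcom driver messages. The following is a minimal sketch; the helper names `filter_bnxt` and `count_fw_stalls` are made up for illustration and are not part of any tool.

```shell
#!/bin/sh
# Keep only kernel log lines from the Broadcom driver (bnxt_en),
# its RoCE companion (bnxt_re), or the InfiniBand core.
# Reads a log from stdin.
filter_bnxt() {
    grep -E 'bnxt_en|bnxt_re|infiniband'
}

# Count firmware-stall messages in a log read from stdin.
# grep -c prints 0 but exits non-zero when nothing matches,
# so the exit status is normalized with "|| true".
count_fw_stalls() {
    grep -c 'FW STALL Detected' || true
}

# Usage on the affected host (reads the live kernel log):
#   dmesg | filter_bnxt
#   dmesg | count_fw_stalls
```

In the excerpt above, the stall message appears once per port (0000:05:00.0 and 0000:05:00.1), each time after a wait of roughly 100 seconds.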




The dmesg output with the errors is attached.

Any idea where this comes from?
 

Attachments

  • DMESG_Output.txt
    12.7 KB
Good morning,

That's right.

The cause of the problem is described, among other places, here in the forum:
https://forum.proxmox.com/threads/broadcom-nics-down-after-pve-8-2-kernel-6-8.146185/

It appears to be a problem with the firmware of the Broadcom cards.
Broadcom? Draw your own conclusions ;-)

Anyway, I disabled the onboard network cards via jumper and installed Intel cards.
The server is going onto one of our ships, and I can't afford any tinkering there.
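
For readers who want to keep the Broadcom ports: the errors in the log come from the `bnxt_re` (RoCE) module, and a workaround commonly reported for this symptom is to blacklist that module so that only the plain Ethernet driver `bnxt_en` loads. This is an assumption and was not tested in this thread. The helper name and the file name below are arbitrary choices for illustration.

```shell
#!/bin/sh
# Write a modprobe blacklist entry for the bnxt_re module
# to the file given as the first argument.
write_bnxt_re_blacklist() {
    conf="$1"
    printf 'blacklist bnxt_re\n' > "$conf"
}

# Usage as root on the affected host (file name is an arbitrary choice):
#   write_bnxt_re_blacklist /etc/modprobe.d/blacklist-bnxt_re.conf
#   update-initramfs -u    # rebuild the initramfs so the entry applies at early boot
#   reboot
```

This only makes sense if RoCE/RDMA is not needed on the host, since blacklisting `bnxt_re` disables exactly that functionality.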
 
Then have fun with Intel. ;) I have seen some really nasty issues there as well, and more often than with Broadcom.
In my experience, Mellanox and Broadcom actually run the most reliably. My Broadcom P210p cards also run cleanly with the 6.8 kernel, but I usually keep the firmware fairly up to date.
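
To see which firmware a card is currently running before deciding on an update, `ethtool -i` reports it in its `firmware-version` field. A minimal sketch, with `firmware_version` being a hypothetical helper name and `enp5s0f0np0` the interface name taken from the log earlier in the thread:

```shell
#!/bin/sh
# Extract the value of the "firmware-version" field from
# `ethtool -i <interface>` output read on stdin.
firmware_version() {
    sed -n 's/^firmware-version: *//p'
}

# Usage on the affected host:
#   ethtool -i enp5s0f0np0 | firmware_version
```

The reported value can then be compared with the latest firmware package the vendor offers for that card.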
 
