bnx2x Network Card Driver stopped working (NetXtreme II BCM57810 10 Gigabit Ethernet NIC)

Flashnet

Mar 21, 2025
Hello all.

Our 10 Gb network card recently stopped working out of the blue.

The interfaces are shown as present but won't come up:

root@:~# ip link list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vmbr0 state UP mode DEFAULT group default qlen 1000
link/ether 84:69:93:85:64:fb brd ff:ff:ff:ff:ff:ff
altname enp0s31f6
3: ens1f1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop master vmbr1 state DOWN mode DEFAULT group default qlen 1000
link/ether 8c:dc:d4:0f:98:34 brd ff:ff:ff:ff:ff:ff
altname enp1s0f1
4: ens1f0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop master vmbr2 state DOWN mode DEFAULT group default qlen 1000
link/ether 8c:dc:d4:0f:98:30 brd ff:ff:ff:ff:ff:ff
altname enp1s0f0
5: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 84:69:93:85:64:fb brd ff:ff:ff:ff:ff:ff
6: vmbr1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
link/ether 8c:dc:d4:0f:98:34 brd ff:ff:ff:ff:ff:ff
7: vmbr2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
link/ether 8c:dc:d4:0f:98:30 brd ff:ff:ff:ff:ff:ff
8: tap113i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master fwbr113i0 state UNKNOWN mode DEFAULT group default qlen 1000
link/ether 9a:8f:79:ed:d8:cf brd ff:ff:ff:ff:ff:ff
9: fwbr113i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 86:ab:c5:fe:84:b6 brd ff:ff:ff:ff:ff:ff
10: fwpr113p0@fwln113i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP mode DEFAULT group default qlen 1000
link/ether e2:23:f6:d7:4a:02 brd ff:ff:ff:ff:ff:ff
11: fwln113i0@fwpr113p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr113i0 state UP mode DEFAULT group default qlen 1000
link/ether 86:ab:c5:fe:84:b6 brd ff:ff:ff:ff:ff:ff
12: tap114i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master fwbr114i0 state UNKNOWN mode DEFAULT group default qlen 1000
link/ether 96:33:49:df:74:7e brd ff:ff:ff:ff:ff:ff
13: fwbr114i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 12:8a:1d:80:9b:ae brd ff:ff:ff:ff:ff:ff
14: fwpr114p0@fwln114i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP mode DEFAULT group default qlen 1000
link/ether 12:83:21:95:f4:14 brd ff:ff:ff:ff:ff:ff
15: fwln114i0@fwpr114p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr114i0 state UP mode DEFAULT group default qlen 1000
link/ether 12:8a:1d:80:9b:ae brd ff:ff:ff:ff:ff:ff
16: tap115i0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master fwbr115i0 state UNKNOWN mode DEFAULT group default qlen 1000
link/ether 2a:ac:02:66:86:8f brd ff:ff:ff:ff:ff:ff
23: fwbr115i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 9e:83:7a:c5:6c:09 brd ff:ff:ff:ff:ff:ff
24: fwpr115p0@fwln115i0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr0 state UP mode DEFAULT group default qlen 1000
link/ether 96:41:57:50:9c:8d brd ff:ff:ff:ff:ff:ff
25: fwln115i0@fwpr115p0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master fwbr115i0 state UP mode DEFAULT group default qlen 1000
link/ether 9e:83:7a:c5:6c:09 brd ff:ff:ff:ff:ff:ff
root@FLD-PMD801:~#

The driver seems to panic and then fail to initialize the hardware:

root@FLD-PMD801:~# journalctl -xe | grep bnx2x
Mar 21 16:32:59 FLD-PMD801 kernel: bnx2x: [bnx2x_write_dmae:611(ens1f0)]DMAE returned failure -1
Mar 21 16:32:59 FLD-PMD801 kernel: bnx2x: [bnx2x_issue_dmae_with_comp:563(ens1f0)]DMAE timeout!
Mar 21 16:32:59 FLD-PMD801 kernel: bnx2x: [bnx2x_write_dmae:611(ens1f0)]DMAE returned failure -1
Mar 21 16:32:59 FLD-PMD801 kernel: bnx2x: [bnx2x_issue_dmae_with_comp:563(ens1f0)]DMAE timeout!
Mar 21 16:32:59 FLD-PMD801 kernel: bnx2x: [bnx2x_write_dmae:611(ens1f0)]DMAE returned failure -1
Mar 21 16:32:59 FLD-PMD801 kernel: bnx2x: [bnx2x_issue_dmae_with_comp:563(ens1f0)]DMAE timeout!
Mar 21 16:32:59 FLD-PMD801 kernel: bnx2x: [bnx2x_write_dmae:611(ens1f0)]DMAE returned failure -1
Mar 21 16:32:59 FLD-PMD801 kernel: bnx2x: [bnx2x_send_final_clnup:1423(ens1f0)]FW final cleanup did not succeed
Mar 21 16:32:59 FLD-PMD801 kernel: bnx2x: [bnx2x_send_final_clnup:1426(ens1f0)]driver assert
Mar 21 16:32:59 FLD-PMD801 kernel: bnx2x: [bnx2x_panic_dump:929(ens1f0)]begin crash dump -----------------
Mar 21 16:32:59 FLD-PMD801 kernel: bnx2x: [bnx2x_panic_dump:937(ens1f0)]def_idx(0x0) def_att_idx(0x0) attn_state(0x0) spq_prod_idx(0x0) next_stats_cnt(0x0)
Mar 21 16:32:59 FLD-PMD801 kernel: bnx2x: [bnx2x_panic_dump:940(ens1f0)]DSB: attn bits(0x0) ack(0x0) id(0x0) idx(0x0)
Mar 21 16:32:59 FLD-PMD801 kernel: bnx2x: [bnx2x_panic_dump:945(ens1f0)] def (0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0) igu_sb_id(0x0) igu_seg_id(0x0) pf_id(0x0) vnic_id(0x0) vf_id(0x0) vf_valid (0x0) state(0x0)
Mar 21 16:32:59 FLD-PMD801 kernel: bnx2x 0000:01:00.0 ens1f0: bc 7.13.23
Mar 21 16:32:59 FLD-PMD801 kernel: bnx2x: [bnx2x_panic_dump:1193(ens1f0)]Idle check (1st round) ----------
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2932(ens1f0)]ERROR PCIE: ucorr_err_status is not 0.Value is 0x44000
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PCIE: corr_err_status is not 0x2000.Value is 0x20c0
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2932(ens1f0)]ERROR PCIE: Func 2 3 4: attentions register is not 0x10240902.Value is 0x4010040
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2932(ens1f0)]ERROR PCIE: Func 5 6 7: attentions register is not 0x10240902.Value is 0x4010040
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PGLUE_B: was_error for PFs 0-7 is not 0.Value is 0x3
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PGLUE_B: Completion received with error. (2:0) - PFID. (3) - VF_VALID. (9:4) - VFID. (11:10) - Error code : 0 - Completion Timeout; 1 - Unsupported Request; 2 - Completer Abort. (12) - valid bit.Value is 0x1001
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PGLUE_B: Error in master write. Address(31:0) is not 0.Value is 0xffffd254
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PGLUE_B: Error in master write. Error details register is not 0. (4:0) VQID. (23:21) - PFID. (24) - VF_VALID. (30:25) - VFID.Value is 0x20041c
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PGLUE_B: Error in master write. Error details 2nd register is not 0. (21) - was_error set; (22) - BME cleared; (23) - FID_enable cleared; (24) - VF with parent PF FLR_request or IOV_disable_request.Value is 0x2398000
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PGLUE: Error in master read address(31:0) is not 0.Value is 0xfe968000
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PGLUE_B: Error in master read Error details register is not 0. (4:0) VQID. (23:21) - PFID. (24) - VF_VALID. (30:25) - VFID.Value is 0x20481c
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PGLUE_B: Error in master read Error details 2nd register is not 0. (21) - was_error set; (22) - BME cleared; (23) - FID_enable cleared; (24) - VF with parent PF FLR_request or IOV_disable_request.Value is 0x2398000
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PXP2: Completion received with error. Error details register is not 0. (15:0) - ECHO. (28:16) - Sub Request length plus start_offset_2_0 minus 1.Value is 0x1ff8000
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PXP2: Completion received with error. Error details 2nd register is not 0. (4:0) - VQ ID. (8:5) - client ID. (9) - valid bit.Value is 0x33c
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PXP2: Interrupt status 0 is not 0.Value is 0x1800000
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PGLUE_B: Interrupt status is not 0.Value is 0x4
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING MISC: pcie_rst_b was asserted without perst assertion.Value is 0x1
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING XSEM: interrupt 0 is active.Value is 0x10000
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_idle_chk:3179(ens1f0)]failed (with 3 errors, 15 warnings)
Mar 21 16:32:59 FLD-PMD801 kernel: bnx2x: [bnx2x_panic_dump:1195(ens1f0)]Idle check (2nd round) ----------
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2932(ens1f0)]ERROR PCIE: ucorr_err_status is not 0.Value is 0x44000
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PCIE: corr_err_status is not 0x2000.Value is 0x20c0
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2932(ens1f0)]ERROR PCIE: Func 2 3 4: attentions register is not 0x10240902.Value is 0x4010040
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2932(ens1f0)]ERROR PCIE: Func 5 6 7: attentions register is not 0x10240902.Value is 0x4010040
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PGLUE_B: was_error for PFs 0-7 is not 0.Value is 0x3
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PGLUE_B: Completion received with error. (2:0) - PFID. (3) - VF_VALID. (9:4) - VFID. (11:10) - Error code : 0 - Completion Timeout; 1 - Unsupported Request; 2 - Completer Abort. (12) - valid bit.Value is 0x1001
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PGLUE_B: Error in master write. Address(31:0) is not 0.Value is 0xffffd254
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PGLUE_B: Error in master write. Error details register is not 0. (4:0) VQID. (23:21) - PFID. (24) - VF_VALID. (30:25) - VFID.Value is 0x20041c
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PGLUE_B: Error in master write. Error details 2nd register is not 0. (21) - was_error set; (22) - BME cleared; (23) - FID_enable cleared; (24) - VF with parent PF FLR_request or IOV_disable_request.Value is 0x2398000
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PGLUE: Error in master read address(31:0) is not 0.Value is 0xfe968000
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PGLUE_B: Error in master read Error details register is not 0. (4:0) VQID. (23:21) - PFID. (24) - VF_VALID. (30:25) - VFID.Value is 0x20481c
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PGLUE_B: Error in master read Error details 2nd register is not 0. (21) - was_error set; (22) - BME cleared; (23) - FID_enable cleared; (24) - VF with parent PF FLR_request or IOV_disable_request.Value is 0x2398000
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PXP2: Completion received with error. Error details register is not 0. (15:0) - ECHO. (28:16) - Sub Request length plus start_offset_2_0 minus 1.Value is 0x1ff8000
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PXP2: Completion received with error. Error details 2nd register is not 0. (4:0) - VQ ID. (8:5) - client ID. (9) - valid bit.Value is 0x33c
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PXP2: Interrupt status 0 is not 0.Value is 0x1800000
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING PGLUE_B: Interrupt status is not 0.Value is 0x4
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING MISC: pcie_rst_b was asserted without perst assertion.Value is 0x1
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_self_test_log:2939(ens1f0)]WARNING XSEM: interrupt 0 is active.Value is 0x10000
Mar 21 16:32:59 FLD-PMD801 kernel: [bnx2x_idle_chk:3179(ens1f0)]failed (with 3 errors, 15 warnings)
Mar 21 16:32:59 FLD-PMD801 kernel: bnx2x: [bnx2x_mc_assert:758(ens1f0)]Chip Revision: everest3, FW Version: 7_13_21
Mar 21 16:32:59 FLD-PMD801 kernel: bnx2x: [bnx2x_panic_dump:1201(ens1f0)]end crash dump -----------------
Mar 21 16:32:59 FLD-PMD801 kernel: bnx2x 0000:01:00.0 ens1f0: bc 7.13.23
Mar 21 16:32:59 FLD-PMD801 kernel: bnx2x: [bnx2x_nic_load:2751(ens1f0)]HW init failed, aborting
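To make the dump a bit easier to read, here is a minimal bash sketch that decodes two of the values above. The first function uses only the field layout that the "Completion received with error" log line prints itself. The second assumes that ucorr_err_status follows the standard PCIe AER Uncorrectable Error Status bit layout, which is an assumption on my part and not something the log confirms. The function names are made up for illustration.

```shell
#!/usr/bin/env bash

# Decode the PGLUE_B "Completion received with error" value, using the
# field layout printed in the log line itself.
decode_pglue_completion() {
    local val=$(( $1 ))
    local pfid=$(( val & 0x7 ))              # bits 2:0
    local vf_valid=$(( (val >> 3) & 0x1 ))   # bit 3
    local vfid=$(( (val >> 4) & 0x3f ))      # bits 9:4
    local code=$(( (val >> 10) & 0x3 ))      # bits 11:10
    local valid=$(( (val >> 12) & 0x1 ))     # bit 12
    local meaning
    case "$code" in
        0) meaning="Completion Timeout" ;;
        1) meaning="Unsupported Request" ;;
        2) meaning="Completer Abort" ;;
        *) meaning="not listed in the log" ;;
    esac
    echo "pfid=$pfid vf_valid=$vf_valid vfid=$vfid error=\"$meaning\" valid=$valid"
}

# Decode ucorr_err_status, ASSUMING it mirrors the standard PCIe AER
# Uncorrectable Error Status register (assumption, not confirmed by the log).
decode_aer_uncorr() {
    local val=$(( $1 ))
    local names=(
        [4]="Data Link Protocol Error"
        [5]="Surprise Down"
        [12]="Poisoned TLP"
        [13]="Flow Control Protocol Error"
        [14]="Completion Timeout"
        [15]="Completer Abort"
        [16]="Unexpected Completion"
        [17]="Receiver Overflow"
        [18]="Malformed TLP"
        [19]="ECRC Error"
        [20]="Unsupported Request"
    )
    local bit
    # Indexed arrays iterate their keys in ascending order.
    for bit in "${!names[@]}"; do
        if (( (val >> bit) & 1 )); then
            echo "bit $bit: ${names[$bit]}"
        fi
    done
}

decode_pglue_completion 0x1001
decode_aer_uncorr 0x44000
```

For the values in the dump, the first call reports a valid entry with error code 0 (Completion Timeout), and under the AER assumption the second reports bits 14 (Completion Timeout) and 18 (Malformed TLP).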

The driver version currently in use (modinfo bnx2x output):

filename: /lib/modules/6.8.12-8-pve/kernel/drivers/net/ethernet/broadcom/bnx2x/bnx2x.ko
firmware: bnx2x/bnx2x-e2-7.13.15.0.fw
firmware: bnx2x/bnx2x-e1h-7.13.15.0.fw
firmware: bnx2x/bnx2x-e1-7.13.15.0.fw
firmware: bnx2x/bnx2x-e2-7.13.21.0.fw
firmware: bnx2x/bnx2x-e1h-7.13.21.0.fw
firmware: bnx2x/bnx2x-e1-7.13.21.0.fw
license: GPL
description: QLogic BCM57710/57711/57711E/57712/57712_MF/57800/57800_MF/57810/57810_MF/57840/57840_MF Driver
author: Eliezer Tamir
srcversion: 9CE321FFAD4DD55B8C0EFE5
alias: pci:v000014E4d0000163Fsv*sd*bc*sc*i*
alias: pci:v000014E4d0000163Esv*sd*bc*sc*i*
alias: pci:v000014E4d0000163Dsv*sd*bc*sc*i*
alias: pci:v00001077d000016ADsv*sd*bc*sc*i*
alias: pci:v000014E4d000016ADsv*sd*bc*sc*i*
alias: pci:v00001077d000016A4sv*sd*bc*sc*i*
alias: pci:v000014E4d000016A4sv*sd*bc*sc*i*
alias: pci:v000014E4d000016ABsv*sd*bc*sc*i*
alias: pci:v000014E4d000016AFsv*sd*bc*sc*i*
alias: pci:v000014E4d000016A2sv*sd*bc*sc*i*
alias: pci:v00001077d000016A1sv*sd*bc*sc*i*
alias: pci:v000014E4d000016A1sv*sd*bc*sc*i*
(removed a couple of aliases due to character limit)
depends: libcrc32c,mdio
retpoline: Y
intree: Y
name: bnx2x
vermagic: 6.8.12-8-pve SMP preempt mod_unload modversions
sig_id: PKCS#7
signer: Build time autogenerated kernel key
sig_key: 3E:29:1E:02:41:9D:67:AE:03:01:1F:A8:C3:6A:5E:4C:E9:DE:ED:F6
sig_hashalgo: sha512
parm: num_queues: Set number of queues (default is as a number of CPUs) (int)
parm: disable_tpa: Disable the TPA (LRO) feature (int)
parm: int_mode: Force interrupt mode other than MSI-X (1 INT#x; 2 MSI) (int)
parm: dropless_fc: Pause on exhausted host ring (int)
parm: mrrs: Force Max Read Req Size (0..3) (for debug) (int)
parm: debug: Default debug msglevel (int)

Here are the current ethtool settings for one of the interfaces:

Settings for ens1f0:
    Supported ports: [ TP ]
    Supported link modes:   100baseT/Half 100baseT/Full
                            1000baseT/Full
                            10000baseT/Full
    Supported pause frame use: Symmetric Receive-only
    Supports auto-negotiation: Yes
    Supported FEC modes: Not reported
    Advertised link modes:  100baseT/Half 100baseT/Full
                            1000baseT/Full
                            10000baseT/Full
    Advertised pause frame use: Symmetric Receive-only
    Advertised auto-negotiation: Yes
    Advertised FEC modes: Not reported
    Speed: Unknown!
    Duplex: Unknown! (255)
    Auto-negotiation: on
    Port: Twisted Pair
    PHYAD: 17
    Transceiver: internal
    MDI-X: Unknown
    Supports Wake-on: d
    Wake-on: d
    Current message level: 0x00000000 (0)
    Link detected: no

Honestly, I am not sure where to go from here. It was working well until it wasn't. I'm pretty sure it's not a hardware problem, as I have an identical NIC and got the same issues after swapping them. I also tried an older kernel version, but no luck. I would be grateful if someone could help me figure out where to go from here.