Apr 07 00:07:44 fbo-vmh-024 kernel: UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13
Apr 07 00:07:44 fbo-vmh-024 kernel: shift exponent 64 is too large for 64-bit type 'long unsigned int'
I managed to reproduce the UBSAN warning on a machine with Broadcom NICs here, but did not get the firmware hangs:
bnxt_en 0000:c1:00.1: QPLIB: bnxt_re_is_fw_stalled: FW STALL Detected. cmdq[0xe]=0x3 waited (102364 > 100000) msec active 1
(but that might be due to a different firmware version, or because I did not send a lot of traffic over the interfaces)
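To check whether a host shows either of the two symptoms, filtering the kernel log should be enough. A minimal sketch: the function name find_bnxt_symptoms is made up for illustration, and the patterns are simply taken from the log lines quoted in this post.

```shell
# Sketch: filter kernel messages for the two symptoms discussed here.
# Reads log text on stdin, so it works with journalctl or a saved log file.
# find_bnxt_symptoms is a hypothetical helper name, not an existing tool.
find_bnxt_symptoms() {
    grep -E 'UBSAN: shift-out-of-bounds|FW STALL Detected'
}

# Example (kernel messages from the current boot):
#   journalctl -k -b | find_bnxt_symptoms
```

If only the UBSAN line shows up and no "FW STALL Detected" line, the host matches what I saw on my test machine.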
To get the UBSAN warnings I needed to enable RDMA/IB for the NIC using Broadcom's niccli [0] utility (you can download it from the Broadcom website; you need to install the utility and the DKMS package for the `sliff` driver). UBSAN is the kernel's undefined behavior sanitizer: to my knowledge these warnings are printed by the kernel to notify driver maintainers about potentially problematic code, but they do not cause problems by themselves.
Sadly I don't have much experience with Broadcom InfiniBand NICs, or with whether changing their settings (or loading the third-party sliff module) on a running system can cause problems - so please be careful, and don't do this in production!
On a hunch: maybe the BCM57416 in general, or only when it is used as the on-board NIC on the Supermicro board, has the RDMA setting enabled by default, while most other Broadcom NICs do not (we would have a larger number of problem reports if many Broadcom NICs were affected).
You could try disabling the rdma-support on the NICs:
Code:
niccli -i 1 nvm -setoption support_rdma -scope 0 -value 0
niccli -i 1 reset
For the interface index and scope setting, and for general information, please consult the Broadcom documentation. I also found the following article from the Thomas-Krenn wiki helpful (in German):
https://www.thomas-krenn.com/de/wiki/Broadcom_NICCLI_Configuration_Utility
Alternatively, you could try unloading the bnxt_re module and blocklisting it so that it does not get loaded again (as far as I can tell it is the module that provides RDMA/IB functionality for Broadcom NICs).
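A rough sketch of how the unload/blocklist variant could look on a Debian-based system. This is untested on the affected hardware. The helper name and the file name blocklist-bnxt_re.conf are arbitrary choices of mine, and the directory argument only exists so you can try the function against a scratch directory first.

```shell
# Sketch, not an official procedure: keep bnxt_re from being auto-loaded.
# blocklist_bnxt_re is a hypothetical helper; the file name is arbitrary.
blocklist_bnxt_re() {
    conf_dir="${1:-/etc/modprobe.d}"
    printf 'blacklist bnxt_re\n' > "${conf_dir}/blocklist-bnxt_re.conf"
}

# Run as root, ideally in a maintenance window:
#   modprobe -r bnxt_re          # unload now; fails if the module is in use
#   blocklist_bnxt_re            # persist the setting across reboots
#   update-initramfs -u -k all   # in case the module is loaded from the initramfs
#   lsmod | grep bnxt_re         # no output means the module is not loaded
```

Note that a "blacklist" entry only stops automatic loading via module aliases; an explicit `modprobe bnxt_re` would still load the module.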
If this does not help, I'd suggest opening a new thread here (feel free to mention me, @Stoiko Ivanov, so I do not overlook it), to keep the general thread for the 6.8 kernel less noisy.
[0] https://techdocs.broadcom.com/us/en...-software-installation-and-configuration.html