This morning we had a major incident: all our Proxmox nodes were fenced at the same time.
In the log, this seems to be the problem:
Sep 1 10:50:14 hostname kernel: [932130.006753] show_signal_msg: 6 callbacks suppressed
Sep 1 10:50:14 hostname kernel: [932130.006757]...
They are even the same NICs (Broadcom BCM57412 NetXtreme-E 10Gb), just 2x dual-port cards.
# dmesg|grep -E -i "bnx|ens"
[ 4.118343] Broadcom NetXtreme-C/E driver bnxt_en v1.10.0
[ 4.131885] bnxt_en 0000:5e:00.0 eth0: Broadcom BCM57412 NetXtreme-E 10Gb Ethernet found at mem b8a10000, node...
Thanks for the answers.
It's very strange:
What works is this: if I reboot the server with only one port in the bond, the bridges come up and everything works fine.
After that, I can add the second port (on the other network card) to the bond, and it keeps working.
Something else we've seen:
If the bond-slaves are on the same network card, it works.
Before, we used LACP between 2 ports on 2 different network cards in the server.
Does anyone have an idea why?
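Not an answer, but one thing that may help narrow it down is comparing the LACP aggregator each slave ends up in. The bonding driver exposes this in /proc/net/bonding/<bond>. Below is a minimal sketch that summarises that file; the function name bond_summary and the default bond name bond0 are assumptions for illustration, not something from the setup above.

```shell
#!/bin/sh
# Summarise per-slave link state and aggregator ID from the bonding
# driver's status file. "bond_summary" and the default bond name "bond0"
# are hypothetical; pass another path as the first argument if needed.
bond_summary() {
    awk '
        # A new slave section starts here; remember the order of appearance.
        /^Slave Interface:/     { slave = $3; order[++n] = slave }
        # The bond-level "MII Status" comes before any slave and is skipped.
        /^MII Status:/          { if (slave != "") mii[slave] = $3 }
        # Before the first slave this is the active aggregator (indented),
        # afterwards it is the aggregator of the current slave.
        /^[ \t]*Aggregator ID:/ {
            if (slave == "") active = $3
            else agg[slave] = $3
        }
        END {
            print "active aggregator: " active
            for (i = 1; i <= n; i++) {
                s = order[i]
                note = (agg[s] == active) ? "" : " (not in active aggregator)"
                print s " mii=" mii[s] " aggregator=" agg[s] note
            }
        }
    ' "${1:-/proc/net/bonding/bond0}"
}
```

Run bond_summary on a node once with both slaves on the same card and once with the slaves on different cards. If the slaves report different aggregator IDs in the second case, the switch side is probably not treating the two ports as one LACP group, but that is only a guess until the output is compared.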
We are using 2 bond interfaces for storage (2x10Gb) and a trunk for the VMs (2x10Gb).
We want to split the trunk for the VMs into VLAN bridges so we can assign them to our VMs.
iface lo inet loopback
iface eno1 inet static