vmbr0 doesn't come up on boot

danb35

Renowned Member
Oct 31, 2015
I've had this problem since I installed PVE, but because I rarely rebooted the servers I never bothered to track it down. Now I need to reboot them more often while troubleshooting another issue, and it has become more of a hassle.

I have a three-node PVE cluster. Each node is one blade of a Dell PowerEdge C6100, 2x Xeon X5650s, a Chelsio T420-CR 10G NIC, and either 48 GB (for two of them) or 96 GB (for the other) of RAM. Each boots from a two-SSD ZFS mirror, and all the VMs are stored via NFS on a FreeNAS box.

The problem is that vmbr0 doesn't come up on boot. It comes up immediately if I log in to the console and run ifup vmbr0, but since that means a trip out to the workshop, that isn't very convenient. I'm seeing exactly the same behavior on each node of the cluster.

Here's /etc/network/interfaces on pve1 and pve2 (they're identical):
Code:
auto lo
iface lo inet loopback

iface enp3s0f4 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.3
        netmask 255.255.255.0
        gateway 192.168.1.1
        bridge_ports enp3s0f4
        bridge_stp off
        bridge_fd 0

iface eno1 inet manual

iface eno2 inet manual

iface enp3s0f4d1 inet manual

pve3 is very similar, but its NIC is designated differently. eno1 and eno2 are the onboard NICs, which I'm not using; enp3s0f4d1 is the second port on the Chelsio NIC which I'm also not using. Where should I be looking to track down why vmbr0 isn't coming up on boot?
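
Since the three nodes are supposed to be configured alike, one quick check is to list which interfaces each node marks as auto. The helper below is only a sketch: the function name auto_ifaces is made up for illustration, and it assumes a plain ifupdown-style file with no extra files pulled in via source lines.

```shell
# Print every interface named on an "auto" line of an ifupdown-style file.
# Defaults to /etc/network/interfaces when no path is given.
# Note: this does not follow "source" or "source-directory" includes.
auto_ifaces() {
    awk '$1 == "auto" { for (i = 2; i <= NF; i++) print $i }' \
        "${1:-/etc/network/interfaces}"
}

# Example on a node:
#   auto_ifaces
#   auto_ifaces /etc/network/interfaces
```

Run against the file posted above, this prints lo and vmbr0, which matches the intent that the bridge should be brought up at boot.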
 
hi,

"(they're identical):"
The assigned IPs are different, right?

"The problem is that vmbr0 doesn't come up on boot. It comes up immediately if I log in to the console and run ifup vmbr0"
You can check dmesg for any related entries.
Also check /var/log/syslog and journalctl for the time frame when the issue happened (boot time?).
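
A rough sketch of those checks as shell helpers follows. It assumes the node brings its interfaces up through networking.service (as ifupdown and ifupdown2 do on Debian) and logs to the systemd journal. The function names are made up for illustration, and the interface and driver names are the ones from this thread.

```shell
# Keep only log lines that mention the NIC driver, the bridge, or its port.
# Reads log text on stdin, so it works with dmesg or journalctl output.
net_lines() {
    grep -E 'cxgb4|vmbr0|enp3s0f4'
}

# Show what the networking unit reported during the current boot.
# Assumption: interfaces are brought up by networking.service.
show_networking_unit() {
    systemctl status networking.service --no-pager
    journalctl -b -u networking.service --no-pager
}

# Example on a node:
#   journalctl -b -k --no-pager | net_lines   # kernel messages, current boot
#   show_networking_unit
```

If the unit failed or timed out at boot, its status and journal output would be the first place that shows up.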
 
"the assigned IPs are different right?"
Yes, of course--should have mentioned that.

Some stuff in dmesg looks like it might be relevant:
Code:
root@pve1:~# dmesg | grep cxgb
[    2.894565] cxgb4 0000:03:00.4: Direct firmware load for cxgb4/t4fw.bin failed with error -2
[    2.894570] cxgb4 0000:03:00.4: unable to load firmware image cxgb4/t4fw.bin, error -2
[    2.894897] cxgb4 0000:03:00.4: Coming up as MASTER: Initializing adapter
[    3.614591] cxgb4 0000:03:00.4: Direct firmware load for cxgb4/t4-config.txt failed with error -2
[    3.962594] cxgb4 0000:03:00.4: Successfully configured using Firmware Configuration File "Firmware Default", version 0x0, computed checksum 0x0
[    4.170559] cxgb4 0000:03:00.4: max_ordird_qp 21 max_ird_adapter 5376
[    4.218552] cxgb4 0000:03:00.4: Current filter mode/mask 0x32b:0x21
[    4.305262] cxgb4 0000:03:00.4: 98 MSI-X vectors allocated, nic 16 per uld 16
[    4.305272] cxgb4 0000:03:00.4: 32.000 Gb/s available PCIe bandwidth (5 GT/s x8 link)
[    4.339948] cxgb4 0000:03:00.4 eth0: eth0: Chelsio T420-SO-CR (0000:03:00.4) 1G/10GBASE-SFP
[    4.340311] cxgb4 0000:03:00.4 eth1: eth1: Chelsio T420-SO-CR (0000:03:00.4) 1G/10GBASE-SFP
[    4.340399] cxgb4 0000:03:00.4: Chelsio T420-SO-CR rev 2
[    4.340401] cxgb4 0000:03:00.4: S/N: PT11121149, P/N: 110112440C0
[    4.340403] cxgb4 0000:03:00.4: Firmware version: 1.23.4.0
[    4.340405] cxgb4 0000:03:00.4: Bootstrap version: 255.255.255.255
[    4.340407] cxgb4 0000:03:00.4: TP Microcode version: 0.1.9.4
[    4.340409] cxgb4 0000:03:00.4: No Expansion ROM loaded
[    4.340411] cxgb4 0000:03:00.4: Serial Configuration version: 0x7071102
[    4.340413] cxgb4 0000:03:00.4: VPD version: 0x1
[    4.340415] cxgb4 0000:03:00.4: Configuration: RNIC MSI-X, Offload capable
[    4.342195] cxgb4 0000:03:00.4 enp3s0f4: renamed from eth0
[    4.354890] cxgb4 0000:03:00.4 enp3s0f4d1: renamed from eth1
[  163.666606] cxgb4 0000:03:00.4 enp3s0f4: SR module inserted
[  164.698039] cxgb4 0000:03:00.4 enp3s0f4: link up, 10Gbps, full-duplex, Tx/Rx PAUSE
[  166.297213] cxgb4 0000:03:00.4: Port 0 link down, reason: Link Down
[  166.297230] cxgb4 0000:03:00.4 enp3s0f4: link down
[  166.896905] cxgb4 0000:03:00.4 enp3s0f4: link up, 10Gbps, full-duplex, Tx/Rx PAUSE
[  166.996850] cxgb4 0000:03:00.4: Port 0 link down, reason: Link Down
[  166.996865] cxgb4 0000:03:00.4 enp3s0f4: link down
[  167.496598] cxgb4 0000:03:00.4 enp3s0f4: link up, 10Gbps, full-duplex, Tx/Rx PAUSE
The "unable to load firmware" error is also addressed here (https://forum.proxmox.com/threads/cxgb4-firmware-missing-from-pve-firmware-package.68351/), but I don't see that there was a clear resolution there.
Code:
root@pve1:~# dmesg | grep vmbr
[  163.666617] vmbr0: port 1(enp3s0f4) entered blocking state
[  163.666620] vmbr0: port 1(enp3s0f4) entered disabled state
[  164.698081] vmbr0: port 1(enp3s0f4) entered blocking state
[  164.698084] vmbr0: port 1(enp3s0f4) entered forwarding state
[  164.698191] IPv6: ADDRCONF(NETDEV_CHANGE): vmbr0: link becomes ready
[  166.297582] vmbr0: port 1(enp3s0f4) entered disabled state
[  166.896937] vmbr0: port 1(enp3s0f4) entered blocking state
[  166.896940] vmbr0: port 1(enp3s0f4) entered forwarding state
[  167.315036] vmbr0: port 1(enp3s0f4) entered disabled state
[  167.496627] vmbr0: port 1(enp3s0f4) entered blocking state
[  167.496630] vmbr0: port 1(enp3s0f4) entered forwarding state
[  259.070092] vmbr0: port 2(tap100i0) entered blocking state
[  259.070095] vmbr0: port 2(tap100i0) entered disabled state
[  259.070280] vmbr0: port 2(tap100i0) entered blocking state
[  259.070283] vmbr0: port 2(tap100i0) entered forwarding state
[  261.092649] vmbr0: port 3(fwpr101p0) entered blocking state
[  261.092652] vmbr0: port 3(fwpr101p0) entered disabled state
[  261.092815] vmbr0: port 3(fwpr101p0) entered blocking state
[  261.092817] vmbr0: port 3(fwpr101p0) entered forwarding state
[  267.820343] vmbr0: port 2(tap100i0) entered disabled state
[  268.816154] vmbr0: port 3(fwpr101p0) entered disabled state
[  268.844326] vmbr0: port 3(fwpr101p0) entered disabled state
[  282.508661] vmbr0: port 2(fwpr120p0) entered blocking state
[  282.508664] vmbr0: port 2(fwpr120p0) entered disabled state
[  282.508824] vmbr0: port 2(fwpr120p0) entered blocking state
[  282.508826] vmbr0: port 2(fwpr120p0) entered forwarding state
[  283.056640] vmbr0: port 2(fwpr120p0) entered disabled state
[  283.076775] vmbr0: port 2(fwpr120p0) entered disabled state
[  301.663844] vmbr0: port 2(fwpr120p0) entered blocking state
[  301.663847] vmbr0: port 2(fwpr120p0) entered disabled state
[  301.664006] vmbr0: port 2(fwpr120p0) entered blocking state
[  301.664008] vmbr0: port 2(fwpr120p0) entered forwarding state
[  310.796468] vmbr0: port 3(fwpr101p0) entered blocking state
[  310.796471] vmbr0: port 3(fwpr101p0) entered disabled state
[  310.796607] vmbr0: port 3(fwpr101p0) entered blocking state
[  310.796609] vmbr0: port 3(fwpr101p0) entered forwarding state
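
In the cxgb4 output above, the driver finishes initializing at about 4.35 s and the next line is not logged until 163.67 s. A small filter like the one below can pick out such a silent stretch automatically. It is only a sketch: the function name largest_gap is made up, and it assumes the default dmesg timestamp format of seconds since boot in square brackets. Whether the gap ends at the moment ifup vmbr0 was run by hand is an assumption that the logs alone do not confirm.

```shell
# Read dmesg-style lines on stdin and print the largest gap between
# consecutive timestamps as: <gap start> <gap end> <gap length> (seconds).
largest_gap() {
    # Extract the leading "[  123.456789]" timestamp from each line.
    sed -n 's/^\[ *\([0-9][0-9.]*\)].*/\1/p' |
    awk 'NR > 1 && $1 - prev > gap { gap = $1 - prev; from = prev; to = $1 }
         { prev = $1 }
         END { printf "%.3f %.3f %.3f\n", from, to, gap }'
}

# Example on a node:
#   dmesg | grep cxgb | largest_gap
```

A long gap that ends with "SR module inserted" and "link up" would suggest the port is only being opened late, which narrows the search to whatever is supposed to bring vmbr0 up during boot.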
 
