Hello everyone.
I have a problem with a cluster that I'm currently building from scratch. It is a small 2-node cluster installed as 7.2-1 and upgraded to 7.2-11 via the non-production Proxmox repositories. The 2 nodes are identical and have 4 networks configured:
- eno3/vmbr0 : Administration network (WebUI, ssh, etc).
- eno2 : Internode communications
- enp4s0f0 : External CEPH cluster access
- eno1/vmbr1 : VLAN network access for the VMs
The first three work fine, but eno1/vmbr1 doesn't. After a reboot, a cold start, or applying the network configuration on the nodes, eno1 doesn't come up and vmbr1 doesn't appear in the output of 'ip a':
root@abeehouse-node-1:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 18:66:da:93:e7:9c brd ff:ff:ff:ff:ff:ff
altname enp2s0f0
3: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 18:66:da:93:e7:9d brd ff:ff:ff:ff:ff:ff
altname enp2s0f1
inet 10.40.5.20/24 scope global eno2
valid_lft forever preferred_lft forever
inet6 fe80::1a66:daff:fe93:e79d/64 scope link
valid_lft forever preferred_lft forever
4: eno3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmbr0 state UP group default qlen 1000
link/ether 18:66:da:93:e7:9e brd ff:ff:ff:ff:ff:ff
altname enp3s0f0
5: eno4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 18:66:da:93:e7:9f brd ff:ff:ff:ff:ff:ff
altname enp3s0f1
6: enp4s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
link/ether a0:36:9f:e9:74:90 brd ff:ff:ff:ff:ff:ff
inet 10.40.7.20/24 scope global enp4s0f0
valid_lft forever preferred_lft forever
inet6 fe80::a236:9fff:fee9:7490/64 scope link
valid_lft forever preferred_lft forever
7: enp4s0f1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether a0:36:9f:e9:74:92 brd ff:ff:ff:ff:ff:ff
8: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 18:66:da:93:e7:9e brd ff:ff:ff:ff:ff:ff
inet 10.40.3.20/24 scope global vmbr0
valid_lft forever preferred_lft forever
inet6 fe80::1a66:daff:fe93:e79e/64 scope link
valid_lft forever preferred_lft forever
For some reason, ethtool sees eno1 as disconnected:
root@abeehouse-node-1:~# ethtool eno1
Settings for eno1:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Half 1000baseT/Full
Supported pause frame use: No
Supports auto-negotiation: Yes
Supported FEC modes: Not reported
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Half 1000baseT/Full
Advertised pause frame use: No
Advertised auto-negotiation: Yes
Advertised FEC modes: Not reported
Speed: Unknown!
Duplex: Unknown! (255)
Auto-negotiation: on
Port: Twisted Pair
PHYAD: 1
Transceiver: internal
MDI-X: Unknown
Supports Wake-on: g
Wake-on: d
Current message level: 0x000000ff (255)
drv probe link timer ifdown ifup rx_err tx_err
Link detected: no
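In case someone wants to compare the ports quickly, here is a minimal sketch that condenses the two checks above (the flags from 'ip link' and the "Link detected" line from ethtool) into one line per interface. The function name link_summary is hypothetical, it is just a convenience wrapper and not something shipped with Proxmox.

```shell
#!/bin/sh
# link_summary IFACE...: print "<iface> admin=<up|down> link=<value>"
# (hypothetical helper; only wraps 'ip' and 'ethtool')
link_summary() {
    for ifc in "$@"; do
        # Take the value of the "Link detected" line from the ethtool output.
        link=$(ethtool "$ifc" 2>/dev/null | awk -F': ' '/Link detected/ { print $2 }')
        # The UP flag inside <...> is the administrative state;
        # the pattern does not match LOWER_UP or "state UP".
        if ip -o link show "$ifc" 2>/dev/null | grep -q '[<,]UP[,>]'; then
            admin=up
        else
            admin=down
        fi
        echo "$ifc admin=$admin link=${link:-unknown}"
    done
}
```

For eno1 in the state shown above, 'link_summary eno1' would print "eno1 admin=down link=no".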
However, I can bring the interface up manually without issues, although vmbr1 is still missing:
root@abeehouse-node-1:~# ip link set eno1 up
root@abeehouse-node-1:~# ip link show eno1
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 18:66:da:93:e7:9c brd ff:ff:ff:ff:ff:ff
altname enp2s0f0
root@abeehouse-node-1:~# ip link show vmbr1
Device "vmbr1" does not exist.
Then I can start the bridge manually without any problem:
root@abeehouse-node-1:~# ifup vmbr1
root@abeehouse-node-1:~# ip link show vmbr1
9: vmbr1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 18:66:da:93:e7:9c brd ff:ff:ff:ff:ff:ff
If I start the interface and the bridge manually, the VMs work without problems. Otherwise, the VMs fail to start because vmbr1 does not exist:
bridge 'vmbr1' does not exist
kvm: -netdev type=tap,id=net0,ifname=tap100i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on: network script /var/lib/qemu-server/pve-bridge failed with status 512
TASK ERROR: start failed: QEMU exited with code 1
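For now, the manual steps above can be wrapped in a small helper. This is only a sketch of my workaround, not a fix: the function name ensure_bridge is hypothetical, and it uses nothing beyond the 'ip link' and 'ifup' calls already shown.

```shell
#!/bin/sh
# ensure_bridge BRIDGE PORT: bring PORT up, then ifup BRIDGE if it is missing.
# (hypothetical helper reproducing the manual workaround)
ensure_bridge() {
    bridge=$1
    port=$2
    # Same as the manual 'ip link set eno1 up'.
    ip link set "$port" up || return 1
    # Only call ifup when the bridge device does not exist yet.
    if ! ip link show "$bridge" >/dev/null 2>&1; then
        ifup "$bridge" || return 1
    fi
    # Succeed only if the bridge exists afterwards.
    ip link show "$bridge" >/dev/null 2>&1
}
```

Running 'ensure_bridge vmbr1 eno1' as root after boot does the same thing as the commands I typed by hand, so the VMs can start, but it does not explain why the bridge is not created automatically.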
The problem is the same on both nodes (same config). Strangely, if I configure eno1 with just an IP address and no bridge, it comes up fine at boot or after applying the configuration, so it doesn't seem to be a hardware issue.
While searching, I came across a post from 2020 describing a similar problem, linked to VLAN awareness and ifupdown2:
https://forum.proxmox.com/threads/bridge-cant-be-found-and-vm-failed-to-start.63138/
However, in my case enabling or disabling VLAN awareness doesn't change anything and, as far as I know, Proxmox 7.2 comes with ifupdown2 out of the box. I have a similar (single-node) setup at home where the VLAN bridge works without problems, so if someone can help me figure out why eno1/vmbr1 doesn't start automatically, I would be grateful.
Here is some additional information:
Network config:
root@abeehouse-node-1:~# cat /etc/network/interfaces
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!
auto lo
iface lo inet loopback
auto eno3
iface eno3 inet manual
#ADMIN ACCESS PORT
auto eno1
iface eno1 inet manual
#VLAN ACCESS PORT
auto eno2
iface eno2 inet static
address 10.40.5.20/24
#Intercom abeehouse. /!\ NE PAS UTILISER POUR LES VM /!\
iface eno4 inet manual
#NOT USED
auto enp4s0f0
iface enp4s0f0 inet static
address 10.40.7.20/24
mtu 9000
#Acces à honeycomb. /!\ NE PAS UTILISER POUR LES VM /!\
iface enp4s0f1 inet manual
#NOT USED
auto vmbr0
iface vmbr0 inet static
address 10.40.3.20/24
gateway 10.40.3.254
bridge-ports eno3
bridge-stp off
bridge-fd 0
#Acces réseau admin. /!\ NE PAS UTILISER POUR LES VM /!\
auto vmbr1
iface vmbr1 inet manual
bridge-ports eno1
bridge-stp off
bridge-fd 0
bridge-vlan-aware yes
bridge-vids 2-4094
#ACCES VLAN POUR VM
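To double-check that nothing is missing from the stanzas above, here is a minimal sketch that lists every bridge in an interfaces file along with its ports and whether it has an 'auto' line. The function name list_bridges is hypothetical, and the parsing is deliberately naive: it only looks at 'auto', 'iface' and 'bridge-ports' lines and ignores 'source' directives.

```shell
#!/bin/sh
# list_bridges FILE: print "<bridge> <auto|noauto> <ports>" for each
# iface stanza that contains a bridge-ports line (hypothetical helper).
list_bridges() {
    awk '
        # Remember every interface named on an "auto" line.
        $1 == "auto"  { for (i = 2; i <= NF; i++) auto[$i] = 1 }
        # Track which stanza we are currently in.
        $1 == "iface" { cur = $2 }
        # Record the ports of the current stanza, comma separated.
        $1 == "bridge-ports" && cur != "" {
            ports = $2
            for (i = 3; i <= NF; i++) ports = ports "," $i
            bridge[cur] = ports
            order[++n] = cur
        }
        END {
            for (k = 1; k <= n; k++) {
                b = order[k]
                printf "%s %s %s\n", b, ((b in auto) ? "auto" : "noauto"), bridge[b]
            }
        }
    ' "$1"
}
```

On the file above, 'list_bridges /etc/network/interfaces' prints "vmbr0 auto eno3" and "vmbr1 auto eno1", so both bridges are declared the same way as far as 'auto' and 'bridge-ports' are concerned.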
Kernel version:
root@abeehouse-node-1:~# uname -a
Linux abeehouse-node-1 5.15.60-2-pve #1 SMP PVE 5.15.60-2 (Tue, 04 Oct 2022 16:52:28 +0200) x86_64 GNU/Linux
Ethernet cards:
root@abeehouse-node-1:~# lspci | grep Ethernet
02:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
02:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
03:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
03:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
04:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)
04:00.1 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)