Proxmox 5.4 to 6.0 : Strange network issues

I personally haven't tried Linux bridging/bonding, as we need features only available with openvswitch.

Can I ask, merely out of interest, which ones exactly? Linux bridges have become more feature-rich over time, so the gap between the two technologies should be smaller now.
 
Hmm, I don't remember 100%. I think it is/was something about having a native (untagged) VLAN mixed with tagged VLANs on the same interface.
For example, the native untagged VLAN on the interface or bridge is VLAN 10, used for management, and on the same interface you use tagged VLANs for other traffic.
If I remember correctly, it wasn't working, and after a few searches it turned out that it only works with openvswitch.
 
With VLAN aware bridges that should be possible, IIRC:
See: https://support.cumulusnetworks.com...ode-to-VLAN-aware-Bridge-Mode#creating_an_svi

(always the right column)
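
For reference, here is a minimal sketch of how a native (untagged) VLAN could be mixed with tagged VLANs on one port of a VLAN-aware Linux bridge, using iproute2 at runtime. The names vmbr0 and eno1 and the VLAN IDs 10, 20 and 30 are assumptions for illustration, not taken from this thread. By default the function only prints the commands; they are executed only when RUN is set to an empty string, which needs root. Nothing here is persistent across reboots.

```shell
#!/bin/sh
# Sketch: VLAN-aware Linux bridge with one native (untagged) VLAN plus tagged VLANs.
# Bridge name, port name and VLAN IDs are placeholders, not taken from the thread.
# RUN defaults to "echo" (dry run); call with RUN= to really execute (needs root).
RUN=${RUN-echo}

setup_vlan_bridge() {
    bridge=$1; port=$2; native=$3; shift 3
    $RUN ip link add name "$bridge" type bridge vlan_filtering 1
    $RUN ip link set dev "$port" master "$bridge"
    # Drop the default VLAN 1 from the port, then make $native the PVID:
    # untagged ingress frames are put into $native, and $native leaves untagged.
    $RUN bridge vlan del dev "$port" vid 1
    $RUN bridge vlan add dev "$port" vid "$native" pvid untagged
    # Same for the bridge device itself, so a host IP on the bridge sits in $native.
    $RUN bridge vlan del dev "$bridge" vid 1 self
    $RUN bridge vlan add dev "$bridge" vid "$native" pvid untagged self
    # Remaining arguments are VLANs carried tagged on the same port.
    for vid in "$@"; do
        $RUN bridge vlan add dev "$port" vid "$vid"
    done
    $RUN ip link set dev "$port" up
    $RUN ip link set dev "$bridge" up
}

# Dry run: VLAN 10 native/untagged, VLANs 20 and 30 tagged.
setup_vlan_bridge vmbr0 eno1 10 20 30
```

After a real run, "bridge vlan show" should list VLAN 10 as "PVID Egress Untagged" on both the port and the bridge.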

I had tried it on Proxmox 5.x, and I remember the bridge-pvid setting. If I'm not wrong, it's supposed to be the primary VLAN, but I had no luck with it.
After switching to openvswitch everything worked like a charm (until the update to PVE 6.x and its new kernel).
It's maybe worth a try, but the problem some of us have should still be addressed somehow. I guess it will be hard to identify the source, though...
 
PVE 5.0 started with the 4.10 kernel, so a few features may not have been available then (I don't remember for sure anymore).

But yes, you're right. We are trying to find out in which setups and environments those problems arise. Once we can reproduce them, it shouldn't be too hard to find the main issue, especially if it works with the new OVS packages but an old kernel (knock on wood).
 
This is not an LACP link, and the cards are connected to two separate switches that are not “stacked”. The config works just fine on Proxmox 5 with VMs, CTs, and other traffic. It fails on Proxmox 6 for VM and CT traffic, but other traffic is fine with no errors. My other configurations that use LACP and balancing have the same issues with Proxmox 6. I am now using strictly Linux bridging and bonding, which works just fine for everything. I would suggest looking into kernel compatibility, as noted by other users; downgrading the kernel on Proxmox 6 seems to fix the issues.

AFAIU you mean you have two physically separated networks and only the endpoints know about the bonding. How are the other endpoints configured? Do they also use openvswitch on Proxmox?
 
I have the same problem with a similar setup.
Two non-stacked switches with STP enabled connect 5 servers; every server has 2 SFP ports, each connected to a separate switch.

Code:
auto vmbr1
iface vmbr1 inet dhcp
        bridge_ports enp8s0f0 enp8s0f1
        bridge_stp off
        bridge_fd 0


Last week I bought a new machine and wanted to switch from the Linux bridge to openvswitch, to use both links for load balancing in addition to redundancy.
This is my new openvswitch config:

Code:
auto lo
iface lo inet loopback

allow-vmbr1 bond0
iface bond0 inet manual
        ovs_bridge vmbr1
        ovs_type OVSBond
        ovs_bonds eno1 eno2
        pre-up ( ip link set dev eno1 mtu 1500 && ip link set dev eno2 mtu 1500 )
        ovs_options bond_mode=balance-slb vlan_mode=native-untagged
        mtu 1500

auto vmbr1
allow-ovs vmbr1
iface vmbr1 inet manual
        ovs_type OVSBridge
        ovs_ports bond0 vlan1 vlan81
        mtu 1500

allow-vmbr1 vlan1
iface vlan1 inet static
        ovs_type OVSIntPort
        ovs_bridge vmbr1
        ovs_options vlan_mode=access
        ovs_extra set interface ${IFACE} external-ids:iface-id=$(hostname -s)-${IFACE}-vif
        address 10.46.1.73
        netmask 255.255.255.0
        gateway 10.46.1.1
        mtu 1500

allow-vmbr1 vlan81
iface vlan81 inet static
        ovs_type OVSIntPort
        ovs_bridge vmbr1
        ovs_options tag=81
        ovs_extra set interface ${IFACE} external-ids:iface-id=$(hostname -s)-${IFACE}-vif
        address 10.46.81.73
        netmask 255.255.255.0
        mtu 1500

It mostly seems to work, apart from using VLANs in VMs, or VMs losing connectivity after some time.
I tried disabling the second switch so that only one is used network-wide, but without success.
I didn't want to install an older kernel to get this working, so I went back to the Linux bridge setup, and it seems fine.
 
Hi.

I have the same problem after upgrading to PVE 6. There were no problems with OVS on PVE 5.x.
I even tried installing the latest OVS 12.2, with the same result: my VMs lose their connection several seconds after start.
I don't have any bonds, VLANs, etc. Just a simple OVS bridge with a network adapter.

Guys! What's wrong with OVS in PVE 6? It's a VERY big problem.
 
Maybe you can try kernel 5.3 from the pvetest repository?
 
It's also available in the no-subscription repository, and it works well here.
If you have the production-grade enterprise repository enabled, you may want to wait a bit and/or talk to enterprise support, depending on your support subscription status.
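
Before and after switching kernels, it can help to confirm which kernel is actually running. The following is a small sketch that compares the output of "uname -r" against 5.3; the helper name version_ge is made up for this example, and it relies on GNU "sort -V", which is available on Debian-based systems.

```shell
#!/bin/sh
# version_ge A B: succeeds when version string A is >= B.
# Relies on GNU "sort -V" (version sort); the helper name is hypothetical.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

running=$(uname -r)
if version_ge "$running" 5.3; then
    echo "running kernel $running is 5.3 or newer"
else
    echo "running kernel $running is older than 5.3"
fi
```

Remember that a newly installed kernel only becomes the running one after a reboot.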
 
Test repo? REALLY? Good joke. It's my PRODUCTION PVEs.

Sorry, you didn't mention that it was your production system. (As you said "Even try to install latest OVS 12.2 the same", I thought it was a test server.)
I asked you about kernel 5.3 because some other users reported a similar problem with OVS, and I don't think it's fixed.
If it's really critical, maybe you can roll back to a VLAN-aware Linux bridge? Your network config seems simple.
 
Hi Guys,
I installed Proxmox for the very first time today (v6.0). After the installation, the network adapter is down. It looks like it is completely deactivated. Booting another Linux distro from a USB pen drive is no problem: the Ethernet controller works, the link is up, and communication is possible.

So the Ethernet controller is not damaged, and the cable and everything else is okay. But as soon as Proxmox has booted up, the RJ45 port LEDs turn off. Link down.

Code:
# lspci | grep Ethernet
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 09)

Is my problem related to this issue?
 
Did you use the correct interface? Try using another one.
 
Is my problem related to this issue?

So it seems that your NIC is heavily proprietary and lots of models do not work with the in-kernel driver, thanks Realtek.

You could try to install the r8168-dkms package from the "non-free" Debian repository.
The biggest issue is probably that this is not too easy to do without internet in the first place.. :/
If you can somehow get an internet connection to this server over another link, you could do it with:
Code:
apt update
apt install pve-headers
apt install r8168-dkms

Did the link work during installation? I.e., did you receive a DHCP address on the network options screen?

Which live distribution did you boot-test from the pen drive?
 
Hi Thomas,

I have tested:

Debian 7.0
Debian 8.0
Debian 9.0
Debian 10.0

(booted from a USB stick). All versions work out of the box, using the r8169 driver. I took a real photo that includes your post, to prove that this is true. The exact model number of my Ethernet controller is: Realtek RTL8111F

 
OK, thanks for your testing effort. I'll take a look at whether the Debian kernel does something different regarding Realtek devices.
We base our kernel on the Ubuntu kernel, and honestly I'd have guessed that if it does not work there, it does not work with Debian either (some kernel maintainers work on both Debian and Ubuntu, and Debian is normally stricter regarding proprietary HW)...

Can you post a
Code:
modinfo r8169
output of the Debian 10 one, if it's still available?

Also, you could try adding the "debug=16" option to that module; maybe we get some pointers from that output (reduce the number if it's too verbose)

either by:
Code:
rmmod r8169
modprobe r8169 debug=16

or adding a /etc/modprobe.d/realtek.conf with
Code:
options r8169 debug=16
 
Oh and, is the interface showing up at all in PVE:
Code:
ip -c addr

If so you could maybe try a:
Code:
ip link set dev <NIC> up
dhclient <NIC>

and see if it gets an IP/comes up this way, then maybe we "only" have a configuration issue.
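
To see at a glance whether the interface shows up and which driver is bound to it (for example r8169 versus r8168), the information can be read from sysfs. This is a small sketch; the helper name nic_driver and its optional second argument (an alternative sysfs root) are made up for this example.

```shell
#!/bin/sh
# nic_driver IFACE [SYSFS_ROOT]: print the kernel driver bound to IFACE.
# The helper name and the optional SYSFS_ROOT argument are hypothetical.
nic_driver() {
    link="${2:-/sys}/class/net/$1/device/driver"
    # Virtual interfaces (lo, bridges, ...) have no device/driver entry.
    [ -e "$link" ] || return 1
    basename "$(readlink -f "$link")"
}

# List every network interface with its driver and link state.
for path in /sys/class/net/*; do
    [ -e "$path" ] || continue
    iface=${path##*/}
    state=$(cat "$path/operstate" 2>/dev/null || echo unknown)
    echo "$iface driver=$(nic_driver "$iface" || echo none) state=$state"
done
```

If the Realtek port is listed with a driver but its state is "down", the problem is more likely on the link or configuration side than a missing driver.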
 
Did the link work during installation? I.e., did you receive a DHCP address on the network options screen?

Yes, the Ethernet port is active; the LED is blinking. After aborting the Proxmox setup, I can also see that the setup successfully obtained an address via DHCP:

20191115_132251mxkro.jpg
 