I'm seeing an odd scenario play out on a test PVE cluster. The three PVE hosts themselves show about 50% packet loss and drop off the network for minutes at a time, yet all VMs running on them remain reachable the whole time with 0% packet loss.
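For anyone who wants to reproduce the comparison, here is a rough sketch of how the host and guest loss figures can be measured side by side from another machine on the same network. It assumes the iputils `ping` summary format. `loss_pct`, `measure_loss` and `VM_IP` are names made up for this illustration; only the host address comes from the config in this post.

```shell
#!/bin/sh
# Extract the packet-loss percentage from ping's summary line on stdin.
loss_pct() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Ping a target quietly and print only its loss percentage.
# $1 = address, $2 = number of probes (default 100)
measure_loss() {
    ping -c "${2:-100}" -i 0.2 -q "$1" 2>/dev/null | loss_pct
}

HOST_IP=10.194.128.211   # bridge (vmbr0) address of the node, from this post
VM_IP=${VM_IP:-}         # assumption: set this to a guest attached to vmbr0 on the same node

# Only run the probes when a guest address has been supplied.
if [ -n "$VM_IP" ]; then
    echo "host $HOST_IP: $(measure_loss "$HOST_IP")% loss"
    echo "vm   $VM_IP: $(measure_loss "$VM_IP")% loss"
fi
```

Run it as `VM_IP=<guest address> sh script.sh`. Pinging both targets from the same external machine keeps the path through the switch identical, so any difference should come from the node itself.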
Here's the network config from one of them:
Code:
auto lo
iface lo inet loopback

iface enp5s0f0 inet manual

iface eno1 inet manual

iface eno2 inet manual

iface eno3 inet manual

iface eno4 inet manual

iface enp5s0f1 inet manual

auto vmbr0
iface vmbr0 inet static
        address 10.194.128.211/21
        gateway 10.194.135.254
        bridge-ports enp5s0f0
        bridge-stp off
        bridge-fd 0

auto vlan50
iface vlan50 inet static
        address 10.202.74.241/24
        vlan-raw-device enp5s0f1
The logs show nothing resembling an error on either the server side or the switch side. A ten-minute tcpdump showed nothing out of the ordinary, aside from no traffic to/from the bridge IP for a while. A nearly identical setup (without a tagged VLAN on the corosync interface) behaves the way I would expect this one to. The NICs are dual-port Intel X520s.
I've tried both PVE 7.4 and 8.0, with no change.
Does anyone have a suggestion for where I should look next to debug this?