pve host goes temporarily offline but VMs don't

thaf

Feb 9, 2023
I'm seeing an odd scenario play out on a test PVE cluster. The three PVE servers each show about 50% packet loss and drop off the network for minutes at a time, yet all VMs running on them remain reachable at all times with 0% packet loss.

Here's the network config from one of them:

Code:
auto lo
iface lo inet loopback

iface enp5s0f0 inet manual

iface eno1 inet manual

iface eno2 inet manual

iface eno3 inet manual

iface eno4 inet manual

iface enp5s0f1 inet manual

auto vmbr0
iface vmbr0 inet static
    address 10.194.128.211/21
    gateway 10.194.135.254
    bridge-ports enp5s0f0
    bridge-stp off
    bridge-fd 0

auto vlan50
iface vlan50 inet static
    address 10.202.74.241/24
    vlan-raw-device enp5s0f1

The logs show no errors at all on either the server side or the switch side. A ten-minute tcpdump showed nothing out of the ordinary, apart from a stretch with no traffic to or from the bridge IP. A nearly identical setup, differing only in that it has no tagged VLAN on the corosync interface, behaves the way I would expect this one to behave. The NICs are dual-port Intel X520s.
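One thing that can be ruled out from the config above is a plain addressing mistake. Here is a minimal sketch of that check using Python's standard `ipaddress` module. The addresses are copied from the `vmbr0` and `vlan50` stanzas; treating `vlan50` as the corosync interface is an assumption, and the variable names are only illustrative.

```python
import ipaddress

# Values copied from the vmbr0 and vlan50 stanzas in the config above.
bridge = ipaddress.ip_interface("10.194.128.211/21")
gateway = ipaddress.ip_address("10.194.135.254")
# Assumption: vlan50 is the corosync interface.
corosync = ipaddress.ip_interface("10.202.74.241/24")

# The default gateway has to be on-link for the bridge address.
gateway_on_link = gateway in bridge.network

# The management and corosync subnets should not overlap,
# otherwise the host's routing for those ranges becomes ambiguous.
subnets_overlap = bridge.network.overlaps(corosync.network)

print(bridge.network, gateway_on_link, subnets_overlap)
# prints: 10.194.128.0/21 True False
```

So the gateway sits inside the /21 and the two subnets are disjoint, which suggests the addressing itself is not what is causing the drops.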

I've tried with both 7.4 and 8.0 with no change.

Does anyone have a suggestion for where I should look next to debug this?
 
Somewhat disturbingly, this problem went away with no changes performed. All it took was waiting about half a year...
 
