VMs losing connectivity randomly

Feb 14, 2022
Hi there,

I have been facing the same issue for a long time on multiple hypervisors from different server providers (for over a year on my self-hosted servers, at Caldera Park in Milan, too).
I currently have two hypervisors hosted by Hetzner:
one running version 6.3-6, VMs only, where each VM has a dedicated public IP,
one running version 7.1-10, with 1 VM (pfSense) with a dedicated public IP and some LXC containers (connected to pfSense via a virtual bridge).
On both hypervisors, at the same moment and for basically the same amount of time, all virtual servers (I don't know whether this affects only VMs or LXC containers too, since I only have containers behind pfSense) lose their connection to the Internet.

I use UptimeRobot both to monitor websites and to receive heartbeats from the apps that are not accessible from the outside, and I'm currently being flooded with alerts because, even though the servers and the software are up, they can't report their status.

This happens almost every day and there are moments where the only question is "WTF?"

Throughout all of this I have tried both virtio and e1000 network cards but didn't notice any difference. In the Tasks list there is nothing near the "blackout" time.

dmesg -wH shows that the vmbr bridge(s) are entering the blocking state (same on both servers):
1645133489553.png

Let me know if there is any kind of log I can provide you to help understand the problem.

Thank you in advance,
Pietro
 
hi,

I currently have two hypervisors hosted by Hetzner:
one running version 6.3-6, VMs only, where each VM has a dedicated public IP,
one running version 7.1-10, with 1 VM (pfSense) with a dedicated public IP and some LXC containers (connected to pfSense via a virtual bridge).
On both hypervisors, at the same moment and for basically the same amount of time, all virtual servers (I don't know whether this affects only VMs or LXC containers too, since I only have containers behind pfSense) lose their connection to the Internet.

* are these servers clustered together? (mixing PVE6 and PVE7 nodes in a cluster is not recommended!)
if they are, you should really take a look at upgrading the PVE6 node to PVE7 [0]

shows that the vmbr bridge(s) are entering the blocking state (same on both servers):
i would also take a look at the system logs
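
one way to narrow the logs down is to pull out only the bridge port state changes and compare their timestamps with the outage alerts. a rough sketch, assuming the kernel messages look like "vmbr0: port 2(tap100i0) entered blocking state"; the function name bridge_state_changes is just a hypothetical helper, and the two-day window is an arbitrary choice:

```shell
# Hypothetical helper: keep only kernel log lines that record a bridge
# port state change on a vmbrN bridge. Reads log text on stdin.
bridge_state_changes() {
    grep -E 'vmbr[0-9]+: port [0-9]+\([^)]*\) entered (blocking|listening|learning|forwarding|disabled) state'
}

# Use the kernel messages from the journal if journalctl is available,
# otherwise fall back to the syslog file (assumed to be /var/log/syslog).
if command -v journalctl >/dev/null 2>&1; then
    journalctl -k --since "2 days ago" --no-pager 2>/dev/null \
        | bridge_state_changes || echo "no bridge state changes found"
elif [ -r /var/log/syslog ]; then
    bridge_state_changes < /var/log/syslog || echo "no bridge state changes found"
fi
```

if the state changes on both servers line up with the alert times, that points at something common to both hosts rather than at a single VM.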

Let me know if there is any kind of log I can provide you to help understand the problem.
* pveversion -v from both servers
* /var/log/syslog from both servers (you can attach them here as files or upload them somewhere)
* lspci -nnk -v | grep -i net -C 2 (to check network adapter hardware and kernel modules on both servers)
* cat /etc/network/interfaces on both nodes
* do you have firewalls enabled on node or datacenter level?
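
to save some typing, the items above could be gathered in one go on each server with something like the script below. this is only a sketch: OUT_DIR, the collect helper and the output file names are made up for illustration, and it uses dmesg -T (a one-shot dump with readable timestamps) rather than the follow mode from the original post. commands that are missing or fail are noted in the output file instead of aborting the run.

```shell
#!/bin/sh
# Illustrative sketch: gather the requested diagnostics into one directory.
# OUT_DIR and the file names are assumptions, not Proxmox conventions.
OUT_DIR="${OUT_DIR:-./pve-diag-$(uname -n)}"
mkdir -p "$OUT_DIR"

# Run a command and save stdout+stderr to $OUT_DIR/<name>.txt.
# A non-zero exit (including "command not found") is recorded, not fatal.
collect() {
    name="$1"; shift
    "$@" > "$OUT_DIR/$name.txt" 2>&1 \
        || echo "[collect] '$*' exited with status $?" >> "$OUT_DIR/$name.txt"
}

collect pveversion   pveversion -v
collect interfaces   cat /etc/network/interfaces
collect netadapters  sh -c 'lspci -nnk -v | grep -i net -C 2'
# grep exits with status 1 when nothing matches; that is recorded too.
collect bridge-dmesg sh -c 'dmesg -T | grep -i -E "vmbr|blocking|forwarding"'

# Copy the syslog if it exists and is readable.
if [ -r /var/log/syslog ]; then
    cp /var/log/syslog "$OUT_DIR/syslog"
fi
echo "Diagnostics written to $OUT_DIR"
```

run it as root on both nodes and attach the resulting directory (check the files for anything you don't want to publish, e.g. public IPs, before uploading).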

[0]: https://pve.proxmox.com/wiki/Upgrade_from_6.x_to_7.0
 
