Network suddenly drops

Sergio Fernandez · Feb 7, 2019

Hi all,

I'm seeing strange behaviour on my office Proxmox server. Periodically (sometimes a month apart, sometimes only days), the network interfaces go down and I lose connectivity on both the host and the VMs. Rebooting the host solves the issue until the next time.
I've been going mad looking for error messages in all the logs but found nothing. I also checked the other network elements, with no luck.
My server has two bridges. One of them has a bond of two NICs to a switch using LACP. The other has one NIC connected directly to our ISP device for Internet connectivity. I thought it could be an issue with LACP, but in fact both bridges go down, sometimes at the same time, sometimes not.

My environment is:

Code:

root@multivac:/var/log# pveversion --verbose
proxmox-ve: 5.1-32 (running kernel: 4.13.13-2-pve)
pve-manager: 5.1-41 (running version: 5.1-41/0b958203)
pve-kernel-4.13.13-2-pve: 4.13.13-32
libpve-http-server-perl: 2.0-8
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-19
qemu-server: 5.0-18
pve-firmware: 2.0-3
libpve-common-perl: 5.0-25
libpve-guest-common-perl: 2.0-14
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-17
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-3
pve-docs: 5.1-12
pve-qemu-kvm: 2.9.1-5
pve-container: 2.0-18
pve-firewall: 3.0-5
pve-ha-manager: 2.0-4
ksm-control-daemon: 1.2-2
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.1-2
lxcfs: 2.0.8-1
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
zfsutils-linux: 0.7.3-pve1~bpo9

My network interfaces config is:

Code:

root@multivac:/var/log# cat /etc/network/interfaces
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage part of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!

auto lo
iface lo inet loopback

auto eno2
iface eno2 inet static
    address  172.22.1.5
    netmask  255.255.255.0
    gateway  172.22.1.1
#Management NIC

iface eno1 inet manual

auto enp6s0f0
iface enp6s0f0 inet manual

auto enp6s0f1
iface enp6s0f1 inet manual

auto bond0
iface bond0 inet manual
    slaves enp6s0f0 enp6s0f1
    bond_miimon 100
    bond_mode 802.3ad
    bond_xmit_hash_policy layer2+3
#General LACP Bond for VMs

auto vmbr1
iface vmbr1 inet manual
    bridge_ports eno1
    bridge_stp off
    bridge_fd 0
#Internet Access for pfsense

auto vmbr2
iface vmbr2 inet manual
    bridge_ports bond0
    bridge_stp on
    bridge_fd 0
    bridge_vlan_aware yes
#VM General Purpose Bridge

Do you know where I can look for more logging? Is anybody else facing a similar issue?
Thank you very much.
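
One place that may be worth checking is the bonding driver's own status file, which the Linux kernel exposes under /proc/net/bonding/. The helper below is only a minimal sketch: the function name bond_summary is made up for illustration, and the default path assumes the bond is called bond0 as in the config above. It prints one line per slave NIC with its MII link status, so the output can be logged periodically and compared after a drop.

```shell
#!/bin/sh
# Hypothetical helper (not part of Proxmox): summarise a Linux bonding
# status file as "<slave interface> <MII status>" lines.
# The default path assumes the bond is named bond0.
bond_summary() {
    awk -F': ' '
        # Remember which slave section we are in.
        /^Slave Interface:/ { slave = $2; next }
        # The first "MII Status" line belongs to the bond itself and is
        # skipped; later ones are printed next to their slave name.
        /^MII Status:/ && slave != "" { print slave, $2; slave = "" }
    ' "${1:-/proc/net/bonding/bond0}"
}
```

Calling bond_summary with no argument reads the live file. Running it alongside the kernel log (for example journalctl -k or /var/log/kern.log, depending on how logging is set up on the host) should show whether a slave lost link around the time of the outage.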

wolfgang · Feb 8, 2019

Hi,

Your installation is quite old; please update to the current version.
What you describe sounds like a kernel problem, so the only way to get rid of it is to update your system.
Or your switch has a problem with LACP; also check whether new firmware is available.

Sergio Fernandez · Feb 8, 2019

Hi @wolfgang, I had ruled out an LACP problem because sometimes only the network interface that is not attached to the bond goes down, but I'll review it again. I'll also try upgrading next week and give you feedback. Thanks for your help.

Sergio Fernandez · Feb 15, 2019

Hi @wolfgang. I was finally able to upgrade to 5.3 successfully. Five minutes after the reboot, once the VMs had started working, the network went down, but this time I had a kernel log message telling me there was some address mess in the bond where the LACP is, which pointed me to the problem.

After trying different LACP setups and configurations, it seems my switch does in fact have a problem with LACP, and there is no firmware upgrade available. I wasn't able to get it running.

In the end I moved to a single-port setup with manual failover (as this is not a critical service), which is working nicely and reliably.
Thank you very much!
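
For readers landing here with the same problem: the final configuration was not posted, so the fragment below is only a guess at what a single-port setup could look like in /etc/network/interfaces. It assumes the bond0 stanza is removed and vmbr2 is attached directly to one of the former bond members. Which NIC was kept is an assumption, and bridge_stp is set to off here as on vmbr1, which is also an assumption.

```
auto enp6s0f0
iface enp6s0f0 inet manual

auto vmbr2
iface vmbr2 inet manual
    bridge_ports enp6s0f0
    bridge_stp off
    bridge_fd 0
    bridge_vlan_aware yes
#VM General Purpose Bridge (single port, no LACP)
```

With a layout like this, manual failover would mean changing bridge_ports to the other NIC (enp6s0f1 in the original config) and reloading the network configuration. The matching switch ports would also need to be taken out of the LACP group.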
