After recently upgrading to the latest version, we started seeing these errors in the kernel log on a few nodes.
We are using Open vSwitch. The only thing I found on Google that might explain the problem is this:
https://lkml.org/lkml/2020/8/10/522
Before the update we were running kernel 5.4.41-1-pve and did not have this problem.
All servers have the same network card, an Intel X520-DA2, and use the same configuration.
We use VLAN tags for guests.
The only place I have noticed network issues is between Proxmox hosts themselves.
For example, when storage replication runs I see numerous errors, and when using the Proxmox web interface we sometimes get timeouts.
pveversion -v:
Code:
proxmox-ve: 6.3-1 (running kernel: 5.4.78-2-pve)
pve-manager: 6.3-3 (running version: 6.3-3/eee5f901)
pve-kernel-5.4: 6.3-3
pve-kernel-helper: 6.3-3
pve-kernel-5.3: 6.1-6
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.41-1-pve: 5.4.41-1
pve-kernel-5.4.34-1-pve: 5.4.34-2
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.18-2-pve: 5.3.18-2
pve-kernel-4.15.18-26-pve: 4.15.18-54
pve-kernel-4.13.13-4-pve: 4.13.13-35
pve-kernel-4.13.13-1-pve: 4.13.13-31
pve-kernel-4.13.8-3-pve: 4.13.8-30
pve-kernel-4.13.8-2-pve: 4.13.8-28
pve-kernel-4.10.15-1-pve: 4.10.15-15
pve-kernel-4.10.11-1-pve: 4.10.11-9
pve-kernel-4.10.8-1-pve: 4.10.8-7
pve-kernel-4.10.5-1-pve: 4.10.5-5
pve-kernel-4.10.1-2-pve: 4.10.1-2
ceph: 14.2.16-pve1
ceph-fuse: 14.2.16-pve1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.7
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-2
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-3
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
openvswitch-switch: 2.12.0-1
proxmox-backup-client: 1.0.6-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-cluster: 6.2-1
pve-container: 3.3-2
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-7
pve-xtermjs: 4.7.0-3
pve-zsync: 2.0-4
qemu-server: 6.3-2
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.5-pve1
Example errors:
Code:
Jan 26 11:46:58 vm2 kernel: [785863.008555] ens3f1: dropped over-mtu packet: 1501 > 1500
Jan 26 11:47:06 vm2 kernel: [785871.456310] ens3f1: dropped over-mtu packet: 1501 > 1500
Jan 26 11:47:23 vm2 kernel: [785888.096369] ens3f1: dropped over-mtu packet: 1501 > 1500
Jan 26 11:47:58 vm2 kernel: [785922.911812] ens3f1: dropped over-mtu packet: 1501 > 1500
Jan 25 14:59:34 vm20 kernel: [712114.934150] enp5s0f0: dropped over-mtu packet: 1501 > 1500
Jan 25 15:53:16 vm20 kernel: [715336.654925] enp5s0f0: dropped over-mtu packet: 1504 > 1500
Jan 25 16:26:29 vm20 kernel: [717329.730660] enp5s0f0: dropped over-mtu packet: 1507 > 1500
Jan 25 16:26:37 vm20 kernel: [717337.770746] enp5s0f0: dropped over-mtu packet: 1502 > 1500
Jan 25 16:48:30 vm20 kernel: [718650.543170] enp5s0f0: dropped over-mtu packet: 1505 > 1500
Jan 25 17:29:17 vm20 kernel: [721097.347615] enp5s0f1: dropped over-mtu packet: 1505 > 1500
Jan 25 17:29:18 vm20 kernel: [721098.484832] enp5s0f1: dropped over-mtu packet: 1505 > 1500
Jan 25 17:45:52 vm20 kernel: [722091.989953] enp5s0f1: dropped over-mtu packet: 1508 > 1500
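To see how far over the MTU the dropped packets are on each interface, the log lines above can be tallied with a short script. This is only a minimal sketch: the function name `summarize_over_mtu` and the regular expression are illustrative assumptions based on the line format shown above, not part of any Proxmox or kernel tooling.

```python
import re
from collections import defaultdict

# Matches kernel lines of the form shown above, e.g.
# "... kernel: [785863.008555] ens3f1: dropped over-mtu packet: 1501 > 1500"
PATTERN = re.compile(
    r"kernel: \[\s*[\d.]+\] (?P<iface>\S+): dropped over-mtu packet: "
    r"(?P<size>\d+) > (?P<mtu>\d+)"
)


def summarize_over_mtu(lines):
    """Return {iface: {"count": drops seen, "max_excess": bytes over MTU}}."""
    summary = defaultdict(lambda: {"count": 0, "max_excess": 0})
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue  # not an over-mtu line
        excess = int(match["size"]) - int(match["mtu"])
        entry = summary[match["iface"]]
        entry["count"] += 1
        entry["max_excess"] = max(entry["max_excess"], excess)
    return dict(summary)


# A few of the lines from the log excerpt above, used as sample input.
SAMPLE = """\
Jan 26 11:46:58 vm2 kernel: [785863.008555] ens3f1: dropped over-mtu packet: 1501 > 1500
Jan 25 16:26:29 vm20 kernel: [717329.730660] enp5s0f0: dropped over-mtu packet: 1507 > 1500
Jan 25 17:45:52 vm20 kernel: [722091.989953] enp5s0f1: dropped over-mtu packet: 1508 > 1500
""".splitlines()

if __name__ == "__main__":
    for iface, stats in sorted(summarize_over_mtu(SAMPLE).items()):
        print(iface, stats)
```

In the excerpt above the excess ranges from 1 to 8 bytes, so the oversized packets are only slightly larger than the 1500-byte MTU.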
cat /etc/network/interfaces:
Code:
auto lo
iface lo inet loopback

allow-vmbr0 ens3f0
iface ens3f0 inet manual

allow-vmbr0 ens3f1
iface ens3f1 inet manual

allow-vmbr0 bond0
iface bond0 inet manual
    ovs_bridge vmbr0
    ovs_type OVSBond
    ovs_bonds ens3f0 ens3f1
    ovs_options bond_mode=balance-tcp lacp=active other_config:lacp-time=fast

allow-ovs vmbr0
iface vmbr0 inet manual
    ovs_type OVSBridge
    ovs_ports bond0 vlan9 vlan6 vlan7

# mgmt lan
allow-vmbr0 vlan9
iface vlan9 inet static
    ovs_type OVSIntPort
    ovs_bridge vmbr0
    ovs_options tag=9
    ovs_extra set interface ${IFACE} external-ids:iface-id=$(hostname -s)-${IFACE}-vif
    address x.x.x.x
    netmask 255.255.255.0
    gateway x.x.x.x

# corosync lan
allow-vmbr0 vlan6
iface vlan6 inet static
    ovs_type OVSIntPort
    ovs_bridge vmbr0
    ovs_options tag=6
    ovs_extra set interface ${IFACE} external-ids:iface-id=$(hostname -s)-${IFACE}-vif
    address y.y.y.y
    netmask 255.255.255.0

# cluster migration lan
allow-vmbr0 vlan7
iface vlan7 inet static
    ovs_type OVSIntPort
    ovs_bridge vmbr0
    ovs_options tag=7
    ovs_extra set interface ${IFACE} external-ids:iface-id=$(hostname -s)-${IFACE}-vif
    address z.z.z.z
    netmask 255.255.255.0
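The configuration above sets no MTU explicitly, so it can help to confirm what the kernel actually reports for each interface. The following is a minimal sketch that reads the standard Linux sysfs file `/sys/class/net/<iface>/mtu`. The helper name `read_mtus` is hypothetical, the interface names are taken from the configuration above, and `bond0` is left out on the assumption that an OVS bond does not appear as a kernel network device.

```python
from pathlib import Path


def read_mtus(ifaces, sysfs_root="/sys/class/net"):
    """Map each interface name to its MTU, or None if it cannot be read."""
    mtus = {}
    for name in ifaces:
        mtu_file = Path(sysfs_root) / name / "mtu"
        try:
            mtus[name] = int(mtu_file.read_text().strip())
        except (OSError, ValueError):
            # Interface missing (or unreadable) on this host.
            mtus[name] = None
    return mtus


if __name__ == "__main__":
    # Interface names from the configuration above; adjust per host
    # (vm20 uses enp5s0f0/enp5s0f1 according to its log lines).
    names = ["ens3f0", "ens3f1", "vmbr0", "vlan9", "vlan6", "vlan7"]
    for name, mtu in read_mtus(names).items():
        print(f"{name}: {mtu if mtu is not None else 'not present'}")
```

If any of the internal ports or the bridge reports a value other than 1500, that mismatch would be worth comparing against the interfaces named in the kernel messages.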