Hey All,
I have a cluster of 6 servers (pve-manager/7.1-12/b3c09de3 (running kernel: 5.13.19-6-pve))
All servers are identical hardware. I have a really bizarre issue that smells like MTU, but I've had no luck finding a solution. This is not your typical network problem.
I have about 12 VMs on the cluster. These VMs are on various VLANs provided by some Edgecore 40/100G switches.
The issue: when any two VMs on two different public VLANs try to establish a TCP connection to each other while running on the same PVE node, the connection carries no data. If I migrate either VM to any other PVE node, the issue no longer occurs. ICMP/UDP traffic to these VMs is fine, and two VMs on the same VLAN on the same PVE node have no problems. You heard that right. All virtual machines can also use "The Internet" without any issue. For each of these VLANs we use a Cisco router as the L3 gateway, so traffic must leave the PVE node from the source bridge, hit the external switches, reach the Cisco gateway, then return to the same PVE node over a different physical connection and destination bridge.
I tried the following troubleshooting:
- Disabled firewall for VM, Node, Datacenter
- Rebooted PVE Nodes
- Lowered PVE node MTU to 1350 on physical, bond, and bridge interfaces (then power-cycled VMs)
- Lowered virtual machine MTU to 1250
- Observed TCP Retransmissions from tcpdump packet captures
- Tried different guest OS (Using Oracle Linux 8 mostly)
- Verified guest OS does not run firewall
- Tried several TCP services (ssh, curl, mysql, etc.) - the connection is established but no data is ever read
- Disabled GSO via ethtool on every physical/bond/vlan/bridge interface I could find.
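For the offload steps above, I used a loop along these lines so I could hit every device in the path in one pass and also try the other `-K` features (tso, gro, tx/rx checksumming), not just gso. The device list is from my config below; it prints the commands first so they can be reviewed before applying:

```shell
# Build the ethtool invocations for every device in the path
# (physical, bond, VLAN, bridge) so they can be reviewed before
# piping to sh. Device names are from my /etc/network/interfaces.
cmds=$(for dev in eno1 eno2 enp129s0f0 enp129s0f1 enp130s0f0 enp130s0f1 \
                  bond1 bond2 bond1.55 bond2.63 vmbr55 vmbr63; do
    printf 'ethtool -K %s gso off tso off gro off tx off rx off\n' "$dev"
done)
echo "$cmds"    # review first, then apply with: echo "$cmds" | sh
```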
To be clear, the issue only occurs when traffic between two virtual machines crosses two different bridges on the same PVE node. Migrating either VM to any other PVE node demonstrates that the issue no longer occurs. I also worked with my network engineering group to verify switch interface stats, MTU, etc., and we played with several settings to see if we could fix it.
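To narrow down where the segments die, I've been capturing the same flow on both bridges at once and diffing the captures. A sketch of what I run (the guest IP and port are placeholders; vmbr55/vmbr63 stand in for the two bridges involved):

```shell
# Print a matching tcpdump command for each bridge in the path; run the
# two printed commands in parallel while reproducing the TCP failure.
# 10.55.0.10 / port 22 are placeholders for the actual test flow.
flt='host 10.55.0.10 and tcp port 22'
for br in vmbr55 vmbr63; do
    printf 'tcpdump -ni %s -w /tmp/%s.pcap %s\n' "$br" "$br" "$flt"
done
```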
Side note: I have a PVE 6.x cluster connected to the same switches and VLANs without this issue. Granted, the server hardware is not the same.
Basic Networking Info:
Code:
auto lo
iface lo inet loopback
auto eno3
iface eno3 inet manual
up /sbin/ip link set $IFACE promisc on
iface eno4 inet manual
iface eno1 inet manual
iface eno2 inet manual
iface enp129s0f0 inet manual
iface enp129s0f1 inet manual
iface enp130s0f0 inet manual
iface enp130s0f1 inet manual
auto bond0
iface bond0 inet manual
bond-slaves eno1 enp130s0f1
bond-mode active-backup
auto bond1
iface bond1 inet manual
bond-slaves enp129s0f1 eno2
bond-mode active-backup
auto bond2
iface bond2 inet manual
bond-slaves enp130s0f0 enp129s0f0
bond-mode active-backup
auto bond1.55
iface bond1.55 inet manual
vlan-raw-device bond1
auto bond1.67
iface bond1.67 inet manual
vlan-raw-device bond1
auto bond1.68
iface bond1.68 inet manual
vlan-raw-device bond1
auto bond2.63
iface bond2.63 inet manual
vlan-raw-device bond2
auto bond2.64
iface bond2.64 inet manual
vlan-raw-device bond2
auto bond2.65
iface bond2.65 inet manual
vlan-raw-device bond2
auto vmbr55
iface vmbr55 inet manual
bridge-ports bond1.55
bridge-stp on
bridge-fd 0
#VLAN55
auto vmbr67
iface vmbr67 inet manual
bridge-ports bond1.67
bridge-stp on
bridge-fd 0
#VLAN67
auto vmbr68
iface vmbr68 inet manual
bridge-ports bond1.68
bridge-stp on
bridge-fd 0
#VLAN68
auto vmbr63
iface vmbr63 inet manual
bridge-ports bond2.63
bridge-stp on
bridge-fd 0
#VLAN63
auto vmbr64
iface vmbr64 inet manual
bridge-ports bond2.64
bridge-stp on
bridge-fd 0
#VLAN64
auto vmbr65
iface vmbr65 inet manual
bridge-ports bond2.65
bridge-stp on
bridge-fd 0
#VLAN65
auto vmbr66
iface vmbr66 inet static
address x.x.x.x/27
gateway x.x.x.y
bridge-vlan-aware no
bridge-ports bond0
bridge-stp on
bridge-fd 0
#VLAN66-MGMT-INSIDE
auto vmbr9999
iface vmbr9999 inet manual
bridge-ports eno3
bridge-stp off
bridge-fd 0
bridge-vlan-aware no
up /usr/sbin/brctl setageing vmbr9999 0
up /usr/sbin/brctl setfd vmbr9999 0
#SPAN-PORT-ENO3-BORDER0-RTR
auto vmbr9009
iface vmbr9009 inet static
address 10.9.0.1/24
bridge-ports none
bridge-stp on
bridge-fd 0
bridge-vlan-aware no
#INT-9009-TEST
auto vmbr9010
iface vmbr9010 inet static
address 10.10.0.1/24
bridge-ports none
bridge-stp on
bridge-fd 0
bridge-vlan-aware no
#INT-9010-TEST
Code:
pveversion -v
proxmox-ve: 7.1-1 (running kernel: 5.13.19-6-pve)
pve-manager: 7.1-12 (running version: 7.1-12/b3c09de3)
pve-kernel-helper: 7.1-14
pve-kernel-5.13: 7.1-9
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-2-pve: 5.13.19-4
ceph-fuse: 15.2.15-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.1
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-7
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.1-5
libpve-guest-common-perl: 4.1-1
libpve-http-server-perl: 4.1-1
libpve-storage-perl: 7.1-2
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.12-1
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-2
openvswitch-switch: 2.15.0+ds1-2+deb11u1
proxmox-backup-client: 2.1.6-1
proxmox-backup-file-restore: 2.1.6-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-9
pve-cluster: 7.1-3
pve-container: 4.1-4
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-6
pve-ha-manager: 3.3-3
pve-i18n: 2.6-2
pve-qemu-kvm: 6.2.0-3
pve-xtermjs: 4.16.0-1
qemu-server: 7.1-4
smartmontools: 7.2-1
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.4-pve1
Code:
lshw -c network -businfo
Bus info Device Class Description
=============================================================
pci@0000:01:00.0 eno1 network 82599ES 10-Gigabit SFI/SFP+ Network Conn
pci@0000:01:00.1 eno2 network 82599ES 10-Gigabit SFI/SFP+ Network Conn
pci@0000:06:00.0 eno3 network I350 Gigabit Network Connection
pci@0000:06:00.1 eno4 network I350 Gigabit Network Connection
pci@0000:81:00.0 enp129s0f0 network Ethernet 10G 2P X520 Adapter
pci@0000:81:00.1 enp129s0f1 network Ethernet 10G 2P X520 Adapter
pci@0000:82:00.0 enp130s0f0 network Ethernet 10G 2P X520 Adapter
pci@0000:82:00.1 enp130s0f1 network Ethernet 10G 2P X520 Adapter
Code:
ethtool -i eno1
driver: ixgbe
version: 5.14.6
firmware-version: 0x8000095c, 19.5.12
expansion-rom-version:
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
ethtool -i enp129s0f0
driver: ixgbe
version: 5.14.6
firmware-version: 0x8000095d, 19.5.12
expansion-rom-version:
bus-info: 0000:81:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
ethtool -i enp130s0f0
driver: ixgbe
version: 5.14.6
firmware-version: 0x8000095d, 19.5.12
expansion-rom-version:
bus-info: 0000:82:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
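One more thing I plan to rule out: br_netfilter. If that module is loaded, bridge traffic gets handed to iptables/conntrack, and I could imagine the asymmetric hairpin path (out one bridge, back in through another) upsetting connection tracking even with the PVE firewall disabled. That's speculation on my part, not something I've confirmed on these nodes; quick check:

```shell
# Report whether br_netfilter is loaded and, if so, whether bridged
# traffic is being passed through iptables (a possible TCP-only
# breaker on hairpin paths). Speculative check, not a confirmed cause.
if lsmod 2>/dev/null | grep -q '^br_netfilter'; then
    state=$(sysctl net.bridge.bridge-nf-call-iptables)
else
    state='br_netfilter not loaded'
fi
echo "$state"
```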