I'm stumped.
I'm migrating my setup to SDN, using a VLAN zone on vmbr4, whose bridge port is the VLAN interface bond0.2920. I didn't touch the network config of the VMs and CTs except for changing the interface. At first glance everything seemed to work: ping worked, internet access was fine, and so on. After a while I noticed that I couldn't access some of my apps through the proxy; the page spins until it times out. Looking closer, the initial HTML loads, but the subsequent requests for scripts and styles stay pending. After more in-depth testing, it seems to work on some nodes and not on others. But after migrating guests around and restarting, it stopped working on some nodes where it had previously worked and started working on some where it hadn't, which was weird.
You might think I have duplicate IPs on the network, but I've checked: I shut down everything except 1 CT and 2 VMs, and they have different IPs. Since ping always works regardless of which nodes the VMs and CT are on, I tried connecting with SSH. I've tried all combinations of placement for the VMs and CT, and the results are:
ssh VM1 -> CT = sometimes works
ssh VM2 -> CT = sometimes works
ssh CT -> VM1 = sometimes works
ssh CT -> VM2 = sometimes works
ssh VM1 -> VM2 = sometimes works
ssh VM2 -> VM1 = sometimes works
Go figure. I've repeated the tests with migrations: most of the time everything works when the VMs/CT are on nodes 1, 2, 3, and most of the time it does not work when they are on nodes 4, 5, 6. When the VMs/CT are on the same node everything works fine, and since everything worked before the migration, I'm thinking it must be SDN.
But it is weird that ping works and SSH does not. Below is the output when trying to connect with SSH. The connection stalls partway through the handshake.
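Repeating the SSH matrix by hand after every migration is error-prone, so it could be scripted. The following is a minimal sketch, not something I have run on this cluster. The host names `vm1`, `vm2` and `ct` are hypothetical placeholders for the two VMs and the CT, and it assumes key-based root login between all three guests, a control machine that can reliably reach each guest, and `timeout` from coreutils inside the guests.

```shell
#!/bin/sh
# Hypothetical guest names; replace with the real addresses of VM1, VM2 and the CT.
HOSTS="${HOSTS:-vm1 vm2 ct}"
# Overridable so the loop can be dry-run with a stub instead of real ssh.
SSH_CMD="${SSH_CMD:-ssh}"

ssh_matrix() {
    for src in $HOSTS; do
        for dst in $HOSTS; do
            [ "$src" = "$dst" ] && continue
            # Log in to $src, then try an SSH login from there to $dst.
            # timeout(1) bounds the whole inner attempt in case the
            # handshake hangs after the TCP connection is established.
            if $SSH_CMD -o BatchMode=yes -o ConnectTimeout=5 "root@$src" \
                "timeout 15 ssh -o BatchMode=yes -o ConnectTimeout=5 root@$dst true" \
                </dev/null >/dev/null 2>&1; then
                echo "$src -> $dst: ok"
            else
                echo "$src -> $dst: FAILED"
            fi
        done
    done
}
```

Calling `ssh_matrix` after each migration prints one line per direction, which makes it easier to line results up against which node each guest was on at the time.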
Code:
OpenSSH_8.4p1 Debian-5, OpenSSL 1.1.1n 15 Mar 2022
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: include /etc/ssh/ssh_config.d/*.conf matched no files
debug1: /etc/ssh/ssh_config line 21: Applying options for *
debug1: Connecting to 10.27.30.20 [10.27.30.20] port 22.
debug1: Connection established.
debug1: identity file /root/.ssh/id_rsa type -1
debug1: identity file /root/.ssh/id_rsa-cert type -1
debug1: identity file /root/.ssh/id_dsa type -1
debug1: identity file /root/.ssh/id_dsa-cert type -1
debug1: identity file /root/.ssh/id_ecdsa type -1
debug1: identity file /root/.ssh/id_ecdsa-cert type -1
debug1: identity file /root/.ssh/id_ecdsa_sk type -1
debug1: identity file /root/.ssh/id_ecdsa_sk-cert type -1
debug1: identity file /root/.ssh/id_ed25519 type -1
debug1: identity file /root/.ssh/id_ed25519-cert type -1
debug1: identity file /root/.ssh/id_ed25519_sk type -1
debug1: identity file /root/.ssh/id_ed25519_sk-cert type -1
debug1: identity file /root/.ssh/id_xmss type -1
debug1: identity file /root/.ssh/id_xmss-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_8.4p1 Debian-5
debug1: Remote protocol version 2.0, remote software version OpenSSH_8.9p1 Ubuntu-3ubuntu0.3
debug1: match: OpenSSH_8.9p1 Ubuntu-3ubuntu0.3 pat OpenSSH* compat 0x04000000
debug1: Authenticating to 10.27.30.20:22 as 'root'
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: algorithm: curve25519-sha256
debug1: kex: host key algorithm: ecdsa-sha2-nistp256
debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
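Since a default ping uses small packets while the SSH handshake exchanges much larger ones, one possible next check is whether large unfragmented packets make it between guests on different nodes. This is only a guess at a cause; nothing above confirms it. A minimal sketch, assuming Linux iputils `ping` (where `-M do` sets the Don't Fragment bit) and a hypothetical helper name `probe_payload`:

```shell
#!/bin/sh
# Overridable so the loop can be dry-run with a stub instead of real ping.
PING_CMD="${PING_CMD:-ping}"

probe_payload() {
    # $1 = target IP, remaining arguments = ICMP payload sizes in bytes.
    target=$1
    shift
    for size in "$@"; do
        # One echo request with Don't Fragment set and a 2 second wait.
        if $PING_CMD -c 1 -W 2 -M do -s "$size" "$target" >/dev/null 2>&1; then
            echo "$size bytes: ok"
        else
            echo "$size bytes: no reply"
        fi
    done
}
```

For example, `probe_payload 10.27.30.20 56 1400 1464 1468 1472` run from the other guest. A payload of 1472 bytes fills a 1500-byte packet (20 bytes IP header plus 8 bytes ICMP header), and 1468 leaves room for one extra 4-byte VLAN tag. If small sizes answer and large ones do not, the path is dropping full-size frames somewhere.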
I've checked all the network configs and they look fine, and since I can ping everything I'm guessing they are fine. So what could be the problem? What can I do to better troubleshoot this?
Nodes 1, 2, 3 are Dell R320 and nodes 4, 5, 6 are R620. Below are the network configs for node 1. The configs are the same on all nodes; the only difference I can see is the IP addresses on the VLANs and bridge. I can post configs for all nodes if required.
cat /etc/network/interfaces
Code:
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto enp10s0f0np0
iface enp10s0f0np0 inet manual

auto enp10s0f1np1
iface enp10s0f1np1 inet manual

auto eno1.2911
iface eno1.2911 inet manual
#Proxmox Cluster

auto bond0
iface bond0 inet manual
    bond-slaves enp10s0f0np0 enp10s0f1np1
    bond-miimon 100
    bond-mode 802.3ad
#10G-BOND

auto bond0.2912
iface bond0.2912 inet manual
#CEPH Public

auto bond0.2913
iface bond0.2913 inet static
    address 172.29.13.101/24
#CEPH Cluster

auto bond0.2914
iface bond0.2914 inet static
    address 172.29.14.101/24
#PROXMOX_MIGRATION

auto bond0.2916
iface bond0.2916 inet manual
#Proxmox VM

auto bond0.2920
iface bond0.2920 inet manual
#sdn

auto bond0.100
iface bond0.100 inet manual
#isp

auto bond0.2917
iface bond0.2917 inet manual
#lan

auto vmbr0
iface vmbr0 inet static
    address 172.29.11.101/24
    gateway 172.29.11.1
    bridge-ports eno1.2911
    bridge-stp off
    bridge-fd 0
#Proxmox Cluster

auto vmbr1
iface vmbr1 inet manual
    bridge-ports bond0
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094
#LEGACY_VM

auto vmbr2
iface vmbr2 inet static
    address 172.29.12.101/24
    bridge-ports bond0.2912
    bridge-stp off
    bridge-fd 0
#CEPH Public

auto vmbr3
iface vmbr3 inet manual
    bridge-ports bond0.2916
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094
#Proxmox VM

auto vmbr4
iface vmbr4 inet manual
    bridge-ports bond0.2920
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094
#sdn

auto vmbr5
iface vmbr5 inet manual
    bridge-ports bond0.100
    bridge-stp off
    bridge-fd 0
#isp

auto vmbr6
iface vmbr6 inet manual
    bridge-ports bond0.2917
    bridge-stp off
    bridge-fd 0
#lan

source /etc/network/interfaces.d/*
cat /etc/pve/sdn/zones.cfg
Code:
vlan: vlanx
    bridge vmbr4
    ipam pve
    nodes pve03,pve06,pve07,pve01,pve02,pve05
cat /etc/pve/sdn/vnets.cfg
Code:
vnet: home
    zone vlanx
    alias Home network
    tag 2730
cat /etc/network/interfaces.d/sdn
Code:
#version:30

auto home
iface home
    bridge_ports vmbr4.2730
    bridge_stp off
    bridge_fd 0
    alias Home network
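With these configs, traffic on the `home` vnet passes through `home`, `vmbr4.2730`, `vmbr4`, `bond0.2920`, `bond0` and the two physical ports. Because the config files look identical across nodes, comparing the runtime state of that chain on a working node (1, 2, 3) and a failing one (4, 5, 6) may show a difference the files do not. A minimal sketch that prints the MTU of each interface from sysfs; the helper name `show_mtus` and the `SYSFS_NET` override are my own additions for illustration:

```shell
#!/bin/sh
# Standard Linux sysfs location; overridable only to allow testing.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

show_mtus() {
    # Print the current MTU of every interface name passed as an argument.
    for ifname in "$@"; do
        if [ -r "$SYSFS_NET/$ifname/mtu" ]; then
            echo "$ifname mtu $(cat "$SYSFS_NET/$ifname/mtu")"
        else
            echo "$ifname missing"
        fi
    done
}
```

On each node this would be run as `show_mtus enp10s0f0np0 enp10s0f1np1 bond0 bond0.2920 vmbr4 vmbr4.2730 home`, and the output compared between a node from each group.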