Hey! How are you people?
I've been testing SDN fault tolerance, and after struggling for a while to get everything working as intended, it's finally alive. However, I deliberately took a node down to test HA and see how long a VM takes to restart, and although the rebooted node rejoined the cluster fine, SDN is now logging this:
Code:
Apr 13 22:41:32 host3 bgpd[2806]: [TXY0T-CYY6F][EC 100663299] Can't get remote address and port: Transport endpoint is not connected
Apr 13 22:41:32 host3 bgpd[2806]: [TXY0T-CYY6F][EC 100663299] Can't get remote address and port: Transport endpoint is not connected
Apr 13 22:41:32 host3 bgpd[2806]: [H4B4J-DCW2R][EC 33554455] 10.10.1.12 [Error] bgp_read_packet error: Connection reset by peer
Apr 13 22:41:32 host3 bgpd[2806]: [H4B4J-DCW2R][EC 33554455] 10.10.1.11 [Error] bgp_read_packet error: Connection reset by peer
I can ping all three nodes without any problem, and no firewall rules are interfering with BGP... I don't know what else to check.
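A successful ping only shows ICMP reachability, while BGP sessions run over TCP port 179. Below is a minimal sketch, assuming Python 3 is available on the node, that checks whether each peer accepts a TCP connection on that port. The function names `tcp_reachable` and `check_peers` are made up for illustration. It only tests that the TCP handshake completes; it does not speak BGP, so it cannot tell why a peer resets the session afterwards.

```python
import socket

BGP_PORT = 179  # standard TCP port for BGP


def tcp_reachable(host, port=BGP_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def check_peers(peers, port=BGP_PORT, timeout=2.0):
    """Map each peer address to whether its TCP port accepted a connection."""
    return {peer: tcp_reachable(peer, port, timeout) for peer in peers}
```

Run from host3, `check_peers(["10.10.1.11", "10.10.1.12"])` would return a dict with one `True`/`False` entry per peer. A `False` there would point at something below BGP (filtering, bonding, VLAN), while `True` for both would suggest looking at the BGP daemons themselves.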
The only fix I've found is to change the node's IP, update the EVPN controller's peer list with the new IP, and then reload and apply SDN; after that it works. I'm sure someone knows what's happening here, since I'm fairly new to Proxmox and its SDN (it's marvellous by the way, I'm not looking back).
Code:
root@host3:~# cat /etc/network/interfaces
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!
auto lo
iface lo inet loopback

auto enp2s0f0
iface enp2s0f0 inet manual
        mtu 9000
#Phy Dev 1

auto enp2s0f1
iface enp2s0f1 inet manual
        mtu 9000
#Phy Dev 2

auto bond0
iface bond0 inet manual
        bond-slaves enp2s0f0 enp2s0f1
        bond-miimon 100
        bond-mode balance-xor
        bond-xmit-hash-policy layer2+3
        mtu 9000
#Phy Bond XOR Hash2+3

auto bond0.10
iface bond0.10 inet static
        address 10.10.1.13/24
        mtu 2000
#VLAN SDN VLAN

auto vmbr0
iface vmbr0 inet static
        address 16.100.15.35/16
        gateway 16.100.1.2
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 11-4094
        mtu 1500
#Management / OOB - VLAN1

auto brextnet2
iface brextnet2 inet manual
        bridge-ports vlextnet2
        bridge-stp off
        bridge-fd 0
        mtu 1500
#Gateway 2

auto brextnet3
iface brextnet3 inet manual
        bridge-ports vlextnet3
        bridge-stp off
        bridge-fd 0
        mtu 1500
#Gateway 3

auto brextnet5
iface brextnet5 inet manual
        bridge-ports vlextnet5
        bridge-stp off
        bridge-fd 0
        mtu 1500
#Gateway 5

auto brextnet1
iface brextnet1 inet manual
        bridge-ports vlextnet1
        bridge-stp off
        bridge-fd 0
        mtu 1500
#Gateway 1

auto vlextnet2
iface vlextnet2 inet manual
        mtu 1500
        vlan-id 1002
        vlan-raw-device enp2s0f0
#VLAN1002 Gateway 2

auto vlextnet3
iface vlextnet3 inet manual
        mtu 1500
        vlan-id 1003
        vlan-raw-device enp2s0f0
#VLAN1003 Gateway 3

auto vlextnet5
iface vlextnet5 inet manual
        mtu 1500
        vlan-id 1005
        vlan-raw-device enp2s0f0
#VLAN1005 Gateway 5

auto vlextnet1
iface vlextnet1 inet manual
        mtu 1500
        vlan-id 1001
        vlan-raw-device enp2s0f0
#VLAN1001 Gateway 1

source /etc/network/interfaces.d/*
Code:
root@host3:~# cat /etc/pve/sdn/*.cfg
evpn: evpnctlr
        asn 65001
        peers 10.10.1.11,10.10.1.12,10.10.1.13

subnet: evpn1-192.168.103.0-24
        vnet ar0

subnet: evpn3-10.42.0.0-16
        vnet ho1

subnet: evpn2-10.40.0.0-16
        vnet vm1

subnet: evpn3-172.16.0.0-16
        vnet rr1

subnet: evpn2-10.41.1.0-24
        vnet as1

vnet: ar0
        zone evpn1
        tag 1001

vnet: vm1
        zone evpn2
        tag 1003

vnet: as1
        zone evpn2
        tag 1002

vnet: ho1
        zone evpn3
        tag 1004

vnet: rr1
        zone evpn3
        tag 1005

evpn: evpn1
        controller evpnctlr
        vrf-vxlan 101
        disable-arp-nd-suppression 1
        ipam pve
        mac BC:24:11:13:BD:45
        mtu 1500

evpn: evpn2
        controller evpnctlr
        vrf-vxlan 102
        disable-arp-nd-suppression 1
        ipam pve
        mac BC:24:11:81:CF:70
        mtu 1500

evpn: evpn3
        controller evpnctlr
        vrf-vxlan 103
        disable-arp-nd-suppression 1
        ipam pve
        mac BC:24:11:6E:48:99
        mtu 1500
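Since the workaround involves changing the node IP and editing the controller's peer list, the two have to stay in sync. Here is a small sketch, assuming the controller section looks like the output above (a `peers` line with comma-separated addresses), that checks whether a node's interface address is listed as a peer. The helper names `parse_peers` and `node_is_peer` are hypothetical and not part of any Proxmox tooling.

```python
import ipaddress


def parse_peers(cfg_text):
    """Return the addresses from the first 'peers a,b,c' line in cfg_text."""
    for line in cfg_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "peers":
            return [ipaddress.ip_address(p) for p in parts[1].split(",")]
    return []


def node_is_peer(cfg_text, iface_address):
    """True if the interface address (e.g. '10.10.1.13/24') is in the peer list."""
    local = ipaddress.ip_interface(iface_address).ip
    return local in parse_peers(cfg_text)
```

With the configuration shown here, `node_is_peer(cfg_text, "10.10.1.13/24")` is true for host3, so the peer list and the bond0.10 address at least agree with each other.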
The core is MikroTik; aggregation is a c7000 with BL460c Gen9 blades and VC FlexFabric 10Gb/24.
I'm using balance-xor because I ran into a few problems with LACP enabled, although on paper the FlexFabric fully supports LACP.
Could the MAC addresses be getting mixed up? I'm lost here.
Thanks in advance.
Edit: As you can see, no devices other than those three nodes are in that VLAN. Should I set up a router and assign IPs using DHCP instead?