proxmox 7.0 sdn beta test

tisc0 · Sep 2, 2021

I just realized that the MAC address actually belongs to one of the interfaces of my main firewall, the one connected to vxlan_temp1.

/etc/pve/sdn/*.cfg

vnet: dmz1
  zone DMZ
  tag 1

vnet: admin1
  zone ADMIN
  tag 2

vnet: temp1
  zone TEMP
  tag 3

vnet: sensibl1
  zone SENSIBLE
  tag 5

vnet: dmzxxx
  zone DMZ
  alias DMZ VXXXX
  tag 6

vnet: ssbxxx
  zone SENSIBLE
  alias SENSIBLE XXXX
  tag 7

vxlan: DMZ
  peers 172.15.0.1,172.15.0.2,172.15.0.3,172.15.0.4
  ipam pve
  mtu 8950
  nodes node1,node2,node3,node4

vxlan: ADMIN
  peers 172.15.0.1,172.15.0.2,172.15.0.3,172.15.0.4
  ipam pve
  mtu 8950
  nodes node1,node2,node3,node4

vxlan: TEMP
  peers 172.15.0.1,172.15.0.2,172.15.0.3,172.15.0.4
  ipam pve
  mtu 8950
  nodes node1,node2,node3,node4

vxlan: SENSIBLE
  peers 172.15.0.1,172.15.0.2,172.15.0.3,172.15.0.4
  ipam pve
  mtu 8950
  nodes node1,node2,node3,node4


/etc/network/interfaces

auto lo
iface lo inet loopback

iface ens3f0 inet manual
#Public interface

iface enp0s20f0u8u3c2 inet manual # weird, right ?!

iface ens3f1 inet manual
  mtu 9000
# LAN interface

auto vmbr0
iface vmbr0 inet static
  address xx.xx.xx.xx/24
  gateway xx.xx.xx.xx
  bridge-ports ens3f0
  bridge-stp off
  bridge-fd 0
# Public bridge

auto ens3f1.2016
iface ens3f1.2016 inet static
  address 10.xx.xx.xx/28
  post-up ip route add 10.88.0.0/13 via 10.88.86.1 dev ens3f1.2016
# RPNv2-compatv1

auto ens3f1.1500
iface ens3f1.1500 inet static
  address 172.15.0.2/28
# SDN

auto ens3f1.1600
iface ens3f1.1600 inet static
  address 172.16.0.2/28
# migration

source /etc/network/interfaces.d/*

/etc/network/interfaces.d/sdn

#version:88

auto admin1
iface admin1
  bridge_ports vxlan_admin1
  bridge_stp off
  bridge_fd 0
  mtu 8950

auto dmz1
iface dmz1
  bridge_ports vxlan_dmz1
  bridge_stp off
  bridge_fd 0
  mtu 8950

auto dmzxxxx
iface dmzxxxx
  bridge_ports vxlan_dmzxxxx
  bridge_stp off
  bridge_fd 0
  mtu 8950
  alias DMZ XXXX

auto sensibl1
iface sensibl1
  bridge_ports vxlan_sensibl1
  bridge_stp off
  bridge_fd 0
  mtu 8950

auto ssbxxxx
iface ssbxxxx
  bridge_ports vxlan_ssbxxxx
  bridge_stp off
  bridge_fd 0
  mtu 8950
  alias SENSIBLE XXXX

auto temp1
iface temp1
  address 10.25.220.11/16
  bridge_ports vxlan_temp1
  bridge_stp off
  bridge_fd 0
  mtu 8950

auto vxlan_admin1
iface vxlan_admin1
  vxlan-id 2
  vxlan_remoteip 172.15.0.1
  vxlan_remoteip 172.15.0.3
  vxlan_remoteip 172.15.0.4
  mtu 8950

auto vxlan_dmz1
iface vxlan_dmz1
  vxlan-id 1
  vxlan_remoteip 172.15.0.1
  vxlan_remoteip 172.15.0.3
  vxlan_remoteip 172.15.0.4
  mtu 8950

auto vxlan_dmzxxxx
iface vxlan_dmzxxxx
  vxlan-id 6
  vxlan_remoteip 172.15.0.1
  vxlan_remoteip 172.15.0.3
  vxlan_remoteip 172.15.0.4
  mtu 8950

auto vxlan_sensibl1
iface vxlan_sensibl1
  vxlan-id 5
  vxlan_remoteip 172.15.0.1
  vxlan_remoteip 172.15.0.3
  vxlan_remoteip 172.15.0.4
  mtu 8950

auto vxlan_ssbxxxx
iface vxlan_ssbxxxx
  vxlan-id 7
  vxlan_remoteip 172.15.0.1
  vxlan_remoteip 172.15.0.3
  vxlan_remoteip 172.15.0.4
  mtu 8950

auto vxlan_temp1
iface vxlan_temp1
  vxlan-id 3
  vxlan_remoteip 172.15.0.1
  vxlan_remoteip 172.15.0.3
  vxlan_remoteip 172.15.0.4
  mtu 8950

Thanks !

tisc0 · Sep 2, 2021

Look, I just deactivated the VM's vNIC, deleted its MAC address to get a new one, then re-activated it => no more bad log lines or weird behavior.
Again, I'm curious to understand what happened.

And again, thanks for your help !

edit: and actually the symptom just came back with another MAC address, on another vxlan, but on the same VM:

Sep 3 00:00:56 node_name kernel: [1423482.080938] vxlan_dmzxxxx: 66:45:62:2b:f5:3e migrated from 172.15.0.3 to 172.15.0.1

spirit · Sep 3, 2021

I just realized that the MAC address actually belongs to one of the interfaces of my main firewall, the one connected to vxlan_temp1.

Is your firewall a VM plugged into a VXLAN vnet?

Code:

auto temp1
iface temp1
  address 10.25.220.11/16
  bridge_ports vxlan_temp1
  bridge_stp off
  bridge_fd 0
  mtu 8950

That seems strange here: you shouldn't have any IP address on a VXLAN vnet bridge (it's only for a layer-2 tunnel, with no routing).
I have checked the code, and I don't set any IP address on a vnet when the vnet is in a VXLAN zone.
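
To spot this kind of stray address, here is a minimal Python sketch that scans the text of /etc/network/interfaces.d/sdn and lists bridges that enslave a vxlan_* port and also carry an `address` line. The function names are made up for illustration, and the parser only handles the simple stanza layout shown in this thread; it is not part of Proxmox.

```python
def parse_stanzas(text):
    """Map each `iface <name>` stanza to a list of (option, value) pairs."""
    stanzas = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        words = line.split(None, 1)
        if words[0] == "iface" and len(words) > 1:
            current = words[1].split()[0]
            stanzas[current] = []
        elif words[0] in ("auto", "source"):
            current = None  # these lines never belong to a stanza body
        elif current is not None:
            stanzas[current].append((words[0], words[1] if len(words) > 1 else ""))
    return stanzas


def vxlan_bridges_with_address(text):
    """Return bridges that have a vxlan_* port and an IP address configured."""
    flagged = []
    for name, options in parse_stanzas(text).items():
        opts = dict(options)
        ports = opts.get("bridge_ports", opts.get("bridge-ports", "")).split()
        if "address" in opts and any(p.startswith("vxlan_") for p in ports):
            flagged.append(name)
    return flagged


# Two stanzas copied from the file posted above.
SAMPLE = """\
auto admin1
iface admin1
  bridge_ports vxlan_admin1
  bridge_stp off
  bridge_fd 0
  mtu 8950

auto temp1
iface temp1
  address 10.25.220.11/16
  bridge_ports vxlan_temp1
  bridge_stp off
  bridge_fd 0
  mtu 8950
"""

print(vxlan_bridges_with_address(SAMPLE))  # ['temp1']
```

On a real node you would pass the file content instead of SAMPLE, for example `open("/etc/network/interfaces.d/sdn").read()`.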

tisc0 · Sep 3, 2021

Is your firewall a VM plugged into a VXLAN vnet?

Yes, my firewall is a VM plugged into multiple VXLAN vnets.
I re-applied the SDN conf cluster-wide. It worked very well almost all day, but not anymore for the last few minutes: I'm getting the migration log lines on 2 nodes now 0_o

That seems strange here: you shouldn't have any IP address on a VXLAN vnet bridge.

That's extremely strange indeed, I don't remember manually hacking this file!

It switched to another VXLAN in the logs, and then came back to temp1. It's definitely linked to the network instability in my VMs that use my firewall as NAT gateway.

spirit · Sep 4, 2021

getting the migration log lines on 2 nodes now 0_o

Can you give me more details?

The error message "<mac> migrated from node1 to node2" means that the kernel is seeing the MAC address on another node. Of course, that shouldn't happen if this is the MAC address of a VM and the VM is not on that node... unless you have a network loop or something like that. (I don't know your firewall, but verify that it is not bridging 2 VXLANs together at layer 2 or something like that.)

tisc0 · Sep 6, 2021

Hi @spirit
Thanks for your support. I really don't know what to think anymore

The error message "<mac> migrated from node1 to node2" means that the kernel is seeing the MAC address on another node. Of course, that shouldn't happen if this is the MAC address of a VM and the VM is not on that node...

It changed again.
To rephrase it shortly: my VM (the gateway) is on node1, and I get logs (on node3 this time, but there's some randomness in this) saying that one of my VM's MACs (which one changes too, but rarely) is being migrated from node1 to node2 (those haven't changed, actually)!

Actually, I'm no longer sure I can relate this behavior in the logs to the network connectivity I get in some VMs using my gateway. I do get strange things on that very interface (vxlan_temp1), though, which seem to appear only in VMs running CentOS.

Lost in translation,
Keep digging,
Amazing sky
(Haiku of hope)

Edit: the bad log lines just appeared on node4

Sep 6 10:03:45 node4 kernel: [1724893.095762] vxlan_temp1: ce:00:27:c4:43:4f migrated from 172.15.0.3 to 172.15.0.1
Sep 6 10:03:45 node4 kernel: [1724893.170885] vxlan_temp1: ce:00:27:c4:43:4f migrated from 172.15.0.1 to 172.15.0.3
Sep 6 10:03:45 node4 kernel: [1724893.170897] vxlan_temp1: ce:00:27:c4:43:4f migrated from 172.15.0.3 to 172.15.0.3 # !!

spirit · Sep 6, 2021

tisc0 said:
Edit: the bad log lines just appeared on node4
Sep 6 10:03:45 node4 kernel: [1724893.095762] vxlan_temp1: ce:00:27:c4:43:4f migrated from 172.15.0.3 to 172.15.0.1
Sep 6 10:03:45 node4 kernel: [1724893.170885] vxlan_temp1: ce:00:27:c4:43:4f migrated from 172.15.0.1 to 172.15.0.3
Sep 6 10:03:45 node4 kernel: [1724893.170897] vxlan_temp1: ce:00:27:c4:43:4f migrated from 172.15.0.3 to 172.15.0.3 # !!

Looking at the kernel code:
https://github.com/torvalds/linux/blob/master/drivers/net/vxlan.c

This message is logged when the current node (here node4) receives a packet with the source MAC address "ce:00:27:c4:43:4f" from one VXLAN endpoint (172.15.0.3) while that MAC was last learned behind another one (172.15.0.1), or vice versa.

Are you sure that you don't have any duplicate MAC addresses in your VMs?

You can list the MAC addresses of each vnet with "bridge fdb show dev <yourvnet>" on each node. Just check whether you have a duplicate address.
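
To see at a glance which MACs are flapping and between which endpoints, here is a minimal Python sketch that summarizes "migrated from" kernel log lines. The regular expression is only an assumption based on the log lines quoted in this thread, and `summarize_migrations` is a name invented for the example.

```python
import re
from collections import defaultdict

# Matches e.g. "vxlan_temp1: ce:00:27:c4:43:4f migrated from 172.15.0.3 to 172.15.0.1"
MIGRATED = re.compile(
    r"(?P<dev>\S+): (?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}) "
    r"migrated from (?P<old>\S+) to (?P<new>\S+)"
)


def summarize_migrations(lines):
    """Count migration events per (device, MAC) and collect the endpoints involved."""
    summary = defaultdict(lambda: {"count": 0, "vteps": set()})
    for line in lines:
        m = MIGRATED.search(line)
        if m is None:
            continue
        entry = summary[(m.group("dev"), m.group("mac"))]
        entry["count"] += 1
        entry["vteps"].update((m.group("old"), m.group("new")))
    return dict(summary)


# The three node4 lines quoted above.
LOG = [
    "Sep 6 10:03:45 node4 kernel: [1724893.095762] vxlan_temp1: ce:00:27:c4:43:4f migrated from 172.15.0.3 to 172.15.0.1",
    "Sep 6 10:03:45 node4 kernel: [1724893.170885] vxlan_temp1: ce:00:27:c4:43:4f migrated from 172.15.0.1 to 172.15.0.3",
    "Sep 6 10:03:45 node4 kernel: [1724893.170897] vxlan_temp1: ce:00:27:c4:43:4f migrated from 172.15.0.3 to 172.15.0.3",
]

for (dev, mac), info in summarize_migrations(LOG).items():
    print(dev, mac, info["count"], sorted(info["vteps"]))
```

A MAC that keeps bouncing between the same two endpoints suggests it is being sourced from both nodes, which fits a duplicate MAC or a loop.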

spirit · Sep 6, 2021

Can you give me the result of "brctl show temp1" for the vnet, on the different nodes?

tisc0 · Sep 6, 2021

Hi @spirit,

Below are the outputs of the commands you asked for, but the problem got solved when, indeed, I renewed the MAC address of my VM connected to vxlan_temp1 (which I had already done last week!).
The question is: how could it get duplicated again?
=> As far as I remember, I've been playing with migrating this VM these days. Is it possible there is a bug in this action?
I actually remember that the first time I tried, the Proxmox web UI failed, saying the VM was already running on the node I was targeting. The following attempts migrated the VM properly.

Since I'll do other tests like this tomorrow, where should I check in Proxmox to see if some ghosts are in the shell?

Thanks

Node w

root@www:~# brctl show temp1
bridge name     bridge id               STP enabled     interfaces
temp1           8000.5644ba92f880       no              fwpr148p0
                                                        fwpr168p0
                                                        tap166i1
                                                        tap171i1
                                                        vxlan_temp1
root@www:~# bridge fdb show dev temp1
33:33:00:00:00:01 self permanent
01:00:5e:00:00:6a self permanent
33:33:00:00:00:6a self permanent
01:00:5e:00:00:01 self permanent
33:33:ff:92:f8:80 self permanent
56:44:ba:92:f8:80 vlan 1 master temp1 permanent
56:44:ba:92:f8:80 master temp1 permanent

Node x

root@xxx:~# brctl show temp1
bridge name     bridge id               STP enabled     interfaces
temp1           8000.52d1e18cae64       no              fwpr108p0
                                                        fwpr173p0
                                                        tap107i1
                                                        tap162i1
                                                        tap167i1
                                                        tap997i0
                                                        vxlan_temp1
root@xxx:~# bridge fdb show dev temp1
33:33:00:00:00:01 self permanent
01:00:5e:00:00:6a self permanent
33:33:00:00:00:6a self permanent
01:00:5e:00:00:01 self permanent
33:33:ff:8c:ae:64 self permanent
52:d1:e1:8c:ae:64 vlan 1 master temp1 permanent
52:d1:e1:8c:ae:64 master temp1 permanent

Node y

root@yyy:~# brctl show temp1
bridge name     bridge id               STP enabled     interfaces
temp1           8000.b241562cab15       no              fwpr105p0
                                                        fwpr106p1
                                                        fwpr110p0
                                                        fwpr112p0
                                                        fwpr113p0
                                                        fwpr121p1
                                                        fwpr124p1
                                                        fwpr126p1
                                                        fwpr137p0
                                                        fwpr150p1
                                                        fwpr151p1
                                                        fwpr163p0
                                                        tap143i1
                                                        tap144i3
                                                        tap171i1
                                                        veth102i1
                                                        vxlan_temp1
root@yyy:~# bridge fdb show dev temp1
33:33:00:00:00:01 self permanent
01:00:5e:00:00:6a self permanent
33:33:00:00:00:6a self permanent
01:00:5e:00:00:01 self permanent
33:33:ff:2c:ab:15 self permanent
b2:41:56:2c:ab:15 vlan 1 master temp1 permanent
b2:41:56:2c:ab:15 master temp1 permanent

Node z

root@zzz:~# brctl show temp1
bridge name     bridge id               STP enabled     interfaces
temp1           8000.d6807c7a11ca       no              fwpr101p1
                                                        fwpr111p1
                                                        fwpr114p0
                                                        fwpr115p0
                                                        fwpr116p0
                                                        fwpr138p1
                                                        fwpr140p0
                                                        fwpr141p1
                                                        fwpr142p1
                                                        fwpr152p0
                                                        fwpr153p0
                                                        fwpr157p1
                                                        fwpr164p1
                                                        fwpr256p1
                                                        tap100i0
                                                        tap145i0
                                                        tap145i1
                                                        tap165i0
                                                        tap8888i1
                                                        tap8889i0
                                                        veth999i0
                                                        vxlan_temp1
root@zzz:~# bridge fdb show dev temp1
33:33:00:00:00:01 self permanent
01:00:5e:00:00:6a self permanent
33:33:00:00:00:6a self permanent
01:00:5e:00:00:01 self permanent
33:33:ff:7a:11:ca self permanent
d6:80:7c:7a:11:ca vlan 1 master temp1 permanent
d6:80:7c:7a:11:ca master temp1 permanent
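
The bridge port lists above can be cross-checked mechanically. Here is a minimal Python sketch that extracts the guest ID from each port name and reports guests attached to the bridge on more than one node. It assumes the usual Proxmox naming (tap<vmid>i<n> for VMs, veth<ctid>i<n> for containers, fwpr<vmid>p<n> for firewall bridges); the function name is invented for the example.

```python
import re
from collections import defaultdict

# Assumed Proxmox port naming: tap<id>i<n>, veth<id>i<n>, fwpr<id>p<n>
GUEST_PORT = re.compile(r"^(?:tap|veth|fwpr)(\d+)[ip]\d+$")


def guests_on_multiple_nodes(ports_by_node):
    """Return {guest_id: [nodes]} for guests with bridge ports on more than one node."""
    nodes_by_guest = defaultdict(set)
    for node, ports in ports_by_node.items():
        for port in ports:
            m = GUEST_PORT.match(port)
            if m:
                nodes_by_guest[int(m.group(1))].add(node)
    return {g: sorted(n) for g, n in nodes_by_guest.items() if len(n) > 1}


# Port lists copied from the "brctl show temp1" outputs above.
PORTS_BY_NODE = {
    "w": "fwpr148p0 fwpr168p0 tap166i1 tap171i1 vxlan_temp1".split(),
    "x": "fwpr108p0 fwpr173p0 tap107i1 tap162i1 tap167i1 tap997i0 vxlan_temp1".split(),
    "y": ("fwpr105p0 fwpr106p1 fwpr110p0 fwpr112p0 fwpr113p0 fwpr121p1 "
          "fwpr124p1 fwpr126p1 fwpr137p0 fwpr150p1 fwpr151p1 fwpr163p0 "
          "tap143i1 tap144i3 tap171i1 veth102i1 vxlan_temp1").split(),
    "z": ("fwpr101p1 fwpr111p1 fwpr114p0 fwpr115p0 fwpr116p0 fwpr138p1 "
          "fwpr140p0 fwpr141p1 fwpr142p1 fwpr152p0 fwpr153p0 fwpr157p1 "
          "fwpr164p1 fwpr256p1 tap100i0 tap145i0 tap145i1 tap165i0 "
          "tap8888i1 tap8889i0 veth999i0 vxlan_temp1").split(),
}

print(guests_on_multiple_nodes(PORTS_BY_NODE))  # {171: ['w', 'y']}
```

With these lists, tap171i1 shows up on both node w and node y. That may hint at a guest running twice, although the thread does not confirm which VM was involved, and a guest can legitimately appear on two nodes for a short time during a live migration.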

spirit · Sep 6, 2021

I actually remember that the first time I tried, the Proxmox web UI failed, saying the VM was already running on the node I was targeting. The following attempts migrated the VM properly.

Since I'll do other tests like this tomorrow, where should I check in Proxmox, to see if some ghosts are in the shell ?

Mmmm... maybe a ghost qemu process (it could be in a sleeping state, but maybe its MAC is still registered; I really don't know).
You could check with "ps -aux|grep kvm" and see whether the same VM name shows up on multiple nodes.

tisc0 · Sep 7, 2021

spirit said:
Mmmm... maybe a ghost qemu process (it could be in a sleeping state, but maybe its MAC is still registered; I really don't know).
You could check with "ps -aux|grep kvm" and see whether the same VM name shows up on multiple nodes.

Simply that... I'm really a Proxmox newb33!
Indeed, a process was up for that VM on 2 nodes!

FYI, it was related, and fixing it solved the instabilities in my SDN network vxlan_temp1.
Thanks a lot for your help!

PS: shouldn't all the interfaces have shown up in the logs with migration problems? This morning, another VXLAN was in the logs, after I updated the MAC address of temp1 yesterday.

talos · Sep 26, 2021

I set up my new network today, and I wanted to use VXLAN at home to learn more about VXLAN and Proxmox SDN.

So I got a new switch and rebuilt my network today. My configuration looks like this:

Aruba 2930F: the default VLAN is jumbo, with an MTU of 1550 and a max-frame-size of 1568
Proxmox 7.0 with its raw interface and bridge interface configured with an MTU of 1550. This interface is connected at 10G to my switch in my default VLAN.

Communication between Proxmox and my switch works fine. I set up one VXLAN connection between Proxmox and my switch and set it on Proxmox with an MTU of 1500. The switch automatically shows this VXLAN tunnel with an MTU of 1500, as it is calculated from the MTU of the default VLAN (which holds the tunnel source address). After testing a bit, I found that the MTU in my VM or CT is not fully usable: a ping with "do not fragment" set caps at around 1458 bytes, but it should be 1472 bytes (all IPv4). Trying to do an apt update from my CT or VM fails because packet sizes are getting too big. Lowering the MTU in my CT/VM to around 1460~1480 fixes this issue, but that is not my goal, because my internet connection has an MTU of 1500 and I'd like to keep it that way in my CTs and VMs.

Does anyone have experience with this? Is anyone running Proxmox with a VXLAN MTU bigger than 1450 bytes?

spirit · Sep 27, 2021

You need to configure your Proxmox physical interfaces with MTU 1550 too (and any device in front of the VXLAN interface).

(I'm running VMs in VXLAN with MTU 1500, and also 9000, without problems.)

Can you provide your /etc/network/interfaces?
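
For reference, the MTU arithmetic behind these numbers can be sketched in a few lines of Python. It assumes VXLAN over IPv4 with no extra VLAN tag or IP options, so 50 bytes of encapsulation overhead; the constant and function names are made up for the example.

```python
# VXLAN over IPv4: bytes added around the inner IP packet, as counted
# against the underlay (physical) interface MTU.
INNER_ETHERNET = 14
VXLAN_HEADER = 8
OUTER_UDP = 8
OUTER_IPV4 = 20
VXLAN_OVERHEAD = INNER_ETHERNET + VXLAN_HEADER + OUTER_UDP + OUTER_IPV4  # 50

IPV4_HEADER = 20
ICMP_HEADER = 8


def required_underlay_mtu(inner_mtu):
    """Underlay MTU needed to carry a given MTU inside the VXLAN."""
    return inner_mtu + VXLAN_OVERHEAD


def usable_inner_mtu(underlay_mtu):
    """Largest MTU usable inside the VXLAN for a given underlay MTU."""
    return underlay_mtu - VXLAN_OVERHEAD


def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def mtu_from_ping_payload(payload):
    """Path MTU implied by the largest ping payload that got through."""
    return payload + IPV4_HEADER + ICMP_HEADER


print(required_underlay_mtu(1500))   # 1550
print(usable_inner_mtu(9000))        # 8950
print(max_ping_payload(1500))        # 1472
print(mtu_from_ping_payload(1458))   # 1486
print(required_underlay_mtu(1486))   # 1536
```

Under that assumption, a 1458-byte ping ceiling corresponds to an effective inner MTU of 1486, as if something on the path only passed about 1536 bytes on the underlay instead of 1550. The 9000/8950 pair used earlier in this thread fits the same 50-byte rule.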

Nascire · Sep 27, 2021

Last week I tried out EVPN and had a few problems. Are there still open issues/problems?

2 nodes, both with the latest packages (99-really7.4), the configuration from this thread (exitnodes-local-routing) and (I don't exactly remember if it's already included in the packages) the Git version of the SDN Perl files; both nodes are defined as exit nodes
Container on node 1 unable to reach container on node 2 (and vice versa), unless I restart FRR on both nodes
Nodes only able to reach their local containers (although the Git Perl files should account for that with the 10.255.255.1/2 interfaces?)

For now I'm using another solution to proceed with my PoC, but I can rebuild the EVPN one if everything should be working.

spirit · Sep 27, 2021

Nascire said:
Last week I tried out EVPN and had a few problems. Are there still open issues/problems?

2 nodes, both with the latest packages (99-really7.4), the configuration from this thread (exitnodes-local-routing) and (I don't exactly remember if it's already included in the packages) the Git version of the SDN Perl files; both nodes are defined as exit nodes

Note that my latest patches (exitnodes-local-routing) are not released yet in the official .deb; they are only in Git currently.

Nascire said:
Container on Node 1 unable to reach Container on Node 2 (and vice versa), unless I restart FRR on both nodes

Mmm, this is strange, it should work out of the box if you use 99-really7.4. (Previous versions were buggy. Note that upgrading the frr package doesn't auto-restart FRR; you need to do it manually after each upgrade.)

Nascire said:
Nodes only able to reach their local containers (although the GIT perl files should account for that with the 10.255.255.1/2 interfaces?)

Exit nodes + "exitnode-local-routing" should be able to reach any container on any node from the exit node itself.

For non-exit nodes, if you want to reach containers, you need to add a route that goes through the exit node.

(e.g. pvehost1: 192.168.0.1, exit node: 192.168.0.254, containers network: 10.0.0.0/8; you need to add a route on pvehost1: route add 10.0.0.0/8 gw 192.168.0.254)
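
As a small illustration of that route, here is a Python sketch that validates the addresses and builds the iproute2 form of the command (`ip route add <net> via <gateway>`). The /24 prefix of the host network is an assumption, since it is not given above, and `exit_node_route` is a name made up for the example.

```python
import ipaddress


def exit_node_route(container_network, exit_node_ip, host_interface):
    """Build an `ip route add` command sending container traffic via the exit node.

    host_interface is the non-exit node's own address with prefix length,
    e.g. "192.168.0.1/24" (the /24 is an assumption for this example).
    """
    network = ipaddress.ip_network(container_network)
    gateway = ipaddress.ip_address(exit_node_ip)
    host = ipaddress.ip_interface(host_interface)
    # The gateway must be directly reachable from the host's own subnet.
    if gateway not in host.network:
        raise ValueError(f"{gateway} is not on the host network {host.network}")
    return f"ip route add {network} via {gateway}"


print(exit_node_route("10.0.0.0/8", "192.168.0.254", "192.168.0.1/24"))
# ip route add 10.0.0.0/8 via 192.168.0.254
```

The printed command is not persistent; to survive a reboot it would have to be added to the node's network configuration, for example as a post-up line like the one in the interfaces file posted earlier in the thread.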

ricou · Nov 9, 2021

Hello there.

I spent some time (maybe 4 days) on the SDN plugin for a VLAN implementation without managing to make things work.

It's for a homelab (+ affinities when I get a Netfiber), so the nodes will not be exactly the same (RAM, CPU and network interface count).
My context:
- 3-node cluster (essos, westeros, sothoryos)
- 1 manageable switch
- 1G interface per node for Proxmox cluster traffic
- 10G interface on essos and sothoryos for Ceph and iSCSI
- 3*1G interface bond on westeros for Ceph and iSCSI
- 3*1G interfaces on essos and sothoryos for VM production traffic
- 3*1G interfaces on a Synology NAS as iSCSI target (ulthos)

I named my VM bridge vmbr10 on all nodes. The physical interfaces are connected to the switch with the corresponding configuration for bonding, trunk mode and VLAN propagation.

My purpose:
define VLANs at the datacenter level, without having to define them on each node, and without managing VLAN IDs at the VM level. Choosing the net (bridge) in the VM configuration would make the VMs able to ping each other, and magic happens.

Is this the purpose of the SDN plugin, or did I misunderstand?

The problem:
VMs on the same vnet can't ping each other when they are on different nodes. When they are on the same node, everything works fine.

Is there someone who can help me, please?

spirit · Nov 9, 2021

ricou said:
My purpose:
define VLANs at the datacenter level, without having to define them on each node, and without managing VLAN IDs at the VM level. Choosing the net (bridge) in the VM configuration would make the VMs able to ping each other, and magic happens.

Is this the purpose of the SDN plugin, or did I misunderstand?

Yes, that's correct.

ricou said:
The problem:
VMs on the same vnet can't ping each other when they are on different nodes. When they are on the same node, everything works fine.

Is there someone who can help me, please?

Can you send the /etc/pve/sdn/*.cfg files and one node's /etc/network/interfaces?

ricou · Nov 10, 2021

spirit said:
Yes, that's correct.

Can you send the /etc/pve/sdn/*.cfg files and one node's /etc/network/interfaces?

Hello, thank you spirit for your help and quick answer. I'm sure it's a stupid thing that I did not understand.

spirit · Nov 10, 2021

ricou said:
Hello, thank you spirit for your help and quick answer. I'm sure it's a stupid thing that I did not understand.

Remove the vlan-aware option on the vnet.
This option is only for users who want an additional tag at the VM level (double VLAN tag).
(I think I should put it under the advanced options in the GUI, as it's confusing.)

ricou · Nov 10, 2021

spirit said:
Remove the vlan-aware option on the vnet.
This option is only for users who want an additional tag at the VM level (double VLAN tag).
(I think I should put it under the advanced options in the GUI, as it's confusing.)

It does the trick, thank you spirit.
And yes, this is quite a bit confusing. Is it used for multitenancy?
