SDN overlay network in routed mesh setup

fmaurer

New Member
Apr 25, 2025
Hello,
I have a three-node cluster with two rings.
1. One full ring between the three nodes. Similar to the configuration shown here:
https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server#Example


2. And one "uplink" ring of the "e0np0" interfaces as shown here (a diagram derived from the wiki topology, with the additional ring):

Code:
      (WAN) |       ┌───────────┐        | (WAN)
            |       │   Node2   │        |
            |       ├─────┬─────┤        |
            |     ┌─┤e0np0│e1np1├─┐      |
            |     │ ├─────┼─────┤ │      |
            |     │ │eno21│eno10│ │      |
            |     │ └──┬──┴──┬──┘ │      |
            |     │    │     │    │      |
┌───────┬─────┐   │    │     │    │   ┌─────┬───────┐
│       │e0np0│   │    │     │    │   │e0np0│       │
│       ├─────┤   │    │     │    │   ├─────┤       │
│       │e1np1├───┘    │     │    └───┤e1np1│       │
│       ├─────┤        │     │        ├─────┤       │
│       │eno10├────────┘     └────────┤eno21│       │
│ Node1 ├─────┤                       ├─────┤ Node3 │
│       │eno21├───────────────────────┤eno10│       │
└───────┴─────┘                       └─────┴───────┘


Image with the full topology (ignore the IPMI network here):

[attachment: 1772102198870.png]

Ceph is working well with this setup, just as documented here:
https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server#Example

The question that is still open:

How do we manage the IP addresses of the VMs running on these hosts?
1. node1 and node3 have direct access to the uplink and can therefore simply put their VMs into vmbr0.
2. node2 has no direct access and must be routed via node3 and/or node1 - the VMs on this host should still be able to use public IPs from the uplink broadcast domain.

How we currently solve this:
1. node3 puts e0np0 and e1np1 into vmbr0
2. node2 puts e1np1 into vmbr0
3. all VMs are attached to their node's vmbr0
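On node3, this corresponds roughly to an /etc/network/interfaces bridge definition like the following (a sketch only; the address is a placeholder from the documentation range, and the bridge options are common defaults, not taken from this thread):

```
auto vmbr0
iface vmbr0 inet static
        address 192.0.2.10/27
        gateway 192.0.2.1
        # bridge both the WAN uplink (e0np0) and the ring link towards
        # node2 (e1np1), so node2's VMs end up in the uplink L2 domain
        bridge-ports e0np0 e1np1
        bridge-stp off
        bridge-fd 0
```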

Alternative option:
- just buy a switch and get a proper uplink configured for node2 as well

I wonder what options are possible for such a setup. I would prefer the built-in SDN EVPN functionality, but I am also interested in other options and ideas for this problem.
 
If you're willing to use an EVPN zone, then you can designate the two nodes with the uplinks as exit nodes. They will then announce a default route to node2, and all traffic from node2 will be routed towards them.
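On the Proxmox side, designating exit nodes is a zone-level setting. In /etc/pve/sdn/controllers.cfg and zones.cfg this would look roughly like the following (a sketch; the names, ASN, VXLAN ID and peer addresses are placeholders):

```
# /etc/pve/sdn/controllers.cfg
evpn: evpnctl
        asn 65000
        peers 10.10.10.1,10.10.10.2,10.10.10.3

# /etc/pve/sdn/zones.cfg
evpn: evzone
        controller evpnctl
        vrf-vxlan 10000
        exit-nodes node1,node3
```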
 
Hi @shanreich, this would require that I speak BGP with the upstream router and completely change the setup in that direction, right?
node1 and node3 are currently in the same L2 as the upstream router (which routes the public IPv4 /27 network directly and allows the VMs to use it).

I wonder if there is an elegant solution that still allows the VMs to have the public IPs (so no SNAT) but does not require changing the router on the two WAN ports. I guess that's just not possible.
Some other posts mentioned proxy ARP - the exit nodes would impersonate the IPs of node2's VMs - but this would not solve redundancy and does not look like a solid solution (https://wiki.debian.org/BridgeNetworkConnectionsProxyArp).
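For reference, the proxy-ARP approach on one exit node would look roughly like this (a sketch only; the uplink interface name is taken from the diagram, the addresses are placeholders, and this covers a single exit node with no failover):

```
# answer ARP requests for the VM's IP on the uplink interface
sysctl -w net.ipv4.conf.e0np0.proxy_arp=1
sysctl -w net.ipv4.ip_forward=1
# route the VM's /32 towards node2 over the ring
# (10.10.10.2 is a placeholder for node2's ring address)
ip route add 192.168.178.100/32 via 10.10.10.2
```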
 
I wonder if there is an elegant solution that still allows the VMs to have the public IPs (so no SNAT) but does not require changing the router on the two WAN ports. I guess that's just not possible.

If you're already routing the /27 to the exit nodes, then an entry in the default routing table of node1/node3 like:
Code:
192.0.2.0/27 dev vrf_myzone
should be sufficient to route the public subnet into the EVPN zone? This route should be automatically created on the exit nodes if you have the proper subnet configuration for your VNet.

The downside is that you'd need to 'sacrifice' one IP from that subnet as the anycast gateway inside the EVPN zone/vnet. Or am I misunderstanding something w.r.t. your setup?
 
To make my goal easier to understand, I can reduce the setup to a single node - a typical home setup:

Current example setup:

- vmbr0: 192.168.178.250/24 - connected to WAN (gw: 192.168.178.1)
- VM100 has 192.168.178.100/24 configured and is also in vmbr0

Setup switched to EVPN:

- the EVPN controller is created, the EVPN zone is created, the exit node is set, and a VNet is created in the zone
- VM100 network interface is switched to the evpn vnet and still has 192.168.178.100 configured
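For the record, the vnet and subnet for this example would be defined roughly like this (a sketch; the names are placeholders, and 192.168.178.254 is an assumed choice for the IP 'sacrificed' as the anycast gateway inside the zone):

```
# /etc/pve/sdn/vnets.cfg
vnet: evnet0
        zone evzone
        tag 11000

# /etc/pve/sdn/subnets.cfg
subnet: evzone-192.168.178.0-24
        vnet evnet0
        gateway 192.168.178.254
```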

How would the VM100 reach the network of the hypervisor (or 192.168.178.1)?

It should not matter whether the VM is on a single node in the EVPN vnet or in the three-node setup described above.
I would not like to use SNAT there, as the VM's IP is from the same subnet as the hypervisor's.
If you're already routing the /27 to the exit-nodes
The problem is that I am not routing the /27 and have the VM on the same L2 as the hypervisor.
 
they are an option on the zone: "exit nodes local routing"
This setting does not help the VM reach the network of vmbr0 - it only means that the exit node itself can talk to the VM locally.

I think it should be made clear that EVPN can only be used if you talk BGP to some other side to announce a subnet that gets routed one way or the other.
Unfortunately, I am missing this "other side speaking BGP", so it seems that EVPN is not an option for my specific problem (without changing the router).
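For completeness: "talking BGP to the other side" would mean an FRR configuration on the exit nodes roughly like the following, peering with the upstream router so that the public subnet gets routed to them (a sketch; the ASNs and the neighbor address are placeholders):

```
router bgp 65001
 neighbor 198.51.100.1 remote-as 65000
 !
 address-family ipv4 unicast
  ! announce the public /27 so the upstream router
  ! sends traffic for it to the exit nodes
  network 192.0.2.0/27
 exit-address-family
```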