evpn? network segmentation?

ze42

New Member
Feb 16, 2026
Hello,

I'm trying to figure out what EVPN really is, and how it can help with network segmentation.


What I already managed to do:
* Use VXLAN between my nodes to have VMs in dedicated VNets communicate
* Have gateways in VMs to nicely split firewalls, with completely independent routing tables
* Proxmox just has bridges, and needs no knowledge of the IPs within the VMs and their networks
* Proxmox CAN know about them, and still apply some firewalling on the links, much like Security Groups applied to interfaces on AWS

What I'm trying to figure out is:
* Is it possible to use Proxmox directly as a router, with different routing tables?

Like having multiple zones with completely separate, independent routing tables.
* Use Proxmox as a router between the networks of each zone, with possibly overlapping networks
* Easy firewall configuration within each zone
* Have one of the zones able to route to an external default gateway that might not be the same as the node's own (i.e. keeping the node in a dedicated internal zone)
 
Hello.

Are you trying to build something similar to an AWS VPC multi-tenant architecture, with a VRF per zone? Or something NSX-like?

You can build a PoC, but if your goal is true VRF-based routing separation (independent routing tables, overlapping IP ranges), then yes:
you will need manual Linux VRF configuration and possibly FRR.

Proxmox SDN alone does not provide native VRF-per-tenant abstraction.

What is missing natively in Proxmox:
  • No native VRF-per-zone (no separate routing table per tenant in the GUI)
  • No built-in tenant-aware routing domains
  • No clean support for overlapping IP ranges without manual VRF setup
  • No VRF-aware firewall
  • No cloud-style multi-tenant network control-plane (like AWS VPC / VMware NSX / OpenStack)
In short, Proxmox SDN provides L2/L3 segmentation, but does not provide full VRF-based tenant routing isolation.
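For reference, the manual Linux VRF setup mentioned above can be sketched with plain iproute2. All names, table IDs and addresses below are made-up examples:

```shell
# Create a VRF device bound to its own routing table (table 100 is arbitrary)
ip link add vrf_zoneA type vrf table 100
ip link set vrf_zoneA up

# Enslave a tenant-facing bridge to the VRF; its routes now live in table 100
ip link set vmbr50 master vrf_zoneA

# Routes in table 100 are invisible to the default table, so another VRF
# (e.g. table 101) can carry the exact same prefixes without conflict
ip route add default via 10.50.0.254 dev vmbr50 table 100

# Inspect the isolated table
ip route show vrf vrf_zoneA
```

FRR would then be used on top of this if you need the routes distributed between nodes rather than configured per host.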

 
OK, so without native support, here is what I had in mind from the start, given what I see now...

Within a single Proxmox:
* create a namespace for each dedicated "zone"
* create bridges, firewall rules, etc. in that namespace
* create a veth pair, with one end in the "zone" namespace and one in the default namespace
* in Proxmox, create the wanted bridges, and manually add the veth end from the default namespace to them
* place the VMs on those bridges in the default namespace

Proxmox's default namespace will just use the bridge feature to let the traffic flow to the "zone" namespace.
The "zone" namespace will handle the firewalling and routing that have been manually configured.
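The steps above can be sketched with iproute2 as follows (all names are examples; in a real setup Proxmox itself would manage vmbr50):

```shell
# 1. the isolated "zone" namespace with its own bridge
ip netns add zoneA
ip netns exec zoneA ip link add br-zoneA type bridge
ip netns exec zoneA ip link set br-zoneA up

# 2. a veth pair crossing the namespace boundary
ip link add veth-zoneA type veth peer name veth-zoneA-in
ip link set veth-zoneA-in netns zoneA
ip netns exec zoneA ip link set veth-zoneA-in master br-zoneA
ip netns exec zoneA ip link set veth-zoneA-in up

# 3. the default-namespace end joins the bridge the VMs are placed on
ip link set veth-zoneA master vmbr50
ip link set veth-zoneA up

# routing/firewalling is then configured inside the namespace, e.g.:
ip netns exec zoneA sysctl -w net.ipv4.ip_forward=1
```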

Not sure how to scale that to a cluster, but at least I get the idea of what is doable.


What about EVPN ?
Is there any simple example of what it really does, how to set it up, etc., to get a better understanding of what it can do in a simple single-cluster setup?
 
What about EVPN ?
Is there any simple example of what it really does, how to set it up, etc., to get a better understanding of what it can do in a simple single-cluster setup?

Without EVPN:
  • A VLAN exists only locally on each node.
  • If a VM moves to another node, you must ensure the VLAN exists and is correctly trunked everywhere.
  • L2 domains depend entirely on physical switch configuration.

With EVPN (VXLAN + BGP):
  • Each node becomes part of an overlay network.
  • A VNet (VNI) is distributed across all nodes.
  • MAC/IP information is exchanged dynamically via BGP.
  • The L2 domain is stretched across the cluster using VXLAN.

Simple firewall example


Imagine a 3-node cluster with EVPN enabled.

You create:
- An EVPN zone
- A BGP controller (EVPN control-plane)
- A VNet (e.g., VNI 10050)

Deploy a firewall VM:
  • Interface 1 → vmbr0 (WAN / uplink)
  • Interface 2 → EVPN VNet (LAN)
Now:
  • A VM on node 3 connected to the EVPN VNet
  • Can reach the firewall running on node 1
  • Even though they are on different physical hosts
Why?
Because:
  • The firewall’s MAC/IP is advertised via BGP EVPN
  • Traffic is encapsulated using VXLAN between nodes
  • Reduced dependency on physical VLAN trunk configuration across switches

You can also add CARP (as used by open-source firewalls such as pfSense/OPNsense) to provide L3 redundancy between two firewall VMs.

For example:
  • Firewall 1 on node 1
  • Firewall 2 on node 2
  • Both connected to the same EVPN VNet (LAN)
If one node crashes:
  • EVPN still provides the distributed L2 network across the remaining nodes, as long as underlay IP connectivity remains available
  • The secondary firewall becomes active via CARP
  • The virtual IP remains reachable
  • VMs keep connectivity without changing their gateway



This works very well in practice.

What you described with namespaces is conceptually similar, but limited to a single node. EVPN provides a distributed control-plane (BGP) and scales that concept across all nodes in the cluster.
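If it helps, the controller/zone/VNet objects from the example can also be created from the CLI via the SDN API. This is only a sketch; the ASN, IDs and addresses are examples, and the exact parameter names should be checked against your Proxmox version:

```shell
# EVPN controller (the BGP control-plane); peers are the node IPs
pvesh create /cluster/sdn/controllers --type evpn --controller evpnctl \
    --asn 65000 --peers 192.0.2.1,192.0.2.2,192.0.2.3

# EVPN zone using that controller (vrf-vxlan is the L3VNI of the zone's VRF)
pvesh create /cluster/sdn/zones --type evpn --zone evzone \
    --controller evpnctl --vrf-vxlan 10000

# VNet with VNI 10050 inside the zone, plus a routed subnet with a gateway
pvesh create /cluster/sdn/vnets --vnet vnet50 --zone evzone --tag 10050
pvesh create /cluster/sdn/vnets/vnet50/subnets --type subnet \
    --subnet 10.50.0.0/24 --gateway 10.50.0.1

# apply the pending SDN configuration cluster-wide
pvesh set /cluster/sdn
```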
 
Without EVPN:
  • A VLAN exists only locally on each node.
  • If a VM moves to another node, you must ensure the VLAN exists and is correctly trunked everywhere.
  • L2 domains depend entirely on physical switch configuration.
When you say "without EVPN", you in fact mean without VXLAN.

VXLAN alone is enough to extend L2 across the whole cluster; EVPN is not required for that.


With EVPN (VXLAN + BGP):
  • Each node becomes part of an overlay network.
  • A VNet (VNI) is distributed across all nodes.
  • MAC/IP information is exchanged dynamically via BGP.
  • The L2 domain is stretched across the cluster using VXLAN.
The only part really related to EVPN vs. plain VXLAN is the MAC/IP information being exchanged via BGP, rather than letting broadcasts do their work and discovering them like on a normal switch.
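For comparison, a plain flood-and-learn VXLAN without any control-plane looks roughly like this (addresses are examples): BUM traffic is head-end replicated to statically listed VTEPs, and MAC-to-VTEP mappings are learned from received packets.

```shell
# Plain VXLAN, no EVPN: the kernel learns MAC-to-VTEP from incoming traffic
ip link add vxlan50 type vxlan id 10050 dstport 4789 local 192.0.2.1

# Head-end replication: one all-zero "flood" FDB entry per remote VTEP,
# so broadcast/unknown-unicast/multicast gets unicast-copied to each peer
bridge fdb append 00:00:00:00:00:00 dev vxlan50 dst 192.0.2.2
bridge fdb append 00:00:00:00:00:00 dev vxlan50 dst 192.0.2.3

# With EVPN, FRR programs these FDB (and neighbor) entries from BGP
# type-2/3 routes instead of static config or flooding-based discovery
```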

Simple firewall example


Imagine a 3-node cluster with EVPN enabled.

You create:
- An EVPN zone
- A BGP controller (EVPN control-plane)
- A VNet (e.g., VNI 10050)

Deploy a firewall VM:
  • Interface 1 → vmbr0 (WAN / uplink)
  • Interface 2 → EVPN VNet (LAN)
Now:
  • A VM on node 3 connected to the EVPN VNet
  • Can reach the firewall running on node 1
  • Even though they are on different physical hosts
Why?
Because:
  • The firewall’s MAC/IP is advertised via BGP EVPN
  • Traffic is encapsulated using VXLAN between nodes
  • Reduced dependency on physical VLAN trunk configuration across switches

You can also add CARP (as used by open-source firewalls such as pfSense/OPNsense) to provide L3 redundancy between two firewall VMs.

For example:
  • Firewall 1 on node 1
  • Firewall 2 on node 2
  • Both connected to the same EVPN VNet (LAN)
If one node crashes:
  • EVPN still provides the distributed L2 network across the remaining nodes, as long as underlay IP connectivity remains available
  • The secondary firewall becomes active via CARP
  • The virtual IP remains reachable
  • VMs keep connectivity without changing their gateway



This works very well in practice.

What you described with namespaces is conceptually similar, but limited to a single node. EVPN provides a distributed control-plane (BGP) and scales that concept across all nodes in the cluster.

Your example would work well with just VXLAN. The hypervisor does not need to know anything about the internal addresses, and EVPN/BGP only helps a little with routing/discovery, rather than it happening over broadcast.

I really don't see how L2 propagated with VXLAN without EVPN would really differ from the suggested solution.



From further testing...

Each EVPN zone allows completely independent routing tables, while still having Proxmox run as the anycast gateway for each subnet, allowing them to talk to each other using only the anycast gateway (the local Proxmox node).

Let's say within a zone we have VM-A1 and VM-A2 on subnet A, and VM-B1 and VM-B2 on subnet B; VM-A1/B1 are on node 1, VM-A2/B2 on node 2.
* VM-A1 could connect to VM-B1 directly via the node-1 anycast gateway, without traffic going to node 2.
* VM-A2 could connect to VM-B2 directly via the node-2 anycast gateway, without traffic going to node 1.
* VM-A1 could connect to VM-A2 directly via VXLAN, without using the gateway.
* VM-A1 could connect to VM-B2 via the node-1 anycast gateway, which would forward it to node 2.

And as those routes are in a dedicated VRF, you could have another zone with the same network addresses without any conflict.

Conflicts would only start once you get out of your local zone... like using an exit-node and needing a route back, or another firewall gateway that needs to be connected to some "outside"...
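The behaviour described above can be verified on each node. This assumes a zone called evzone and that Proxmox names the generated VRF vrf_evzone (check your actual interface names):

```shell
# per-zone kernel routing table (one VRF per EVPN zone)
ip route show vrf vrf_evzone

# FRR's view of the EVPN routes learned via BGP
# (type-2 = MAC/IP, type-3 = VTEP, type-5 = prefixes)
vtysh -c 'show bgp l2vpn evpn'

# routes inside the zone's VRF as seen by FRR
vtysh -c 'show ip route vrf vrf_evzone'

# the anycast gateway uses the same MAC/IP on every node's VNet bridge
ip -br link show vnet50
```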
 
Yes — what you describe is basically the key point.

VXLAN already stretches L2.
EVPN becomes interesting when you use it with VRFs and the distributed anycast gateway.

The real gain is that routing stays local on each node, per zone, with clean isolation and overlapping subnet support — especially once you start having multiple zones and external routing requirements.

So it’s less about “L2 vs L2”, and more about distributed L3 at cluster scale.
 
What are the best practices when you have completely separated networks that need to get "out" of your Proxmox cluster?

From what I understand, exit-nodes use "default" routes from the node itself, independent of the zone the traffic comes from, which is not what we would want.

I have a few different ideas, but I'm not sure how they would work...
- In each VRF, inject a default route to a gateway IP (held by a VM) that can get out. Use VMs that can be made redundant, with an HA IP floating between both instances. The VMs get "out" by having an interface in another zone.
- In each VRF, inject a route to a bridge that holds a veth bridged to some physical network, letting L2 get out of the zone to some external network. Not sure how to handle the proper return path in that case, as the source would not properly use some "exit-node".

Any other ideas on how to properly deal with this?
 
You’re right, the exit-node relies on the host’s main routing table, so it’s not ideal if you want clean per-VRF egress.

In practice, if you want something predictable and clean, you have two realistic options:

  • Per-VRF firewall/edge VM

    Each VRF has its own default route pointing to a firewall/router VM inside that VRF.
    That VM then connects to a transit/external network and handles NAT and policies there.
    If you need HA, use CARP/VRRP between two edge VMs.

  • External edge device (more “datacenter style”)

    Export the VRF routes to a real router/firewall (BGP/VRF aware) and let that device handle north-south traffic.
    That keeps Proxmox as a fabric, not as an edge router.

The L2 bridge/veth idea will work in simple labs, but return traffic and isolation quickly become messy once you scale or add multiple tenants.

Try to keep egress routing either inside the VRF (edge VM) or push it to a proper external edge. I wouldn't rely on the host's main table for multi-tenant traffic.
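For the HA part of the first option: on Linux edge VMs, the usual counterpart of CARP is VRRP via keepalived. A minimal sketch (all names and addresses are examples):

```
# /etc/keepalived/keepalived.conf on the primary edge VM
vrrp_instance EDGE_LAN {
    state MASTER              # BACKUP on the second edge VM
    interface eth1            # interface attached to the EVPN VNet
    virtual_router_id 50
    priority 150              # e.g. 100 on the backup
    advert_int 1
    virtual_ipaddress {
        10.50.0.254/24        # floating gateway IP the tenant VMs point to
    }
}
```

The tenant VMs then use the floating 10.50.0.254 as their default gateway, so a failover does not require any change inside the tenants.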
 
Hello.

Are you trying to build something similar to an AWS VPC multi-tenant architecture, with a VRF per zone? Or something NSX-like?

You can build a PoC, but if your goal is true VRF-based routing separation (independent routing tables, overlapping IP ranges), then yes:
you will need manual Linux VRF configuration and possibly FRR.

Proxmox SDN alone does not provide native VRF-per-tenant abstraction.

What is missing natively in Proxmox:
  • No native VRF-per-zone (no separate routing table per tenant in the GUI)
  • No built-in tenant-aware routing domains
  • No clean support for overlapping IP ranges without manual VRF setup
  • No VRF-aware firewall
  • No cloud-style multi-tenant network control-plane (like AWS VPC / VMware NSX / OpenStack)
In short, Proxmox SDN provides L2/L3 segmentation, but does not provide full VRF-based tenant routing isolation.
Each zone is a different VRF in EVPN, with its own routing table.

(Only using an exit-node does a VRF route leak between the zone && the default zone. But if you use multiple zones, I'd recommend using physical routers as EVPN exit nodes.)

So yes, you have overlapping IP ranges, you have a firewall at zone level btw, the control plane is distributed through BGP, ...

So, you don't seem to understand how EVPN is implemented in Proxmox SDN? Have you really tested it?
 
Thanks for the clarification.

You’re right, I misunderstood how EVPN is implemented in Proxmox SDN.

I was mainly thinking in terms of VXLAN L2 stretch and didn’t fully account for the per-zone VRF mapping.
That makes things much clearer.

Apologies for the confusion on my side.
 
  • Per-VRF firewall/edge VM

    Each VRF has its own default route pointing to a firewall/router VM inside that VRF.
    That VM then connects to a transit/external network and handles NAT and policies there.
    If you need HA, use CARP/VRRP between two edge VMs.
I fail to find how to configure that part.

How do I add a default route via a VM in the VRF? What is the right way to do that?
 
I fail to find how to configure that part.

How do I add a default route via a VM in the VRF? What is the right way to do that?
You will need one (or more) physical routers (or router VMs) capable of doing BGP with the L2VPN EVPN address family, which announce the default routes and handle NAT etc. from there.
 
The hypervisor already knows the subnets, and is configured per VRF/zone to have a gateway IP and route the traffic.

I just want to add a static route to each VRF.

I would much rather just add the static route somewhere than have some tenant VM running BGP.

But if you require the VM firewall to handle BGP and announce the default route... OK, but how do you configure peers per zone?
 
I just want to add a static route to each VRF.

I would much rather just add the static route somewhere than have some tenant VM running BGP.
You do not need to peer BGP with a tenant VM. Your router / gateway, that is handling north-south traffic, should announce the default route to the PVE nodes in the L2VPN EVPN address family for the respective VRFs. You need to configure the VRFs that are used in the EVPN zones on that router as well and announce the default routes for each zone as type-5 EVPN route. Your physical router / router VM then takes care of en / decapsulating north-south traffic, potentially NAT and also return traffic.

If peering via L2VPN EVPN AF is not possible, then you will need to designate the PVE nodes as exit node (which will leak all routes from all EVPN zones into the default routing table) and then use the BGP controller to announce the routes via the IP AF. Routing then happens via the default routing table of the PVE node. In this case you will lose the ability to use overlapping subnets, for instance, and are bound to using the default routing table.

We're working on implementing proper VRF support, so that you can announce the routes via IP AF per zone (and not leak them all into the default routing table) and then configure a different uplink for each VRF / zone. Currently, if you want to separate north-south traffic by zone you will need to use the L2VPN EVPN AF to announce a default route and then handle everything else on your gateway.


But if you require the VM firewall to handle BGP and announce the default route... OK, but how do you configure peers per zone?

The upside of using L2VPN EVPN AF is that you have only one BGP peering session - routes get imported into VRFs depending on the Route Target. It's of the format ASN:VNI. See above
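A sketch of what the gateway-side FRR configuration for that could look like. ASN, VNI, names and addresses are all examples, and the exact statements should be verified against the FRR EVPN documentation:

```
# /etc/frr/frr.conf on the external gateway / edge router
vrf vrf_zoneA
 vni 10000                            # must match the zone's VRF-VXLAN (L3VNI)
exit-vrf
!
router bgp 65000
 neighbor 192.0.2.1 remote-as 65000   # one session per PVE node
 address-family l2vpn evpn
  neighbor 192.0.2.1 activate
  advertise-all-vni
 exit-address-family
!
router bgp 65000 vrf vrf_zoneA
 address-family ipv4 unicast
  redistribute static                 # or connected, depending on the setup
 exit-address-family
 address-family l2vpn evpn
  advertise ipv4 unicast              # export VRF routes as type-5 routes
  default-originate ipv4              # announce 0.0.0.0/0 into the VRF
 exit-address-family
```

With auto-derived Route Targets (ASN:VNI, as described above), the default route lands in the matching VRF on every node over the single L2VPN EVPN session.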
 
We're working on implementing proper VRF support, so that you can announce the routes via IP AF per zone (and not leak them all into the default routing table) and then configure a different uplink for each VRF / zone. Currently, if you want to separate north-south traffic by zone you will need to use the L2VPN EVPN AF to announce a default route and then handle everything else on your gateway.

I think it could be done with a dedicated interface in each zone/VRF (not sure if a VLAN-tagged interface could work, to avoid the need for dedicated interfaces). That's why I'm currently doing it with my physical router/switch; with a lot of zones it's simpler.
 
I think it could be done with a dedicated interface in each zone/VRF (not sure if a VLAN-tagged interface could work, to avoid the need for dedicated interfaces). That's why I'm currently doing it with my physical router/switch; with a lot of zones it's simpler.

Yes, that's basically the idea. You'd have one VLAN per VRF (analogous to how there's a VRF-VXLAN-VNI) and then move the VLAN subinterface for the uplink into the VRF. This requires multiple BGP peering sessions (one per zone) so you're able to distinguish between the VRFs, but has the upside that you can peer with devices that can only do IPv4 / 6 AF and still separate the traffic by zone.
VNets each have a separate VLAN tag as well for L2 traffic inside a VNet and the VRF-VLAN is used for routing between VNets / north-south traffic / ... essentially utilizing VLANs instead of VXLANs as data plane.
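On the node side, that VLAN-per-VRF uplink would look roughly like this (interface names, VLAN ID and addresses are examples):

```shell
# VLAN subinterface on the physical uplink, dedicated to one zone
ip link add link eno1 name eno1.100 type vlan id 100

# move it into the zone's VRF so its routes and peering stay in that table
ip link set eno1.100 master vrf_zoneA
ip link set eno1.100 up

# peering address towards the external router on that VLAN
ip addr add 192.0.2.10/30 dev eno1.100

# FRR then runs one IPv4/IPv6 BGP session per zone over these subinterfaces
```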
 
I do not have, and do not want, ANY external component to be aware of the VRFs.

My zones are PURELY internal.

I have EVPN zones with:
* different VNets
* VMs in that zone with single interfaces, talking to each other, using basic Proxmox firewalling/routing between them
* 1 VM (or 2 for redundancy) acting as a FW/gateway/edge VM, having an interface on a VNet in that zone and an interface "elsewhere" to communicate with the rest of the world (which might just be other zones on the same cluster, or bridged to an outgoing interface with an IP on that range)


Manually adding the default route to the VM in the VRF works "fine enough" for VMs on the same node, but does not work any further (the route is not replicated, and adding the same one on the other nodes would not be enough either).
 
If we have to inject routes from an external source, can we have a VM in the cluster, with connections to the nodes, peering BGP with them to inject such routes directly?
Just route injectors that inject routes stating the next-hop is an IP within the VRF, while the injector itself stays completely out of scope?

i.e.: instead of configuring it within Proxmox directly, add a VM that injects BGP routes from a service running on a dedicated VM in the cluster.