How to build isolated VXLAN networks across Proxmox cluster nodes — and why they still can't reach outside

Fearless-Grape5584
Dec 9, 2025
Before getting into it: this is not meant to be the only or universally best way to do this.
You can solve similar problems with router VMs, OPNsense/pfSense, or more fabric-style designs.
This post is specifically about a minimal-resource, practical pattern for making isolated VXLAN networks across Proxmox cluster nodes usable from outside the overlay.

I kept seeing the same kind of question around Proxmox VXLAN setups:
  • “My VMs can talk across nodes, so VXLAN seems to work”
  • “But I still can't reach the internet”
  • “Or I can't reach those VMs from an external PC on the LAN”

The confusing part is that Proxmox makes it easy to define a VXLAN subnet, including a gateway field in the GUI.
That often gives the impression that once the subnet is configured, routing is already there.
But that is exactly the trap.

The short version

If you build a VXLAN-backed isolated network across Proxmox cluster nodes:

  • VXLAN itself can provide L2 extension across nodes
  • VMs on that VXLAN can often talk to each other just fine
  • but that does not automatically create a real L3 gateway/router
  • and it also does not automatically solve return routing from your upstream LAN/router

So the typical “VXLAN works, but nothing can reach outside” problem is usually not a VXLAN problem at all.
It is a gateway placement + return route problem.

The important misunderstanding: the subnet gateway field is not a real router

In a Proxmox VXLAN zone, setting the gateway value in the subnet definition does not mean Proxmox will spawn a real router for that subnet.
It defines addressing information for the subnet.
That is useful, but it is not the same thing as having an actual next hop that forwards traffic between:
  • your VXLAN tenant subnet
  • your physical LAN
  • and possibly the internet

So if your VM is attached only to that VXLAN-backed network, the following can both be true at the same time:
  • VM ↔ VM communication across nodes works
  • external connectivity still does not work

That is why this issue is so easy to misread.
The overlay is fine. The routing path is incomplete.

What you actually need

To make that isolated VXLAN subnet reachable from outside the overlay, you typically need two things.

1. A real gateway IP on one Proxmox node

One node has to act as the actual entry point / gateway for that VXLAN subnet.
For example, if your tenant subnet is 10.200.0.0/24 and you want 10.200.0.254 to be the gateway inside that VXLAN subnet, then one Proxmox node needs to actually own that IP on the relevant VNet interface.
Something like:

Code:
ip addr add 10.200.0.254/24 dev <your-vnet-interface>

At that point, that node becomes the real L3 hop for the VXLAN subnet.
Without this, the subnet may exist, and the VMs may talk at L2, but there is still no real router for traffic leaving that subnet.
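One prerequisite that is easy to miss here: even with the gateway IP in place, the node will not forward packets between the VNet and vmbr0 unless IP forwarding is enabled on it. A minimal sketch (the sysctl.d filename is just a suggestion; check your current value first):

```shell
# Check the current setting
sysctl net.ipv4.ip_forward

# Enable forwarding immediately
sysctl -w net.ipv4.ip_forward=1

# Make it persistent across reboots
echo 'net.ipv4.ip_forward = 1' > /etc/sysctl.d/80-vxlan-gw.conf
```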

2. A return route from your upstream LAN/router

Even after putting a real gateway IP on one Proxmox node, traffic still may not work unless the upstream side knows how to get back to that VXLAN subnet.
That usually means your upstream router, or sometimes your PC for testing, needs a static route such as:

Code:
10.200.0.0/24 via <IP of the chosen Proxmox node on vmbr0>

This is the other half of the trap.
People often focus only on “how do I put a gateway on VXLAN?”
But even if you do that correctly, return traffic still dies unless the upstream network knows where that subnet lives.
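For a quick test from a single Linux PC, you can also add that route locally instead of touching the router (addresses are placeholders matching the examples in this post):

```shell
# Route the VXLAN tenant subnet via the chosen Proxmox node's LAN address
ip route add 10.200.0.0/24 via 192.168.1.11

# Verify which next hop the kernel would actually use for a tenant VM
ip route get 10.200.0.10
```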

Why this is confusing in practice

The reason this catches people is that partial success looks very convincing.
You see:
  • VMs on different nodes can ping each other
  • MAC learning seems fine
  • the VXLAN itself appears healthy

So naturally you start suspecting firewall rules, NAT, or some obscure Proxmox issue.
But very often, the real issue is much simpler:
  • there is no actual gateway IP on the VXLAN network
  • or the upstream network has no route back

In other words:
the overlay is working, but the path in and out of that overlay is not fully designed yet
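A couple of quick checks help tell those two failure modes apart (interface name and addresses are placeholders from the examples in this post):

```shell
# 1. From a VM inside the VXLAN subnet: does the gateway IP answer at all?
#    If not, no node currently owns 10.200.0.254.
ping -c 3 10.200.0.254

# 2. On the gateway node: does traffic from outside ever arrive on the VNet?
#    If pings from the external PC never show up here, the upstream side
#    is missing its return route.
tcpdump -ni <your-vnet-interface> icmp
```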

A minimal working traffic flow

Once a real gateway exists on one node, and the upstream router has a static route, the flow becomes something like this:

Code:
External PC / LAN
    ↓
upstream router
    ↓  static route for 10.200.0.0/24
chosen Proxmox node on vmbr0
    ↓
gateway IP on VXLAN tenant network
    ↓
target VM inside the VXLAN subnet

And replies come back the same way.
That is the point where the setup stops being “just a VXLAN definition” and becomes an actual routed design.

Persistence matters too

Even if you prove the concept manually with `ip addr add`, the next problem is persistence.
After reboot or network reload, that gateway IP has to come back.
One practical approach is to restore it automatically when the VNet interface comes up, for example with an `if-up.d` hook, and remove it with `if-down.d`.

Example:
/etc/network/if-up.d/mslsetup-vxlan-gw

Code:
#!/bin/bash
case "${IFACE:-}" in
    <your-vnet-interface>)
        ip addr replace 10.200.0.254/24 dev <your-vnet-interface>
        ;;
    *)
        exit 0
        ;;
esac

/etc/network/if-down.d/mslsetup-vxlan-gw

Code:
#!/bin/bash
case "${IFACE:-}" in
    <your-vnet-interface>)
        ip addr del 10.200.0.254/24 dev <your-vnet-interface> 2>/dev/null
        ;;
    *)
        exit 0
        ;;
esac
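Both hooks must be executable, otherwise ifupdown will silently skip them:

```shell
chmod +x /etc/network/if-up.d/mslsetup-vxlan-gw \
         /etc/network/if-down.d/mslsetup-vxlan-gw
```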

That gives you a workable single-owner gateway model on Proxmox.
But it also reveals the next design question.

The real cluster question is not VXLAN — it is gateway ownership

Once you move beyond a single-node proof of concept, the real question becomes:
  • Which node should own the gateway?
  • What happens if that node fails?
  • How do you avoid external access depending on one fixed node forever?

That is why in multi-node isolated VXLAN designs, the hard part is often not the overlay itself.
The hard part is:

  • gateway placement
  • failover
  • return routing
  • operational persistence

In a cluster, the return path should usually target a VIP

A fixed gateway owner may be acceptable in a single-node or proof-of-concept setup.
But in a real cluster, pointing the upstream static route to one permanently chosen Proxmox node creates another problem: if that node fails, the return path breaks even if the VXLAN overlay itself is still healthy.
So in a cluster design, the upstream static route should usually point to a VIP on vmbr0, not to one specific node's fixed IP.

That way, the return route always follows the currently active gateway owner.
For example, instead of:

Code:
10.200.0.0/24 via 192.168.1.11

you would use:

Code:
10.200.0.0/24 via 192.168.1.100

where `192.168.1.100` is a floating VIP that moves to the node currently responsible for the VXLAN gateway.
This is the key difference between a lab-only routed VXLAN setup and a cluster-ready routed VXLAN setup.

A practical floating gateway approach with keepalived

Once you reach that point, the practical question becomes:
  • how do I move the return-path endpoint between nodes?
  • how do I attach the VXLAN gateway IP only on the active node?
  • how do I keep that behavior simple enough to operate?

A straightforward approach is to use keepalived.

keepalived can:

  • move a VIP on `vmbr0` between Proxmox nodes
  • elect the active node
  • run a script when the active node changes

That script can then attach or detach the real VXLAN gateway IP on the relevant VNet interface.

In other words:

  • the VIP on vmbr0 becomes the upstream return-path target
  • the gateway IP inside the VXLAN subnet becomes a floating gateway owned only by the active node

That gives you a much more resilient design than pinning the whole routed path to one fixed node forever: keepalived elects the active node, moves the VIP, and fires a notify script on every state change, so the VXLAN gateway IP (`10.200.0.254`) is attached only on whichever node is currently active.

Example configuration for `/etc/keepalived/keepalived.conf`:

Code:
global_defs {
    enable_script_security
    script_user root
}

vrrp_script check_gw_reachability {
    script "/usr/bin/ping -c 1 -W 1 <upstream router IP>"
    interval 2
    weight -20
    fall 2
    rise 2
}

vrrp_instance VI_1 {
    state BACKUP
    interface vmbr0
    virtual_router_id 100
    priority 100
    advert_int 1
    nopreempt

    authentication {
        auth_type PASS
        auth_pass <password>
    }

    virtual_ipaddress {
        <VIP on vmbr0>/<CIDR> dev vmbr0
    }

    track_script {
        check_gw_reachability
    }

    notify_master "/usr/local/bin/msl-vip-hook.sh master"
    notify_backup "/usr/local/bin/msl-vip-hook.sh backup"
    notify_stop   "/usr/local/bin/msl-vip-hook.sh stop"
    notify_fault  "/usr/local/bin/msl-vip-hook.sh fault"
}

On the other node, you can use the same configuration with a lower priority.
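Typically the only line that differs in the second node's keepalived.conf is the priority (the value is just an assumption; anything lower than the first node's 100 works):

```
# second node: identical configuration except a lower priority
priority 90
```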

Example msl-vip-hook.sh:

Code:
#!/bin/bash
set -euo pipefail

ACTION="${1:-}"
VNET_IF="<your-vnet-interface>"
GW_IP="10.200.0.254/24"

log() {
    logger "$*"
}

case "$ACTION" in
    master)
        log "notify_master start"
        ip addr replace "$GW_IP" dev "$VNET_IF" || true
        # Send a gratuitous ARP so upstream devices refresh their ARP caches
        # (requires the iputils-arping package)
        arping -c 3 -U -I "$VNET_IF" "${GW_IP%/*}" || true
        log "Became MASTER: VIP/GW attached and GARP sent"
        ;;
    backup|fault|stop)
        log "notify_${ACTION} start"
        ip addr del "$GW_IP" dev "$VNET_IF" 2>/dev/null || true
        log "Became ${ACTION}: GW detached"
        ;;
    *)
        log "Unknown action: $ACTION"
        exit 1
        ;;
esac

So for this kind of access pattern, you may not need a full EVPN-based fabric or an enterprise-style BGP/ECMP design just to let an external PC reach the VXLAN subnet.

Why this matters for isolated tenant-style networks

This becomes especially important when VXLAN is not just a lab experiment, but part of an isolated multi-tenant design.
If you are using VXLAN to stretch isolated project networks across Proxmox cluster nodes, you usually want all of this together:
  • tenant VM ↔ tenant VM communication across nodes
  • controlled external access
  • predictable routing
  • node failure handling

And that means the question is no longer:

“How do I make Proxmox VXLAN work?”

It becomes:

“How do I make an isolated cross-node network actually usable and survivable?”

That is a much more useful framing.

So in summary

If your Proxmox VXLAN network works across nodes but still cannot reach outside, the missing pieces are often:

  1. a real gateway IP placed on one Proxmox node
  2. a static return route from the upstream side
  3. in a real cluster design, a plan for gateway failover
  4. in that cluster design, a VIP-based return path rather than a fixed-node return path

So the trap is not that VXLAN is broken.
The trap is that a working L2 overlay can make you believe the routing problem is already solved, when it is not.

The GUI subnet gateway value is not the router.
The router has to exist somewhere.
And the upstream path has to know how to get back.

That is the part that tends to be missing in many examples.

I’m actually using this design in my open-source multi-tenant Proxmox project, MSL Setup.

So this is not just a theoretical idea — it comes from trying to make isolated cross-node VXLAN networks practical in a real implementation.

If it helps, here is the repository:
https://github.com/zelogx/msl-setup

If useful, I can also post a simplified diagram version of this setup in a follow-up.
 
Hi,

just an addition:

It would be nice if the BGP controller also worked with VXLAN networks. You could then simply advertise your VXLAN network(s) to the upstream router via BGP, which would also handle failover, load balancing via ECMP, etc.

AFAIK it is currently tied to EVPN networks only.
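Until then, one manual workaround is to run FRR yourself on the gateway node and advertise the tenant subnet over plain BGP. A rough sketch of an frr.conf fragment (the ASNs and the neighbor address are assumptions, not values from this thread):

```
router bgp 65001
 neighbor 192.168.1.1 remote-as 65000
 address-family ipv4 unicast
  network 10.200.0.0/24
 exit-address-family
```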