VXLAN Issues

talnet23

Jun 20, 2026
Hi all,

I am configuring multiple VXLANs between the two nodes in my cluster and I'm getting some very weird results. My vmbr0 interfaces carry my mgmt network and work just fine. On top of that I have configured the following:

VXLAN: Range (mtu 1450, Peers: PVE, PVE2)
VXLAN: SharedSvc (mtu 1450, Peers: PVE, PVE2)
Both of those have been working. Nothing else is configured on them; they just have the zone and its associated VNet. However, when I add a third:

VXLAN: wSpace (mtu 1450, Peers: PVE, PVE2)

Connectivity on that third VXLAN just goes haywire. For context, pfSense operates as the firewall, gateway and DHCP server for all of the VXLANs. If I boot a VM on PVE while pfSense is on the same node (PVE), I get no DHCP lease and no connectivity at all. I traced the ARP requests and can see them leave the VM, but they never reach pfSense. If I boot a VM on PVE2 instead, it works. If I move pfSense to PVE2, everything stops. It's really weird. I don't know if it's because I'm trying to use too many VXLANs, but I can't see why that would be a problem. Has anyone seen anything odd like this before, and how did you fix it? I'm running 9.1.11 on both nodes at the moment.
 
I assume you are using SDN? Can you post the SDN configuration as well as the running network config?

Code:
cat /etc/pve/sdn/zones.cfg
cat /etc/pve/sdn/vnets.cfg
cat /etc/pve/sdn/subnets.cfg
ip a
ip r
 
Hi Stefan,

Yes, I am using SDN. I have managed to find a workaround. It's quite odd: the issue seemed to span multiple VNets once I started messing about with firewalls etc., but after a reboot it was confined to a single VNet. What I found was that I was building Kali VMs through Ansible, cloning them all from a template and then bringing them up at the same time. For whatever reason this was messing up the networking. Changing the order so that each VM is cloned and brought online before the next one is cloned seems to have resolved it. It's very weird that it was misbehaving like that, but at least I have a fix for now. For clarity I'm attaching the outputs as requested. The only thing I don't have is a subnets.cfg. The VNet in question is wrkspace. The other two have been fine again since the reboot and were fine before I started messing with things.

Regards,
Tal
 

Hm, do you have the same workflow on the other VXLAN vnets (i.e. are you creating mass clones via Ansible there as well)? If not, does the same thing happen when you create VMs via cloning on the other subnets?
Is it possible there are some duplicate IP addresses / MAC addresses in the cloned guests, leading to the wonky networking?
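
One way to check for duplicate MAC addresses across the cloned guests is to scan the guest config files. This is only a rough sketch: the function name `find_dup_macs` is made up for illustration, and the path in the usage comment assumes the usual per-node layout of QEMU guest configs under /etc/pve, so adjust it if your setup differs.

```shell
# Print every MAC address that is used by more than one guest config file.
# MACs are de-duplicated per file first, because snapshot sections inside a
# config repeat the same netX lines and would otherwise look like duplicates.
# (Side effect: two NICs with the same MAC inside ONE guest are not reported.)
find_dup_macs() {
    for f in "$@"; do
        grep -hE '^net[0-9]+:' "$f" \
            | grep -oiE '([0-9a-f]{2}:){5}[0-9a-f]{2}' \
            | tr 'a-f' 'A-F' \
            | sort -u
    done | sort | uniq -d
}

# Example call (path is an assumption, see above):
# find_dup_macs /etc/pve/nodes/*/qemu-server/*.conf
```

Empty output means no MAC is shared between the files that were scanned. It says nothing about duplicate IP addresses inside the guests.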
 
Honestly, no, I'm not doing any Ansible deployments on the other vnets, just the wrkspace one. When I've cloned on the other vnets before, it has just worked fine.

I did wonder about duplicate IP/MAC addresses. However, upon inspection the MAC addresses were all different, and only some VMs got an IP address. For example:

PVE - pfSense, VM1
PVE2 - VM2

VM1 didn't get an IP even after a reboot, and statically assigning an address gave no connectivity to pfSense or VM2. When I migrated VM1 to PVE2, it obtained an address and everything started working. When I migrated it back to PVE, everything stopped working on that VM and it couldn't reach pfSense at all. VM2 seemed to be fine, but I didn't test migrating it to PVE. It was all a bit odd, as I would have expected traffic between VM1 and pfSense on the same host not to need VXLAN at all, yet there was still no IP address and no connectivity within the VNet.

It also seems odd that changing the Ansible code from Clone -> Clone -> Start all to Clone -> Start -> Clone -> Start works. I can't really explain why, but it does.
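
For anyone who wants to try the same ordering outside Ansible, here is a rough shell sketch of the Clone -> Start -> Clone -> Start sequence using `qm`. It is not my actual playbook: the template ID 9000, the new VM IDs, and the `kali-` name prefix are placeholders I made up, and it defaults to a dry run that only prints the commands.

```shell
# Clone one guest, start it, then move on to the next one,
# instead of cloning everything first and starting all guests together.
TEMPLATE_ID=9000     # hypothetical template VM ID
RUN=${RUN-echo}      # dry run by default (prints commands); export RUN= to really execute

clone_and_start() {
    for vmid in "$@"; do
        # Stop at the first failure so a broken clone is never started.
        $RUN qm clone "$TEMPLATE_ID" "$vmid" --name "kali-$vmid" || return 1
        $RUN qm start "$vmid" || return 1
    done
}

clone_and_start 201 202 203   # hypothetical new VM IDs
```

This only reproduces the ordering that worked for me. It doesn't explain why starting all the clones at once upset the networking.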