Ingress/egress out the same interface

JC Denton

New Member
Jul 14, 2025
Hey folks,

I've been building up a small five-node cluster. VLAN 192 (192.168.0.0/23) is my main/management VLAN, and the nodes all have their management IPs on it (untagged). My VMs sit on a VLAN-aware bridge, and I have a small network on VLAN 30 (10.30.0.0/24). That network has an internet-connected gateway at 10.30.0.1 and a VPN gateway at 10.30.0.254. The VPN gateway is a VM in the cluster, but the internet gateway is a physical device. If a production VM in 10.30.0.0/24 (VLAN 30) is on the same node as the VPN gateway and uses it as its default gateway, its replies never make it back out through 10.30.0.1.
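
(For reference, a layout like this roughly corresponds to a VLAN-aware bridge config along the lines below; the NIC name, bridge name, and node address are placeholders, not my actual values.)
Code:
# /etc/network/interfaces (sketch; eno1, vmbr0, and the address are placeholders)
auto eno1
iface eno1 inet manual

auto vmbr0
iface vmbr0 inet static
    # untagged management IP in VLAN 192's subnet
    address 192.168.0.11/23
    gateway 192.168.1.1
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094

The prod VMs and the VPN gateway VM then have their NICs attached to that bridge with VLAN tag 30.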

Works when the VPN Gateway VM and the Prod VM are not on the same node:
Code:
ICMP Echo Request: 192.168.0.0/23 -> Gateway (192.168.1.1 / 10.30.0.1) -> Prod VM on Node "A" (10.30.0.x)
ICMP Echo Reply:   Prod VM on Node "A" (10.30.0.x) -> VPN Gateway (10.30.0.254) -> Gateway (192.168.1.1 / 10.30.0.1) -> 192.168.0.0/23

For the life of me, I cannot figure out why this doesn't work when the VMs are on the same node but works when they're on separate nodes. If I tcpdump the Prod VM, I see the replies, but they never appear on the VPN gateway's tap interface on the node.
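
(For reference, the captures I'm describing look something like this; the tap interface names are placeholders, the real ones depend on the VMIDs.)
Code:
# on the node hosting both VMs; -e prints the link-level header so the
# echo reply's destination MAC is visible
tcpdump -eni tap101i0 icmp   # prod VM's tap: request AND reply show up here
tcpdump -eni tap102i0 icmp   # VPN gateway's tap: the reply never shows up here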

Any insights would greatly be appreciated! Thanks!
 
Hi, I'm not sure I fully understand your issue.
If you have a VM in VLAN 30 on the same node as your VPN GW, and that VPN GW is the VM's default gateway, then the VM can't ping 10.30.0.1, correct?
Can you run a tcpdump on the node and also on 10.30.0.1 to verify whether the packets are being forwarded correctly?
 
I can ping 10.30.0.1 with no issue. The problem is when a host in 192.168.0.0/23 (VLAN 192) tries to ping one of the VMs that is also on the node with the VPN gateway (the VM's default GW). If I tcpdump the tap interface on that node for the target VM, I see the echo request *and* the echo reply, and the echo reply's destination MAC is the VPN gateway's, which is expected. However, if I tcpdump the VPN gateway's tap interface on the same node, I never see the echo reply.

If the two VMs (the VPN gateway VM and one of the VLAN 30 VMs) are on different nodes, this works fine. That's why I'm so confused.
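
(In case it helps, these are the kinds of checks that apply on the node here; the bridge name, tap names, and MAC prefix are placeholders.)
Code:
# where does the bridge think the VPN gateway's MAC lives?
bridge fdb show br vmbr0 | grep -i 'bc:24:11'
# are both taps actually members of VLAN 30?
bridge vlan show dev tap101i0
bridge vlan show dev tap102i0
# is bridged traffic being run through iptables/conntrack?
# (only present if the br_netfilter module is loaded)
sysctl net.bridge.bridge-nf-call-iptables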

EDIT: Interestingly enough, if I add a static route on the gateway (192.168.1.1/10.30.0.1) that sends traffic for the VLAN 30 VM via the VPN gateway directly, the traffic gets through. Maybe something weird with the kernel's routing on the nodes when both the VPN gateway VM and the VLAN 30 VM live on the same node?
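
(If the physical gateway happens to be Linux-based, that workaround amounts to something like the following on it; the VM's address is a placeholder.)
Code:
# reach this VLAN 30 VM via the VPN gateway instead of ARPing for it directly
ip route add 10.30.0.50/32 via 10.30.0.254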
 
I had a hunch OVS might be the way to solve this. As a test, I reconfigured the node that currently hosts both of those VMs to use Open vSwitch, and with OVS the traffic flows properly. "ovs-vsctl" shows output similar to "bridge vlan", so I'm honestly not sure why the traditional Linux bridge wasn't cutting the mustard.
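
(For anyone who finds this later: a minimal OVS bridge on a node looks roughly like the stock Debian/Proxmox openvswitch ifupdown layout sketched below; the NIC name, bridge name, and address are placeholders, not my exact config. "ovs-vsctl show" then lists the bridge, its ports, and the VLAN tag on each port.)
Code:
# /etc/network/interfaces (sketch)
auto eno1
iface eno1 inet manual
    ovs_type OVSPort
    ovs_bridge vmbr0

auto vmbr0
iface vmbr0 inet static
    # untagged management IP, as before (placeholder address)
    address 192.168.0.11/23
    gateway 192.168.1.1
    ovs_type OVSBridge
    ovs_ports eno1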