I've been trying to troubleshoot this problem for weeks. It first occurred when I changed my containers' interfaces from veth to venet. The problem revolves around DHCP.
The setup:
I've got 3 computers at play here:
- A DHCP client
- The Proxmox host / router running Shorewall
- DHCPd container (10.0.0.5)
The router (Proxmox host) has several interfaces:
[TABLE="class: outer_border, width: 500, align: left"]
[TR]
[TD]eth1[/TD]
[TD][/TD]
[TD]VLAN Trunk[/TD]
[/TR]
[TR]
[TD]eth1.2[/TD]
[TD]192.168.0.1[/TD]
[TD]Network-guests / DHCP-client VLAN[/TD]
[/TR]
[TR]
[TD]eth1.3[/TD]
[TD]10.0.0.1[/TD]
[TD]DMZ VLAN[/TD]
[/TR]
[TR]
[TD]eth0[/TD]
[TD]1.2.3.4[/TD]
[TD]WAN[/TD]
[/TR]
[TR]
[TD]venet0[/TD]
[TD][/TD]
[TD]POINTOPOINT to Containers[/TD]
[/TR]
[/TABLE]
The problem:
- The DHCP client sends a DHCPDISCOVER packet across the 192.168.0.0/24 subnet/VLAN.
- The router's DHCP relay agent (Debian dhcp-helper), listening on eth1.2 (192.168.0.1), relays the packet to DHCPd (10.0.0.5), presumably through venet0.
- In the DHCPd container, the packet arrives with a source IP address of 1.2.3.4 (WAN), and shortly afterwards a DHCPOFFER packet is sent to the relay agent (192.168.0.1), per the DHCP spec.
- Running tcpdump on the router's venet0, I can see the DHCPDISCOVER go out and the DHCPOFFER come back, but the DHCP relay agent just keeps waiting and never does anything.
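For anyone wondering why the offer goes to 192.168.0.1 even though the relayed packet arrived from 1.2.3.4: per RFC 2131, a DHCP server answers a relayed request at the address in the packet's `giaddr` field, port 67, not at the IP source address. Below is a minimal Python sketch of that rule, assuming the relay puts its eth1.2 address into `giaddr`. The function names, the MAC address and the transaction ID are made up for illustration, and DHCP options are left out.

```python
import socket
import struct

# Fixed BOOTP/DHCP header (RFC 2131): op, htype, hlen, hops, xid, secs, flags,
# ciaddr, yiaddr, siaddr, giaddr, chaddr, sname, file = 236 bytes.
BOOTP_HEADER = struct.Struct("!BBBBIHH4s4s4s4s16s64s128s")
DHCP_SERVER_PORT = 67


def build_relayed_request(giaddr, client_mac, xid):
    """Build the fixed header of a relayed BOOTREQUEST (options omitted)."""
    zero = socket.inet_aton("0.0.0.0")
    return BOOTP_HEADER.pack(
        1,                         # op: BOOTREQUEST
        1,                         # htype: Ethernet
        6,                         # hlen: length of a MAC address
        1,                         # hops: one relay has handled the packet
        xid, 0, 0,                 # xid, secs, flags
        zero, zero, zero,          # ciaddr, yiaddr, siaddr
        socket.inet_aton(giaddr),  # giaddr: the relay agent's address
        client_mac,                # chaddr (null-padded to 16 bytes)
        b"", b"",                  # sname, file (null-padded)
    )


def server_reply_target(packet):
    """Return (address, port) a server replies to, or None if not relayed."""
    fields = BOOTP_HEADER.unpack(packet[:BOOTP_HEADER.size])
    giaddr = socket.inet_ntoa(fields[10])
    if giaddr == "0.0.0.0":
        return None  # not relayed; other reply rules apply
    return (giaddr, DHCP_SERVER_PORT)


if __name__ == "__main__":
    pkt = build_relayed_request("192.168.0.1", bytes.fromhex("020000000001"), 0x1234)
    print(server_reply_target(pkt))
```

So the IP source address the server sees (1.2.3.4) has no influence on where the DHCPOFFER is addressed.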
I suspect it doesn't get the packet. Naturally, my mind jumped to "routing issues," and so I checked the routing table. It seemed correct enough. I then tried to netcat between the router and DHCPd: I ran netcat, listening on 192.168.0.1 and successfully connected to it from the DHCPd container!
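One caveat on the netcat test: netcat uses TCP unless told otherwise, while DHCP runs over UDP. If the test was done in TCP mode, it shows that the address is reachable but not that UDP datagrams reach a socket on 192.168.0.1. A rough UDP equivalent can be sketched in Python as follows. The helper names are hypothetical, and the port must be an unused one, since port 67 is presumably already held by dhcp-helper.

```python
import socket


def open_udp_listener(bind_addr, port=0, timeout=2.0):
    """Bind a UDP socket; port 0 lets the kernel choose a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    sock.bind((bind_addr, port))
    return sock


def send_udp_probe(dest_addr, dest_port, payload=b"probe"):
    """Send a single UDP datagram to dest_addr:dest_port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (dest_addr, dest_port))


if __name__ == "__main__":
    # Loopback demo; on the real setup the listener would run on the router
    # bound to 192.168.0.1 and the probe would be sent from the container.
    listener = open_udp_listener("127.0.0.1")
    port = listener.getsockname()[1]
    send_udp_probe("127.0.0.1", port)
    data, peer = listener.recvfrom(1024)
    print(data, peer)
    listener.close()
```

If `recvfrom` times out on the router while tcpdump shows the datagram arriving on venet0, the packet is being dropped somewhere between the interface and the socket.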
At some point, I thought Linux was filtering them out as Martian packets. I checked and double-checked after enabling logging of Martian packets. This is not the case.
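To make the Martian check reproducible, the relevant per-interface settings can be read straight from `/proc/sys/net/ipv4/conf/<iface>/`. The sketch below is a small Python helper (the name `read_conf_flags` is hypothetical) that reads `rp_filter` and `log_martians` for one interface. It assumes the standard Linux procfs layout and returns `None` for anything it cannot read.

```python
from pathlib import Path


def read_conf_flags(iface, names=("rp_filter", "log_martians"),
                    root="/proc/sys/net/ipv4/conf"):
    """Return {setting: int or None} for one interface's IPv4 settings."""
    flags = {}
    for name in names:
        path = Path(root) / iface / name
        try:
            flags[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            flags[name] = None  # missing interface or unreadable value
    return flags


if __name__ == "__main__":
    # "all" is included because the kernel combines it with the
    # per-interface value for both of these settings.
    for iface in ("all", "venet0", "eth1.2", "eth0"):
        print(iface, read_conf_flags(iface))
```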
As for whether the firewall could be filtering these out... I doubt it. I loosened the settings to pretty much accept anything. It still didn't fix anything.
Correct me if I'm wrong here, but I think that Linux cannot successfully "route" the reply to the process (dhcp-helper) because the request left with the WAN interface's address (1.2.3.4) as its source, but the reply returned through the venet0 interface addressed to 192.168.0.1.
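One way to check the first half of this theory is to ask the kernel which interface and source address it picks for traffic to the DHCPd container, for example with `ip route get 10.0.0.5`. The Python sketch below parses that command's one-line answer. `parse_route_get` is a hypothetical helper, and the sample string is illustrative, not output captured from this router.

```python
def parse_route_get(output):
    """Extract destination, gateway, device and source from `ip route get`."""
    tokens = output.split()
    result = {"dst": tokens[0] if tokens else None}
    for key in ("via", "dev", "src"):
        if key in tokens:
            idx = tokens.index(key)
            if idx + 1 < len(tokens):
                result[key] = tokens[idx + 1]
    return result


if __name__ == "__main__":
    # Illustrative sample only; real output depends on the routing table.
    sample = "10.0.0.5 dev venet0 src 1.2.3.4 uid 0 \n    cache"
    print(parse_route_get(sample))
```

A `src` of 1.2.3.4 on a route through venet0 would match the source address seen inside the container. It would not by itself explain why the reply to 192.168.0.1 never reaches dhcp-helper.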
I'm kinda getting desperate here. I could just switch back to veth interfaces, but I really would prefer venet interfaces. Any help would be greatly appreciated!