Why is PVE [8.2.2] flooding my DHCP server?

WaxOnWaxOff
Jul 30, 2024
  1. I use the DHCP server in the OPNsense FW (23.x)
  2. It runs as a VM on a separate PVE Node [7.x.x]
  3. The new PVE Node [8.2.2] has two interfaces
  4. One interface, ETH0, is used for PVE management, is bridged by VMBR0, and has a static IP on the same network as the OPNsense FW (192.x.x.1/24)
  5. The other interface, ETH1, is attached to VMBR2, which needs DHCP to assign it an address (192.x.x.x/24)
  6. Another bridge, VMBR3, has two 10.x.x.x/28 IP addresses assigned to it and should forward (NAT) the VMs' traffic via VMBR2, at least that is the intent
  7. The default route for the new PVE Node is set on the VMBR0 interface and points to the OPNsense FW (192.x.x.1/24)
  8. The VMs will be attached to VMBR3 and use the 10.x.x.x/28 IP range

So, the problem is:
-VMBR2 requests an IP address from the OPNsense FW/DHCP server
-it gets one eventually
-no VM is running at all, only the new PVE Node
-when I check the leases on the OPNsense FW/DHCP server, there are a number of abandoned IP addresses
-quickly pulling the cable from the new PVE Node stops this; otherwise the lease range gets used up and no other device on the network can request a new address
-a manual clean-up of the DHCP leases then has to be performed
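
To put a number on the abandoned leases, something like the following could be run on the DHCP server. It is only a sketch: it assumes the ISC dhcpd lease file format, where an abandoned entry carries a `binding state abandoned;` line, and `count_abandoned` is a made-up helper name. The lease file path is passed in by the caller because its location differs between installs.

```shell
#!/bin/sh
# Count entries marked abandoned in an ISC dhcpd lease file.
# Assumption: entries contain a line like "  binding state abandoned;".
# count_abandoned is a hypothetical helper name, not part of OPNsense.
# The lease file is an append-only log, so the same address can be
# counted more than once.
count_abandoned() {
    # $1 = path to the dhcpd.leases file
    # "next binding state ..." lines are not matched because the
    # pattern is anchored to the start of the line.
    grep -cE '^[[:space:]]*binding state abandoned;' "$1"
}
```

Called as `count_abandoned <path-to-dhcpd.leases>`, it prints a single number. A count that keeps climbing while only one new host is plugged in would point at that host.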

Question(s):
-what gives with this?
-shouldn't the DHCP requests stop once an IP address has been given by the OPNsense FW/DHCP server?
 
Update:
-absolutely weird DHCP behaviour, or I'm missing something
-on PVE Node [8.2.2] I changed /etc/network/interfaces as follows:
-ETH0 is attached to VMBR0 and still has a static IP address (192.x.x.x) with the OPNsense FW (192.x.x.1/24) as gateway
-ETH1 is attached to VMBR2 and now has a static IP address with a config similar to VMBR0
-VMBR2 has IP_forward=1, so it forwards to the OPNsense FW (192.x.x.1/24)
-VMBR3 forwards via VMBR2 and has static IP 10.x.x.x/28 with IP_forward=1, and
post-up iptables -t nat -A POSTROUTING -s '10.100.0.0/28' -o vmbr2 -j MASQUERADE
post-down iptables -t nat -D POSTROUTING -s '10.100.0.0/28' -o vmbr2 -j MASQUERADE
-conntrack is enabled

Expectation(s):
-since all bridges have a static IP address assigned to them, why would dhclient on PVE Node [8.2.2] still attempt a DHCP request to the OPNsense FW (192.x.x.1/24) DHCP server?
-what am I missing here?
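
One way to check whether anything on the node is still configured for DHCP is to scan the interfaces file for `inet dhcp` stanzas. The sketch below assumes the usual ifupdown syntax (`iface <name> inet dhcp`); `list_dhcp_ifaces` is a made-up helper name, and it does not follow `source` lines, so any included files would need to be checked separately.

```shell
#!/bin/sh
# List interfaces that an ifupdown-style file configures via DHCP.
# list_dhcp_ifaces is a hypothetical helper name.
list_dhcp_ifaces() {
    # $1 = path to an interfaces file, e.g. /etc/network/interfaces
    # Prints the name from every stanza of the form:
    #   iface <name> inet dhcp
    awk '$1 == "iface" && $3 == "inet" && $4 == "dhcp" { print $2 }' "$1"
}
```

Running `list_dhcp_ifaces /etc/network/interfaces` should print nothing if every bridge is static. Separately, `pgrep -a dhclient` shows whether a dhclient process is still running and with which arguments.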

Outcome:
-the DHCP server on the OPNsense FW (192.x.x.1/24), on the separate PVE Node, gets overwhelmed by DHCP requests to the point that they are abandoned and declined
-this uses up the entire pool of dynamic IP addresses, and clients whose leases are expiring can't get a new IP address assigned until the abandoned IP addresses are deleted
-after pulling the ethernet cable from the PVE Node [8.2.2], deleting the abandoned IP addresses, and rebooting the OPNsense FW (192.x.x.1/24) on the separate PVE Node, everything returns to normal
-I just can't, or dare not, plug the PVE Node [8.2.2] back into the network

Is anyone else experiencing this?
Based on what I see, the DHCP request is looping while the OPNsense FW/DHCP server is just doing its job.

Apparently OPNsense is moving from the ISC DHCP server to the Kea DHCP server starting with release 24.1 (FYI).

I'm stumped and I can't do this anymore since I'm now a wanted man in my own home from people who I thought loved me!
 
Update:
-I think I may have found the potential issue
-on the PVE Node [7.x.x], unicast (assumption) was somehow being invoked, perhaps by the ISP (doubtful), by some unknown bug in the ISC DHCP client, or perhaps even by a PVE 7.x.x bug; as a result, dhclient.leases and dhclient.<interface>.leases files were created/updated in the /var/lib/dhcp/ subfolder
-because of this, the DHCP DISCOVER/OFFER/REQUEST/ACK/REQUEST sequence seemed to be looping, which caused the DHCP server on the OPNsense FW to hand out as many IP addresses as the dynamic IP pool range allowed
-as fast as the IP addresses were handed out, they were abandoned, because the requests were coming from unknown interfaces other than vmbr0/vtnet0; existing clients that wanted to renew their allocated IP address(es) were not able to renew them
-in fact, this hung the DHCP service on the OPNsense FW (an assumption, since no device could obtain/renew its IP)
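
A loop like this can be made visible by tallying the DHCP message types in a log. This is a sketch under assumptions: it expects ISC-style keywords such as DHCPDISCOVER and DHCPREQUEST to appear in the log lines, and `count_dhcp_messages` is a made-up helper name.

```shell
#!/bin/sh
# Tally DHCP message types found in a log file.
# Assumption: the log contains ISC-style keywords (DHCPDISCOVER,
# DHCPOFFER, DHCPREQUEST, DHCPACK, ...).
# count_dhcp_messages is a hypothetical helper name.
count_dhcp_messages() {
    # $1 = path to a log file, e.g. /var/log/syslog
    # Prints one line per message type: "<count> <type>"
    grep -oE 'DHCP(DISCOVER|OFFER|REQUEST|ACK|NAK|DECLINE)' "$1" | sort | uniq -c
}
```

Running it twice a minute apart and comparing the counts gives a rough idea of how fast requests are being generated. On a node that logs only to the journal, the journal output can be written to a file first and that file passed in.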

Solution:
-after reviewing several web posts describing similar situations, I backed up the dhclient.leases and dhclient.<interface>.leases files
-then I deleted the dhclient.leases and dhclient.<interface>.leases files from the /var/lib/dhcp/ subfolder
-then I rebooted the PVE Node [7.x.x]
-once restarted, the /var/lib/dhcp/ subfolder contained only one dhclient.<interface>.leases file, for the Internet-facing interface, which successfully obtained an ISP IP address
-I monitored the syslog on PVE Node [7.x.x] for DHCP DISCOVER/OFFER/REQUEST/ACK/REQUEST messages; even after an hour none appeared
-as a further test, with fingers, toes, heck everything crossed, I powered on the other PVE Node [8.2.2] and monitored both nodes
-I left the PVE Node [8.2.2] management and Internet interfaces on static IP settings, as I didn't wish to risk the Internet interface seeking a dynamic IP from the OPNsense FW DHCP server
-I was able to run a dig www.google.com test on the PVE Node [8.2.2] with success
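
The backup-and-delete step above can be sketched as a small shell function. This is not the exact set of commands used at the time; `backup_and_clear_leases` is a made-up helper name and the backup directory is whatever the caller chooses.

```shell
#!/bin/sh
# Back up dhclient lease files, then remove the originals.
# backup_and_clear_leases is a hypothetical helper name; both
# directory arguments are chosen by the caller.
backup_and_clear_leases() {
    lease_dir=$1    # e.g. /var/lib/dhcp
    backup_dir=$2   # any directory outside lease_dir
    mkdir -p "$backup_dir" || return 1
    # Matches dhclient.leases and dhclient.<interface>.leases
    for f in "$lease_dir"/dhclient*.leases; do
        [ -e "$f" ] || continue    # glob matched nothing
        # Delete a file only after its copy has succeeded.
        cp -p "$f" "$backup_dir"/ && rm -- "$f" || return 1
    done
}
```

For example, `backup_and_clear_leases /var/lib/dhcp /root/dhcp-leases-backup` (the backup path is only an example), followed by the reboot described above.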

Going Forward:
-I will leave this post open for a few days of further testing, which may include the PVE Node [8.2.2] requesting a dynamic IP from the OPNsense FW DHCP server

Sorry for the long replies, but I hope this helps someone!
 
