Losing DHCP IP

Oct 11, 2020
I have 4 servers running PMG (Debian 11.5 and the latest stable PMG version).

The IP is configured through DHCP from the datacenter. Every day all VMs lose their DHCP-assigned IP, and they lose it at the same time (they stop talking to each other and then lose external network communication). When I use the datacenter's remote console viewer, I can log in and see that the network is down (no IP configured).

When I try to restart the network service (/etc/init.d/networking restart, as root) I see "RTNETLINK answers: Permission denied", but it works (although the systemd unit networking is reported as down). If I reboot the VM, it works again, getting the IP and syncing with the master node, but the next morning it all happens again.

My questions:
  1. In which cases can PMG take control of the network and do that? Or is that not possible?
  2. When I was installing this PMG cluster, I had to fix cluster.conf manually because, for some reason, PMG was configuring the Hetzner private IP gateway instead of the public IP, even though that network (the private IP) was not configured there. I have no idea why PMG was doing that. I also had to turn off the IPv6 network stack because PMG was binding to IPv6 instead of IPv4, without giving me any option to choose. I know that PMG will listen on both, but it tries to configure the nodes to use IPv6, which I do not want.
    1. Do I need to change any file other than cluster.conf to fix this?
    2. Can this cause the network IP problem? I do not see any reason for PMG to touch the network config, but, well, maybe there is a reason...
 
PMG needs to be able to resolve its hostname to a valid IP to work - in quite a few setups with DHCP this is not the case.
So if possible, consider simply deploying the 4 nodes with static IPs.

Otherwise:
"Everyday all VMs are losing the DHCP, and they lose at the same time"
This sounds odd - is there anything in the logs (of the PMG instances or the DHCP server)?

To debug this further, check:
* /etc/hosts
* /etc/hostname
* /etc/network/interfaces
* `ping $(uname -n)`
and the journal
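
The name-resolution part of these checks can be sketched as a small script. This is only a sketch under assumptions: it assumes `getent` and `awk` are installed, and `is_loopback` is a helper name made up for this example, not something PMG ships.

```shell
#!/bin/sh
# Return 0 if the given address is a loopback address (127.0.0.0/8 or ::1).
is_loopback() {
    case "$1" in
        127.*|::1) return 0 ;;
        *) return 1 ;;
    esac
}

# Resolve this machine's node name through libc (honours /etc/hosts).
name=$(uname -n)
addr=$(getent hosts "$name" 2>/dev/null | awk 'NR==1 {print $1}')

if [ -z "$addr" ]; then
    echo "$name does not resolve"
elif is_loopback "$addr"; then
    echo "$name resolves to loopback ($addr)"
else
    echo "$name resolves to $addr"
fi
```

For the journal, something like `journalctl -b -u networking` should show the messages of the networking unit for the current boot.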

I hope this helps!
 
Thank you for the reply. Yes, the hostname resolves.

In journalctl I found a few things, but I am not sure how relevant they are.

For example:
Code:
Sep 27 15:21:26 gateway.domain.com pmgmirror[1232]: database sync 'gateway3' failed - DBI connect('dbname=Proxmox_ruledb;host=/run/pmgtunnel;port=7;','root',...) failed: could not connect to server: No such file or directory
Sep 27 15:23:27 gateway.domain.com pmgmirror[1232]: database sync 'gateway3' failed - DBI connect('dbname=Proxmox_ruledb;host=/run/pmgtunnel;port=7;','root',...) failed: could not connect to server: No such file or directory
Sep 27 15:25:26 gateway.domain.com pmgmirror[1232]: database sync 'gateway3' failed - DBI connect('dbname=Proxmox_ruledb;host=/run/pmgtunnel;port=7;','root',...) failed: could not connect to server: No such file or directory
Sep 27 15:27:26 gateway.domain.com pmgmirror[1232]: database sync 'gateway3' failed - DBI connect('dbname=Proxmox_ruledb;host=/run/pmgtunnel;port=7;','root',...) failed: could not connect to server: No such file or directory

My PMG nodes are named gateway.domain.com, gateway2.domain.com, gateway3.domain.com and gateway4.domain.com. Only gateway.domain.com exists as a DNS record, and it is on Cloudflare (I do not think that is relevant).

Code:
Sep 27 08:19:36 gateway.domain.com pmgpolicy[117375]: getaddrinfo: Temporary failure in name resolution at /usr/share/perl5/PVE/Tools.pm line 1451.
Sep 27 08:19:36 gateway.domain.com pmgpolicy[117377]: getaddrinfo: Temporary failure in name resolution at /usr/share/perl5/PVE/Tools.pm line 1451.
Sep 27 08:19:36 gateway.domain.com pmgpolicy[117378]: getaddrinfo: Temporary failure in name resolution at /usr/share/perl5/PVE/Tools.pm line 1451.
Sep 27 08:19:36 gateway.domain.com pmgpolicy[117379]: getaddrinfo: Temporary failure in name resolution at /usr/share/perl5/PVE/Tools.pm line 1451.

Checking the file, I can see this line:
Code:
    my ($err, @addrs) = Socket::getaddrinfo($nodename, undef, $hints);

But I do not know which name is failing. I can ping the server itself:
Code:
root@gateway:~# ping $(uname -n)
PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.032 ms
64 bytes from localhost (127.0.0.1): icmp_seq=2 ttl=64 time=0.084 ms
64 bytes from localhost (127.0.0.1): icmp_seq=3 ttl=64 time=0.068 ms

I am also seeing this:
Code:
Sep 27 15:23:27 gateway.domain.com pmgmirror[1232]: database sync 'gateway3' failed - DBI connect('dbname=Proxmox_ruledb;host=/run/pmgtunnel;port=7;','root',...) failed: could not connect to server: No such file or directory
                                                               connections on Unix domain socket "/run/pmgtunnel/.s.PGSQL.7"? at /usr/share/perl5/PMG/DBTools.pm line 66.
Sep 27 15:23:35 gateway.domain.com pmgtunnel[741]: restarting crashed tunnel 9153 1.2.3.4
Sep 27 15:23:38 gateway.domain.com pmgtunnel[741]: tunnel finished 9153 1.2.3.4

Does PMG need to resolve the nodes by name? By FQDN, like gateway2.domain.com, or maybe just gateway2?

I will add all the hosts to /etc/hosts just to check.
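
As a rough sketch of what such entries could look like, the helper below prints one /etc/hosts line per node, carrying both the FQDN and the short name so either form resolves. The 192.0.2.x addresses are documentation placeholders, not the real node IPs, and `hosts_line` is a name invented for this example.

```shell
#!/bin/sh
# Print one /etc/hosts line: "<ip> <fqdn> <short name>".
hosts_line() {
    ip=$1
    fqdn=$2
    short=${fqdn%%.*}   # drop everything from the first dot onward
    printf '%s %s %s\n' "$ip" "$fqdn" "$short"
}

# Placeholder addresses (TEST-NET-1); replace with each node's real IP.
hosts_line 192.0.2.10 gateway.domain.com
hosts_line 192.0.2.11 gateway2.domain.com
hosts_line 192.0.2.12 gateway3.domain.com
hosts_line 192.0.2.13 gateway4.domain.com
```

The output can be reviewed first and then appended to /etc/hosts on each node by hand.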
 
It seems that we found the issue:
We had disabled the IPv6 stack in the sysctl.conf file in the past. I saw that the networking service could not restart cleanly, but the IPv4 config was being renewed, and on the servers where I did this the problem did not occur anymore.

Hetzner support said that there is a bug in the current dhclient package that makes the entire IP renewal process fail if the IPv6 stack is disabled. I have not found this bug in any bug tracker yet, but after enabling IPv6 again, I can see that the networking service restarts without problems. So I think this was the issue. I will come back later with further feedback.
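
To see whether a node is in the state described above, the kernel's IPv6 switch can be read from /proc. This is a minimal sketch assuming the usual Linux /proc layout; `ipv6_state` is a helper made up for this example, and the thread does not say which exact sysctl key was set, so `net.ipv6.conf.all.disable_ipv6` is an assumption.

```shell
#!/bin/sh
# Map the kernel flag value to a readable state: 1 means IPv6 is disabled.
ipv6_state() {
    case "$1" in
        0) echo enabled ;;
        1) echo disabled ;;
        *) echo unknown ;;
    esac
}

flag=/proc/sys/net/ipv6/conf/all/disable_ipv6
if [ -r "$flag" ]; then
    echo "IPv6 (all interfaces): $(ipv6_state "$(cat "$flag")")"
else
    echo "IPv6 (all interfaces): unknown ($flag not readable)"
fi

# To re-enable at runtime (as root), assuming this key is what was set:
#   sysctl -w net.ipv6.conf.all.disable_ipv6=0
```

A runtime change like this does not survive a reboot unless the matching line in sysctl.conf is also removed or changed.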
 
