"Network not reachable"

otter90

New Member
Feb 24, 2020
Dear community,
I have a strange network issue with containers and no solution so far. In my Proxmox installation I have various VMs and containers for different purposes (iobroker, zoneminder, fhem, weewx, CupsPrint, Raspberrymatic...). For two of my containers (used for iobroker and zoneminder, both Debian), the network connection breaks down exactly 10 days after a reboot, and the containers are no longer reachable over the network. They are still up and running, and I can access them via the Proxmox GUI/console. Every network-related command (e.g. ping xyz) run in the console fails with the error message "Network not reachable". When I stop and restart the containers, everything works fine again, for exactly 10 days...
Of course I can post some more specifications of the containers if required for analysis.
Would be glad if anyone had an idea.
Thanks + Regards
otter90

[EDIT] I just made an additional observation. If I reboot before the network breaks down, e.g. 5 days after the last failure, the next breakdown still occurs 10 days after the first reboot (and not 10 days after the second). This means that only a reboot immediately after a breakdown starts a new 10-day uptime period. Example:
[01.01.] network breakdown, manual reboot
[05.01.] manual reboot
[11.01.] network breakdown, manual reboot
[21.01.] network breakdown, manual reboot
and so on ...
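
The schedule in the example can be reproduced with a few lines of Python. This is only a sketch: it assumes the period is exactly 10 days and counts from the last breakdown rather than from any reboot in between. The year and the function name are placeholders chosen for illustration, since the example only gives day and month.

```python
from datetime import date, timedelta
from typing import List

PERIOD = timedelta(days=10)  # assumed fixed interval between breakdowns


def next_breakdowns(last_breakdown: date, count: int) -> List[date]:
    """Return the next `count` expected breakdown dates.

    Manual reboots in between are deliberately ignored, because the
    observation is that they do not reset the countdown.
    """
    return [last_breakdown + PERIOD * (i + 1) for i in range(count)]


if __name__ == "__main__":
    # Hypothetical year; only day and month are known from the example.
    for d in next_breakdowns(date(2020, 1, 1), 2):
        print(d.strftime("%d.%m."))  # prints 11.01. then 21.01.
```

The reboot on 05.01. does not appear in the calculation at all, which is exactly the behaviour described in the edit.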
 
Is there anything relevant in the syslog from the day when the network breaks down?
 
Thanks for asking. I hope this is the right part of the logs. At least it mentions the network, and it is from the time of the last network crash:
Jul 6 08:27:31 IOBrokerLXC ntpd[201]: Deleting interface #3 eth0, 192.168.200.93#123, interface stats: received=9284, sent=9284, dropped=0, active_time=863703 secs
Jul 6 08:27:31 IOBrokerLXC ntpd[201]: 192.168.200.1 local addr 192.168.200.93 -> <null>
Jul 6 08:27:34 IOBrokerLXC bash[193]: Error: connect ENETUNREACH 192.168.200.21:8999 - Local (0.0.0.0:0)
Jul 6 08:27:34 IOBrokerLXC bash[193]: at internalConnect (net.js:923:16)
Jul 6 08:27:34 IOBrokerLXC bash[193]: at defaultTriggerAsyncIdScope (internal/async_hooks.js:313:12)
Jul 6 08:27:34 IOBrokerLXC bash[193]: at net.js:1011:9
Jul 6 08:27:34 IOBrokerLXC bash[193]: at processTicksAndRejections (internal/process/task_queues.js:79:11) {
Jul 6 08:27:34 IOBrokerLXC bash[193]: errno: 'ENETUNREACH',
Jul 6 08:27:34 IOBrokerLXC bash[193]: code: 'ENETUNREACH',
Jul 6 08:27:34 IOBrokerLXC bash[193]: syscall: 'connect',
Jul 6 08:27:34 IOBrokerLXC bash[193]: address: '192.168.200.21',
Jul 6 08:27:34 IOBrokerLXC bash[193]: port: 8999
Jul 6 08:27:34 IOBrokerLXC bash[193]: }
Jul 6 08:27:34 IOBrokerLXC bash[193]: Messages Error: Error: connect ENETUNREACH 192.168.200.21:8999 - Local (0.0.0.0:0)
Jul 6 08:27:34 IOBrokerLXC bash[193]: at internalConnect (net.js:923:16)
Jul 6 08:27:34 IOBrokerLXC bash[193]: at defaultTriggerAsyncIdScope (internal/async_hooks.js:313:12)
Jul 6 08:27:34 IOBrokerLXC bash[193]: at net.js:1011:9
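
One detail in the ntpd line at the top of this log is worth a quick check: the active_time field, which appears to be the number of seconds the interface was in use before ntpd dropped it. The sketch below converts that value to days. The regular expression and the function name are made up for illustration and are not part of any ntpd tooling.

```python
import re
from datetime import timedelta

# First line of the container log, copied from the post.
LOG_LINE = (
    "Jul 6 08:27:31 IOBrokerLXC ntpd[201]: Deleting interface #3 eth0, "
    "192.168.200.93#123, interface stats: received=9284, sent=9284, "
    "dropped=0, active_time=863703 secs"
)


def active_time(line: str) -> timedelta:
    """Extract the active_time field from an ntpd 'Deleting interface' line."""
    match = re.search(r"active_time=(\d+) secs", line)
    if match is None:
        raise ValueError("no active_time field found")
    return timedelta(seconds=int(match.group(1)))


if __name__ == "__main__":
    uptime = active_time(LOG_LINE)
    print(uptime)                       # 9 days, 23:55:03
    print(timedelta(days=10) - uptime)  # 0:04:57
```

The interface was up for just under 10 days, which matches the interval described in the first post. Whether that points at something with a 10-day lifetime on the interface, such as a DHCP lease, is only a guess and would need to be checked against the container's network configuration.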
 
I.e., the network goes down roughly every ten days, independent of any reboot?

Is there anything in the host's syslog around that time? You can use the syslog panel in the Proxmox VE web interface (Node -> Syslog) or the journalctl CLI tool.

Any firewall configured or the like?
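
For the journalctl route mentioned above, a small Python helper can build the command for a window around the failure. This is a sketch under a few assumptions: the timestamp (including the year) is a placeholder, the function name is invented for illustration, and journalctl is assumed to be on the PATH and to accept "YYYY-MM-DD HH:MM:SS" strings for its --since and --until options.

```python
from datetime import datetime, timedelta
from typing import List


def journalctl_window(center: datetime, minutes: int = 5) -> List[str]:
    """Build a journalctl command covering `minutes` before and after `center`."""
    delta = timedelta(minutes=minutes)
    fmt = "%Y-%m-%d %H:%M:%S"
    return [
        "journalctl", "--no-pager",
        "--since", (center - delta).strftime(fmt),
        "--until", (center + delta).strftime(fmt),
    ]


if __name__ == "__main__":
    # Placeholder timestamp; replace it with the time of the actual breakdown.
    cmd = journalctl_window(datetime(2020, 7, 6, 8, 27, 31))
    print(" ".join(cmd))
    # On the Proxmox host, the command could then be run with:
    #   import subprocess; subprocess.run(cmd, check=True)
```

Printing the command first makes it easy to check the window before running it on the host.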
 
It's not "roughly" every 10 days, it's exactly 10 days, as if someone had scheduled a countdown.
I have set up a FHEM routine on another client that continuously pings this machine, so I know precisely when it goes down.
No firewall is configured (to my knowledge). This is the host's log from the same time:
Jul 06 08:27:26 pve kernel: nfs: server 192.168.200.91 OK
Jul 06 08:27:26 pve kernel: nfs: server 192.168.200.91 not responding, still trying
Jul 06 08:27:26 pve kernel: nfs: server 192.168.200.91 OK
Jul 06 08:27:26 pve kernel: nfs: server 192.168.200.91 OK
Jul 06 08:27:27 pve pvedaemon[1931]: got timeout
Jul 06 08:27:27 pve pvedaemon[14022]: got timeout
Jul 06 08:27:27 pve pvestatd[1019]: got timeout
Jul 06 08:27:28 pve kernel: nfs: server 192.168.200.91 not responding, still trying
Jul 06 08:27:28 pve kernel: nfs: server 192.168.200.91 not responding, still trying
Jul 06 08:27:28 pve kernel: nfs: server 192.168.200.91 OK
Jul 06 08:27:28 pve kernel: nfs: server 192.168.200.91 OK
Jul 06 08:27:28 pve kernel: nfs: server 192.168.200.91 not responding, still trying
Jul 06 08:27:28 pve kernel: nfs: server 192.168.200.91 not responding, still trying
Jul 06 08:27:28 pve kernel: nfs: server 192.168.200.91 not responding, still trying
Jul 06 08:27:29 pve pvestatd[1019]: status update time (5.067 seconds)
Jul 06 08:27:31 pve kernel: rpc_check_timeout: 22 callbacks suppressed
Jul 06 08:27:31 pve kernel: nfs: server 192.168.200.91 not responding, still trying
Jul 06 08:27:31 pve kernel: call_decode: 37 callbacks suppressed
Jul 06 08:27:31 pve kernel: nfs: server 192.168.200.91 OK
Jul 06 08:27:31 pve kernel: nfs: server 192.168.200.91 not responding, still trying
Jul 06 08:27:31 pve kernel: nfs: server 192.168.200.91 not responding, still trying
Jul 06 08:27:31 pve kernel: nfs: server 192.168.200.91 not responding, still trying
Jul 06 08:27:31 pve kernel: nfs: server 192.168.200.91 OK
Jul 06 08:27:31 pve kernel: nfs: server 192.168.200.91 OK
Jul 06 08:27:31 pve kernel: nfs: server 192.168.200.91 not responding, still trying
Jul 06 08:27:31 pve kernel: nfs: server 192.168.200.91 not responding, still trying
Jul 06 08:27:31 pve kernel: nfs: server 192.168.200.91 OK
Jul 06 08:27:31 pve kernel: nfs: server 192.168.200.91 OK
Jul 06 08:27:31 pve kernel: nfs: server 192.168.200.91 OK
Jul 06 08:27:31 pve kernel: nfs: server 192.168.200.91 not responding, still trying
Jul 06 08:27:31 pve kernel: nfs: server 192.168.200.91 not responding, still trying
Jul 06 08:27:31 pve kernel: nfs: server 192.168.200.91 OK
Jul 06 08:27:31 pve kernel: nfs: server 192.168.200.91 OK
Jul 06 08:27:31 pve kernel: nfs: server 192.168.200.91 OK
Jul 06 08:27:31 pve kernel: nfs: server 192.168.200.91 not responding, still trying
Jul 06 08:27:31 pve kernel: nfs: server 192.168.200.91 not responding, still trying
Jul 06 08:27:31 pve kernel: nfs: server 192.168.200.91 OK
Jul 06 08:27:36 pve kernel: rpc_check_timeout: 43 callbacks suppressed
 
Same issue here: the VMs frequently lose their IPv6 connection, and the host itself lost its IPv4 (DHCP) connection a few hours ago.

Code:
ntpd[1064]: Deleting interface #3 eth0, 198.xxx.147.87#123, interface stats: received=972, sent=942, dropped=0, active_time=86396 secs
 
Hi @otter90 / @Vengance

Have you found a solution to this problem yet? I have been restarting my server every day for a year and a half, and only now decided to look into the cause, which is how I came across this post.
 
