PMG - No route to host, Network Unreachable

alexkenon

Hi. There are 2 PMG instances (and 2 MX records). Both PMGs are in a cluster. The first PMG sends email through the first ISP, the second PMG through the second ISP. The PMG version is 8.0.7.

DNS runs on the domain controllers (MS Active Directory). The NTP servers are set to Google's addresses (8.8.8.8 and 8.8.4.4).

Both PMG have been working like this for 3-4 years. There were no problems at all.

For the last 10 days there has been a problem with the second PMG. No settings were changed or added. The ISP checked everything and reports no problems on their side.

The problem is this: on the second PMG, outgoing emails periodically (quite often) get stuck in the queue. Incoming emails are not affected (they all arrive immediately). The queued outgoing emails show the statuses "No route to host" and "Network Unreachable".

If I open the queue and run "Flush Queue", it is a matter of luck: the emails may leave immediately, or only on the 20th attempt (so some leave within 1 minute, others after 30 minutes). In other words, after clicking Flush Queue the emails either leave at once or land back in the queue with the same statuses, and I have to wait and retry. I can't tell what it depends on. It's as if the PMG doesn't know where to send the mail, and then on the 10th or 20th attempt it figures out where it needs to go.

To repeat: everything works perfectly on the first PMG. On the second PMG there are no problems with incoming messages; only outgoing emails are affected.

The logs contain no additional information. Everything is the same as in the screenshot: just the same statuses, "No route to host" and "Network Unreachable". There is nothing else to go on.

What are the possible causes? The settings on both hosts are unchanged and identical.
 

Attachments: 1.PNG, 2.PNG
Hm, at first glance it sounds as if the second PMG cannot reach the hosts in question. If it's many different hosts (i.e. mails to different domains), my guess would be that there is an issue with your second ISP.

Do you see anything suspicious in the logs in the timeframe when mails cannot be sent through your second PMG? Check them with `journalctl --since '2023-10-18 16:03:00' --until '2023-10-19 17:00:00'` (adapt the times to when you actually experienced issues).
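
A minimal sketch of narrowing that journal output down to the two errors shown in the queue. It assumes the Postfix SMTP client logs under the syslog identifier `postfix/smtp` (the usual tag on Debian-based systems, but worth verifying on your PMG). The function names `filter_unreachable` and `unreachable_in_window` are made up for illustration:

```shell
#!/bin/sh
# Keep only log lines that mention the two errors seen in the queue.
# "is" is optional so both "Network is unreachable" and
# "Network Unreachable" match; matching is case-insensitive.
filter_unreachable() {
    grep -Ei 'no route to host|network (is )?unreachable'
}

# Hypothetical wrapper: $1 = start time, $2 = end time (journalctl syntax).
# Assumption: outgoing delivery attempts are logged with the
# syslog identifier "postfix/smtp".
unreachable_in_window() {
    journalctl -t postfix/smtp --since "$1" --until "$2" --no-pager \
        | filter_unreachable
}
```

For example: `unreachable_in_window '2023-10-18 16:03:00' '2023-10-19 17:00:00'`. If the matching lines name many unrelated destination IPs, that would fit the guess above that the path through the second ISP is the problem rather than a single remote server.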

I hope this helps!
 
Hi! Thanks for the quick reply!

So it is most likely the provider after all?

The provider replied as follows:
From the screenshot I see that this is the output of web server errors.
I do not know by what criterion these errors are classified, but I assume it is based on the ICMP response codes that the far side sends in reply to incoming connections.


You should refer to the documentation of your mail server for an explanation of what these errors mean and what they are associated with. But it is definitely not related to the data network; rather, it is related to some kind of anti-spam or anti-fraud lists.

Our own email transport from this subnet works perfectly; everything is delivered to everyone. We provide a default route, so what you described, "no route to host", doesn't make sense in the context of the problem.
Connectivity (with other public networks) in our autonomous system is very high.
You can diagnose availability and connectivity with ping and traceroute, respectively.

I should say right away that we pinged the gateway of the second ISP at the moments when mail was not leaving and was stuck in the queue. The pings went through with no packet loss.

There was one more peculiarity: between October 11 and October 19 there were 2 days with no problems on the second PMG at all. This is very suspicious, since we didn't touch or change any settings then.

The provider says the problem is with PMG. How can I respond to that? =/
 
As said, check the journal when the issue occurs. If you happen to notice it while it's going on, the usual steps for verifying where the issue is are:
* ping the IP of your default gateway
* ping an IP on the public internet (8.8.8.8 is a good target)
* ping a DNS name on the public internet (google.com)

If all of those work fine and there are still issues, see if you can reach the target SMTP server on port 25 (`nc -v <target-host-ip> 25`).
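
The steps above could be scripted so they run together and leave a timestamped record. This is only a sketch: `run_check` and `check_path` are hypothetical helper names, and it assumes the Linux `ping` (`-c` count, `-W` timeout in seconds) and an `nc` that supports `-z` and `-w`:

```shell
#!/bin/sh
# Run one check quietly and print "<timestamp> <label> OK|FAIL".
# Returns 0 if the check command succeeded, 1 otherwise.
run_check() {
    label=$1; shift
    if "$@" >/dev/null 2>&1; then
        status=OK; rc=0
    else
        status=FAIL; rc=1
    fi
    printf '%s %s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$label" "$status"
    return "$rc"
}

# Hypothetical helper: $1 = default gateway IP, $2 = IP of a remote SMTP server.
check_path() {
    run_check "gateway($1)" ping -c 3 -W 2 "$1"
    run_check "public-ip"   ping -c 3 -W 2 8.8.8.8
    run_check "public-dns"  ping -c 3 -W 2 google.com
    run_check "smtp($2:25)" nc -z -w 5 "$2" 25
}
```

Calling `check_path <gateway-ip> <target-host-ip>` at the moment mails are stuck prints one line per step. A run where the gateway and 8.8.8.8 answer but the port 25 check fails would be concrete evidence to send to the ISP. `ip route get <target-host-ip>` additionally shows which interface and gateway the kernel picks for that destination.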
 
