Email delivery defers after upstream network outage

sagelike

Hi there,

The upstream network at our server farm went down for a few minutes last night and from that point forward, almost all inbound email began to show this error:

Oct 13 01:00:05 mail-gate postfix/qmgr[28090]: 51748194139: to=<xxx@aaa.com>, relay=none, delay=29493, delays=29491/1.5/0/0, dsn=4.4.1, status=deferred (delivery temporarily suspended: connect to 127.0.0.1[127.0.0.1]: Connection refused)

This has happened twice in the past 6 months under the exact same circumstances.

If the upstream connection goes down for a few minutes, email starts to queue and does not resume delivery even after the network comes back up.

Proxmox needs to look into this and fix it.

Also, how can I set up an alert to an external email address that fires when the queue reaches 50 emails, for example?

Thanks
G
 
Is pmg-smtp-filter running? (If not, restart it.)

If it is not running, check the logs for hints as to why it exited.
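
The check described above could be sketched as a small shell function. This is only a sketch under assumptions: that the filter runs as a systemd unit named `pmg-smtp-filter` (the name used in this thread), and that flushing the deferred queue with `postqueue -f` right after a restart is wanted. The function name `ensure_filter_running` and the `SYSTEMCTL` and `POSTQUEUE` variables are hypothetical, added so the commands can be swapped out for a dry run.

```shell
#!/bin/sh
# Hypothetical override hooks; by default the real tools are used.
SYSTEMCTL=${SYSTEMCTL:-systemctl}
POSTQUEUE=${POSTQUEUE:-postqueue}

# Restart the filter if it is not active, then ask Postfix to retry
# the deferred queue. The unit name defaults to the one in this thread.
ensure_filter_running() {
    unit=${1:-pmg-smtp-filter}
    if "$SYSTEMCTL" is-active --quiet "$unit"; then
        echo "$unit is active"
        return 0
    fi
    echo "$unit is not active, restarting"
    "$SYSTEMCTL" restart "$unit" || return 1
    # Flush: attempt delivery of all queued mail now.
    "$POSTQUEUE" -f
}
```

Run as root, `ensure_filter_running` prints the state and restarts only when needed. It does not explain why the filter stopped; `journalctl -u pmg-smtp-filter` is the place to look for that, provided the journal still covers the period in question.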

Also, how can I set up an alert to an external email address which will alert me when the queue reaches 50 emails, for example?
Quite a few monitoring systems have plugins for this (I know that check_mk warns if the postfix queue reaches a certain threshold).
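
Without a monitoring system, a queue threshold check could also be scripted by hand. The following is a minimal sketch, not a PMG feature: it assumes the summary line that `postqueue -p` prints last ("-- N Kbytes in M Requests." or "Mail queue is empty") and a working `mail` command on the host. The function names `queue_count` and `check_queue` are made up for illustration.

```shell
#!/bin/sh
# Read `postqueue -p` output on stdin and print the number of queued
# messages. Exits non-zero if no summary line is recognised.
queue_count() {
    awk '
        /^Mail queue is empty/                     { print 0;  found = 1 }
        /^-- [0-9]+ Kbytes in [0-9]+ Requests?\.$/ { print $5; found = 1 }
        END { if (!found) exit 1 }
    '
}

# Send a mail to $2 when the queue holds at least $1 messages.
# Assumes a `mail` command (e.g. from a mailx package) is installed.
check_queue() {
    threshold=$1
    recipient=$2
    count=$(postqueue -p | queue_count) || return 1
    if [ "$count" -ge "$threshold" ]; then
        printf 'Postfix queue on %s holds %s messages (threshold %s)\n' \
            "$(hostname)" "$count" "$threshold" |
            mail -s "Mail queue alert" "$recipient"
    fi
}
```

To use it, append a call such as `check_queue 50 admin@example.com` (a placeholder address) to the script and run it from cron every few minutes. One caveat: an alert sent through the same Postfix instance may itself get stuck if delivery is broken, so running the check from a separate host is the safer design.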

I hope this helps!
 
Hi Stoiko,

Thanks for your quick response. Unfortunately, it really doesn't help.

First, this doesn't solve the root problem of why pmg-smtp-filter stops working after the upstream network goes down. It has happened twice under the exact same circumstances and shouldn't happen.

Second, the PMG should have a queue alert system built in. This is fundamental to managing an external spam filter like PMG.

I'll look into one of those queue plugins however.

Thanks
G
 
why pmg-smtp-filter stops working after the upstream networking goes down.
Did it stop working? If so, do you have any indication in the logs of what happened? Otherwise it's not really possible to find (and fix) the cause.
 
I ran this command but didn't see anything specific to this problem:

grep ERROR /var/log/syslog

I copied the logs from just before the problem began to when it was resolved. You can see there's a gap in the logs between approx. 1 AM and 9 AM.

Is there another command I can run which might produce more insightful logs?

Oct 13 00:43:34 pmg pmg-smtp-filter[22869]: 201B8F5F854C95373EF: SA score=10/5 time=1.193 bayes=0.22 autolearn=no autolearn_force=no hits=AWL(0.513),BAYES_40(-0.001),DEAR_SOMETHING(1.973),HTML_MESSAGE(0.001),KAM_DMARC_STATUS(0.01),MIME_HTML_ONLY(0.1),RCVD_IN_MSPIKE_BL(0.001),RCVD_IN_MSPIKE_L5(0.001),RCVD_IN_PBL(3.335),RCVD_IN_PSBL(2.7),RCVD_IN_RP_RNBL(1.31),RCVD_IN_SBL(0.141),RDNS_NONE(0.793),T_SPF_HELO_PERMERROR(0.01),T_SPF_PERMERROR(0.01)
Oct 13 00:52:01 pmg pmg-smtp-filter[22992]: 20267B5F854E910062A: SA score=7/5 time=0.496 bayes=0.50 autolearn=no autolearn_force=no hits=AWL(-0.673),BAYES_50(0.8),HTML_IMAGE_ONLY_20(1.546),HTML_MESSAGE(0.001),KAM_DMARC_STATUS(0.01),KAM_NUMSUBJECT(0.5),RCVD_IN_BL_SPAMCOP_NET(1.347),RCVD_IN_SBL_CSS(3.335),RDNS_NONE(0.793),SPF_HELO_NONE(0.001),T_REMOTE_IMAGE(0.01),T_SPF_PERMERROR(0.01)
Oct 13 00:54:52 pmg pmg-smtp-filter[22992]: 20267B5F854F2C3120F: SA score=10/5 time=15.842 bayes=0.50 autolearn=no autolearn_force=no hits=AWL(1.864),BAYES_50(0.8),BODY_URI_ONLY(0.822),DATE_IN_PAST_06_12(1.543),HTML_MESSAGE(0.001),KAM_DMARC_REJECT(3),KAM_DMARC_STATUS(0.01),KAM_SHORT(0.001),MIME_HEADER_CTYPE_ONLY(0.1),MIME_HTML_ONLY(0.1),RCVD_IN_XBL(0.375),RDNS_NONE(0.793),SHORTENED_URL_HREF(0.999),T_SPF_HELO_TEMPERROR(0.01),T_SPF_TEMPERROR(0.01)
Oct 13 00:57:46 pmg pmg-smtp-filter[22992]: 20267B5F854FDD3D093: SA score=0/5 time=13.167 bayes=0.00 autolearn=no autolearn_force=no hits=AWL(0.089),BAYES_00(-1.9),DKIM_INVALID(0.1),DKIM_SIGNED(0.1),HTML_FONT_LOW_CONTRAST(0.001),HTML_IMAGE_RATIO_08(0.001),HTML_MESSAGE(0.001),KAM_DMARC_STATUS(0.01),KAM_REALLYHUGEIMGSRC(1.1),LOTS_OF_MONEY(0.001),MIME_HTML_ONLY(0.1),RCVD_IN_DNSWL_LOW(-0.7),T_SPF_HELO_TEMPERROR(0.01),T_SPF_TEMPERROR(0.01)
Oct 13 00:57:47 pmg pmg-smtp-filter[22869]: 2027DC5F854FDF293BC: SA score=1/5 time=11.901 bayes=0.00 autolearn=no autolearn_force=no hits=BAYES_00(-1.9),HTML_FONT_SIZE_HUGE(0.001),HTML_MESSAGE(0.001),KAM_DMARC_STATUS(0.01),KHOP_HELO_FCRDNS(0.4),MIME_HTML_ONLY(0.1),SPF_HELO_NONE(0.001),TO_EQ_FM_DOM_HTML_ONLY(0.232),T_SPF_TEMPERROR(0.01),URI_PHISH(3.002)
Oct 13 01:02:25 pmg pmg-smtp-filter[23061]: 2027DE5F8550F30B1ED: SA score=0/5 time=14.322 bayes=0.00 autolearn=ham autolearn_force=no hits=AWL(-1.677),BAYES_00(-1.9),DKIM_INVALID(0.1),DKIM_SIGNED(0.1),HTML_MESSAGE(0.001),KAM_DMARC_STATUS(0.01),MIME_HTML_ONLY(0.1),RCVD_IN_DNSWL_MED(-2.3),RDNS_NONE(0.793),T_SPF_HELO_TEMPERROR(0.01),T_SPF_TEMPERROR(0.01)
Oct 13 01:02:28 pmg pmg-smtp-filter[22992]: 2027E05F8550F585DED: SA score=0/5 time=14.820 bayes=0.00 autolearn=ham autolearn_force=no hits=AWL(-1.277),BAYES_00(-1.9),DKIM_INVALID(0.1),DKIM_SIGNED(0.1),HTML_MESSAGE(0.001),KAM_DMARC_STATUS(0.01),MIME_HTML_ONLY(0.1),RCVD_IN_DNSWL_MED(-2.3),T_SPF_HELO_TEMPERROR(0.01),T_SPF_TEMPERROR(0.01)
Oct 13 01:02:39 pmg pmg-smtp-filter[23142]: 2027E35F8551002653A: SA score=0/5 time=15.302 bayes=0.00 autolearn=ham autolearn_force=no hits=AWL(-1.274),BAYES_00(-1.9),DKIM_INVALID(0.1),DKIM_SIGNED(0.1),HTML_MESSAGE(0.001),KAM_DMARC_STATUS(0.01),MIME_HTML_ONLY(0.1),RCVD_IN_DNSWL_MED(-2.3),T_SPF_HELO_TEMPERROR(0.01),T_SPF_TEMPERROR(0.01)
Oct 13 01:02:40 pmg pmg-smtp-filter[23061]: 2027DD5F8551016DB7C: SA score=0/5 time=15.008 bayes=0.00 autolearn=ham autolearn_force=no hits=AWL(-1.271),BAYES_00(-1.9),DKIM_INVALID(0.1),DKIM_SIGNED(0.1),HTML_MESSAGE(0.001),KAM_DMARC_STATUS(0.01),MIME_HTML_ONLY(0.1),RCVD_IN_DNSWL_MED(-2.3),T_SPF_HELO_TEMPERROR(0.01),T_SPF_TEMPERROR(0.01)
Oct 13 08:56:42 pmg pmg-smtp-filter[1070]: 2028785F85C023B2073: SA score=8/5 time=6.283 bayes=0.00 autolearn=no autolearn_force=no hits=BAYES_00(-1.9),FREEMAIL_FORGED_FROMDOMAIN(0.249),FREEMAIL_FROM(0.001),HEADER_FROM_DIFFERENT_DOMAINS(0.25),HTML_MESSAGE(0.001),HTML_MIME_NO_HTML_TAG(0.377),KAM_DMARC_NONE(0.25),KAM_DMARC_STATUS(0.01),MIME_HTML_ONLY(0.1),PDS_HP_HELO_NORDNS(0.001),RCVD_IN_PBL(3.335),RCVD_IN_SBL_CSS(3.335),RCVD_IN_XBL(0.375),RDNS_NONE(0.793),SPOOFED_FREEMAIL(1.651),SPOOFED_FREEMAIL_NO_RDNS(0.001),T_SPF_HELO_TEMPERROR(0.01),T_SPF_TEMPERROR(0.01)
Oct 13 08:56:47 pmg pmg-smtp-filter[1074]: 2027E05F85C025B78A9: SA score=0/5 time=9.304 bayes=0.00 autolearn=ham autolearn_force=no hits=AWL(0.123),BAYES_00(-1.9),DKIM_SIGNED(0.1),DKIM_VALID(-0.1),DKIM_VALID_AU(-0.1),HTML_FONT_LOW_CONTRAST(0.001),HTML_IMAGE_RATIO_02(0.001),HTML_MESSAGE(0.001),KAM_LOTSOFHASH(0.25),RCVD_IN_DNSWL_NONE(-0.0001),RCVD_IN_RP_CERTIFIED(-3),RCVD_IN_RP_SAFE(-2),SPF_PASS(-0.001),T_SPF_HELO_TEMPERROR(0.01)
 
I copied the logs from just before the problem began to when it was resolved.
If you know when the issue began and when it ended, something like:
Code:
journalctl --since '2020-10-13 00:43:00' --until '2020-10-13 08:56:42'
would help (not all problems are logged with 'ERROR').
Adapt the timestamps to when the problem started and ended.
 
That's odd. No logs before 8:56 AM

Code:
root@pmg:~# journalctl --since '2020-10-13 00:43:00' --until '2020-10-13 09:30:00'
-- Logs begin at Tue 2020-10-13 08:56:16 MDT, end at Tue 2020-10-13 11:13:28 MDT. --
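
The "Logs begin at ... 08:56:16" header means the journal holds nothing older than that time. One possible explanation, an assumption rather than something this thread confirms, is that the journal is kept in volatile storage and the host was restarted around 08:56. Since /var/log/syslog still had entries from before the gap, the same window can be pulled from there instead. The helper below is a hypothetical sketch (the name `syslog_window` is made up) that assumes the traditional "Mon DD HH:MM:SS" syslog timestamp and a window within a single day.

```shell
#!/bin/sh
# Print syslog lines from stdin that fall on the given day and between
# two times (inclusive). Times compare as strings, so use HH:MM:SS.
# usage: syslog_window 'Oct 13' 00:43:00 08:57:00 < /var/log/syslog
syslog_window() {
    awk -v day="$1" -v from="$2" -v to="$3" '
        ($1 " " $2) == day && $3 >= from && $3 <= to
    '
}
```

For example, `syslog_window 'Oct 13' 00:43:00 08:57:00 < /var/log/syslog | grep -E 'postfix|pmg-smtp-filter'` would show what both daemons logged around the outage. If the journal does turn out to be volatile, creating /var/log/journal and restarting systemd-journald normally makes it persistent with the default `Storage=auto` setting.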
 
