before queue: spurious SMTP errors

IEM

Well-Known Member
Sep 4, 2018
71
11
48
49
Austria
since i've enabled the "before queue" filtering, i get notifications of smtp-errors every now and then:

Subject: Proxmox SMTP server: errors from outgoing.example.com[198.51.100.97]

Out: 220 mailgate.mydomain.test ESMTP NodeProxy.new.play
In: HELO outgoing.example.com
Out: 250 mailgate.mydomain.test
In: MAIL FROM:<somebody@example.com>
Out: 250 2.1.0 Ok
In: RCPT TO:<user1@mydomain.test>
Out: 250 2.1.5 Ok
In: DATA
Out: 354 End data with <CR><LF>.<CR><LF>
Out: 451 4.3.0 Error: queue file write error
In: QUIT
Out: 221 2.0.0 Bye

the sender/receiver/sending-host tuple looks like totally legit (and includes "trusted" mailserverse; "trusted" in the sense that i trust them to do a good job and not send out spam; but not that i have whitelisted them).

so i checked my logs and for each SMTP error mail, i get a corresponding entry in `mail.log`:

Dec 8 14:05:20 mailgate postfix/smtpd[31568]: warning: timeout talking to proxy 127.0.0.1:10024
Dec 8 14:05:20 mailgate postfix/smtpd[31568]: proxy-reject: END-OF-MESSAGE: 451 4.3.0 Error: queue file write error; from=<somebody@example.com> to=<user1@mydomain.test> proto=SMTP helo=<outgoing.example.com>

i understand that using "before queue" enables a few postfix options that are related to speed-limiting the connection with the spamassassing-proxy (-o smtpd_proxy_options=speed_adjust -o smtpd_client_connection_count_limit=10)
i guess, this speed-limiting is responsible for the smtp-error notifications i'm receiving.

however, since my mailgateways rejects these emails with a temporary 4xx error as it should, I don't understand why I need to receive these error notifications in the first place.
is this just a glitch i'm experiencing as a "beta-tester" of "before queue" filtering, which will go away in the next iterations of PMG?
 
Hi,

Thanks for reporting back your experiences!

hm - AFAIR the responses when postfix start limiting the connections from one client (smtp-server trying to send mail to PMG), and the one from the speed_adjust are different ones.

the queue file write error is most likely the result of the timeout smtpd gets from connecting to `pmg-smtp-filter` on port 10024.
(in before queue filtering the filter/proxy is used instead of writing a queue file...)

A few questions which should help in narrowing down the cause of this:
* How many `pmg-smtp-filter` processes are running when you get those temp-fails in the logs?
* could you share the anonymized log of when this happens?
* could you share your rendered '/etc/postfix/master.cf'?
* do many concurrent mails come from the same source-mailserver? (then maybe the client_connection_count_limit can be adjusted)

Thanks!
 
* How many `pmg-smtp-filter` processes are running when you get those temp-fails in the logs?

hard to tell, as i only get a notification after the fact

* could you share the anonymized log of when this happens?

which parts do you want? what time-frame? which services?
i already attached an anoymized 2-line excerpt to the original question

* could you share your rendered '/etc/postfix/master.cf'?

Code:
26       inet  n -       -       -       100      smtpd
  -o smtpd_proxy_filter=127.0.0.1:10023
  -o smtpd_proxy_options=speed_adjust
  -o smtpd_client_connection_count_limit=10
  -o smtpd_recipient_restrictions=permit_mynetworks,reject_unauth_destination
  -o smtpd_helo_restrictions=
  -o smtpd_client_restrictions=
  -o smtpd_sender_restrictions=

25       inet  n -       -       -       1 postscreen

smtpd       pass  - -       -       -       100      smtpd
  -o smtpd_proxy_filter=127.0.0.1:10024
  -o smtpd_proxy_options=speed_adjust
  -o smtpd_client_connection_count_limit=10
  -o receive_override_options=no_address_mappings
  -o smtpd_discard_ehlo_keywords=silent-discard,dsn
  -o mynetworks=127.0.0.0/8,192.168.171.26

127.0.0.1:10025 inet  n       -       n       -       -      smtpd
  -o content_filter=
  -o local_recipient_maps=
  -o relay_recipient_maps=
  -o receive_override_options=no_unknown_recipient_checks,no_header_body_checks
  -o smtpd_helo_restrictions=
  -o smtpd_client_restrictions=
  -o smtpd_restriction_classes=
  -o smtpd_sender_restrictions=
  -o smtpd_recipient_restrictions=permit_mynetworks,reject
  -o mynetworks=127.0.0.0/8
  -o smtpd_error_sleep_time=0
  -o smtpd_client_connection_count_limit=0
  -o smtpd_client_connection_rate_limit=0
  -o smtpd_tls_security_level=none
  -o smtpd_authorized_xforward_hosts=127.0.0.0/8
  -o message_size_limit=52428800

pickup    fifo  n       -       -       60      1       pickup
cleanup   unix  n       -       -       -       0       cleanup
  -o message_size_limit=52428800

qmgr      fifo  n       -       -       300     1       qmgr
rewrite   unix  -       -       -       -       -       trivial-rewrite
bounce    unix  -       -       -       -       0       bounce
defer     unix  -       -       -       -       0       bounce
flush     unix  n       -       -       1000?   0       flush
proxymap  unix  -       -       n       -       -       proxymap
smtp      unix  -       -       -       -       -       smtp
relay     unix  -       -       -       -       -       smtp
showq     unix  n       -       -       -       -       showq
error     unix  -       -       -       -       -       error
local     unix  -       n       n       -       -       local
virtual   unix  -       n       n       -       -       virtual
lmtp      unix  -       -       n       -       -       lmtp
verify    unix  -       -       -       -       1       verify
trace     unix  -       -       n       -       0       bounce
tlsmgr    unix  -       -       -       1000?   1       tlsmgr
anvil     unix  -       -       -       -       1       anvil
scache    unix  -       -       -       -       1       scache
discard   unix  -       -       -       -       -       discard
retry     unix  -       -       -       -       -       error
dnsblog   unix  -       -       -       -       0       dnsblog
tlsproxy  unix  -       -       -       -       0       tlsproxy

* do many concurrent mails come from the same source-mailserver? (then maybe the client_connection_count_limit can be adjusted

i don't think so.
on one day i got a total of 9 connections from a given remote mailserver, the first one was rejected (**450**) for greylisting, when they tried again 5 minutes lates they were rejected (**451**) because of the proxy timeout, and the next time they tried was 8 minutes later (success with **250**).

here's the anonymized log (grep outgoing.example.com)

Code:
Dec  8 13:57:21 mailgate postfix/smtpd[31451]: connect from outgoing.example.com[198.51.100.97]
Dec  8 13:57:21 mailgate postfix/smtpd[31451]: NOQUEUE: reject: RCPT from outgoing.example.com[198.51.100.97]: 450 4.7.1 <userA@mydomain.test>: Recipient address rejected: Service is unavailable (try later); from=<userB@mydomain.test> to=<userA@mydomain.test> proto=SMTP helo=<outgoing.example.com>
Dec  8 13:57:21 mailgate postfix/smtpd[31451]: disconnect from outgoing.example.com[198.51.100.97] helo=1 mail=1 rcpt=0/1 rset=1 quit=1 commands=4/5

Dec  8 14:03:40 mailgate postfix/smtpd[31568]: connect from outgoing.example.com[198.51.100.97]
Dec  8 14:03:40 mailgate postfix/smtpd[31568]: NOQUEUE: client=outgoing.example.com[198.51.100.97]
Dec  8 14:05:20 mailgate postfix/smtpd[31568]: proxy-reject: END-OF-MESSAGE: 451 4.3.0 Error: queue file write error; from=<userB@mydomain.test> to=<userA@mydomain.test> proto=SMTP helo=<outgoing.example.com>
Dec  8 14:05:20 mailgate postfix/smtpd[31568]: disconnect from outgoing.example.com[198.51.100.97] helo=1 mail=1 rcpt=1 data=0/1 quit=1 commands=4/5
Dec  8 14:05:26 mailgate postfix/smtpd[31661]: CCFBC20F2B: client=localhost.localdomain[127.0.0.1], orig_client=outgoing.example.com[198.51.100.97]

Dec  8 14:13:39 mailgate postfix/smtpd[31713]: connect from outgoing.example.com[198.51.100.97]
Dec  8 14:13:39 mailgate postfix/smtpd[31713]: NOQUEUE: client=outgoing.example.com[198.51.100.97]
Dec  8 14:13:52 mailgate postfix/smtpd[31741]: D1D1120A86: client=localhost.localdomain[127.0.0.1], orig_client=outgoing.example.com[198.51.100.97]
Dec  8 14:13:52 mailgate postfix/smtpd[31713]: proxy-accept: END-OF-MESSAGE: 250 2.5.0 OK (20A3E5DECF70DC4572); from=<userB@mydomain.test> to=<userA@mydomain.test> proto=SMTP helo=<outgoing.example.com>
Dec  8 14:13:52 mailgate postfix/smtpd[31713]: disconnect from outgoing.example.com[198.51.100.97] helo=1 mail=1 rcpt=1 data=1 quit=1 commands=5

Dec  8 17:13:58 mailgate postfix/smtpd[1647]: connect from outgoing.example.com[198.51.100.97]
Dec  8 17:13:58 mailgate postfix/smtpd[1647]: NOQUEUE: client=outgoing.example.com[198.51.100.97]
Dec  8 17:13:59 mailgate postfix/smtpd[1658]: 88D2220F2E: client=localhost.localdomain[127.0.0.1], orig_client=outgoing.example.com[198.51.100.97]
Dec  8 17:13:59 mailgate postfix/smtpd[1647]: proxy-accept: END-OF-MESSAGE: 250 2.5.0 OK (20F255DED2146403BE); from=<user1@example.com> to=<userB@mydomain.test> proto=ESMTP helo=<outgoing.example.com>
Dec  8 17:13:59 mailgate postfix/smtpd[1647]: disconnect from outgoing.example.com[198.51.100.97] ehlo=2 starttls=1 mail=1 rcpt=1 data=1 quit=1 commands=7

Dec  8 17:30:21 mailgate postfix/smtpd[1792]: connect from outgoing.example.com[198.51.100.97]
Dec  8 17:30:21 mailgate postfix/smtpd[1792]: NOQUEUE: client=outgoing.example.com[198.51.100.97]
Dec  8 17:30:22 mailgate postfix/smtpd[1835]: 5E67620A3E: client=localhost.localdomain[127.0.0.1], orig_client=outgoing.example.com[198.51.100.97]
Dec  8 17:30:22 mailgate postfix/smtpd[1792]: proxy-accept: END-OF-MESSAGE: 250 2.5.0 OK (206835DED251D70061); from=<user2@example.com> to=<userB@mydomain.test> proto=SMTP helo=<outgoing.example.com>
Dec  8 17:30:22 mailgate postfix/smtpd[1792]: disconnect from outgoing.example.com[198.51.100.97] helo=1 mail=1 rcpt=1 data=1 quit=1 commands=5

Dec  8 18:51:02 mailgate postfix/smtpd[2750]: connect from outgoing.example.com[198.51.100.97]
Dec  8 18:51:02 mailgate postfix/smtpd[2750]: NOQUEUE: client=outgoing.example.com[198.51.100.97]
Dec  8 18:51:03 mailgate postfix/smtpd[2783]: E1A1F20AAA: client=localhost.localdomain[127.0.0.1], orig_client=outgoing.example.com[198.51.100.97]
Dec  8 18:51:04 mailgate postfix/smtpd[2750]: proxy-accept: END-OF-MESSAGE: 250 2.5.0 OK (20AA35DED380691DDD); from=<user3@example.com> to=<userB@mydomain.test> proto=SMTP helo=<outgoing.example.com>
Dec  8 18:51:04 mailgate postfix/smtpd[2750]: disconnect from outgoing.example.com[198.51.100.97] helo=1 mail=1 rcpt=1 data=1 quit=1 commands=5

Dec  8 23:11:36 mailgate postfix/smtpd[6100]: connect from outgoing.example.com[198.51.100.97]
Dec  8 23:11:36 mailgate postfix/smtpd[6100]: NOQUEUE: reject: RCPT from outgoing.example.com[198.51.100.97]: 450 4.7.1 <userB@mydomain.test>: Recipient address rejected: Service is unavailable (try later); from=<user4@example.com> to=<userB@mydomain.test> proto=SMTP helo=<outgoing.example.com>
Dec  8 23:11:36 mailgate postfix/smtpd[6100]: disconnect from outgoing.example.com[198.51.100.97] helo=1 mail=1 rcpt=0/1 rset=1 quit=1 commands=4/5

Dec  8 23:20:44 mailgate postfix/smtpd[6238]: connect from outgoing.example.com[198.51.100.97]
Dec  8 23:20:44 mailgate postfix/smtpd[6238]: NOQUEUE: client=outgoing.example.com[198.51.100.97]
Dec  8 23:20:56 mailgate postfix/smtpd[6246]: 8783820A50: client=localhost.localdomain[127.0.0.1], orig_client=outgoing.example.com[198.51.100.97]
Dec  8 23:20:56 mailgate postfix/smtpd[6238]: proxy-accept: END-OF-MESSAGE: 250 2.5.0 OK (206835DED7746BFAF1); from=<user4@example.com> to=<userB@mydomain.test> proto=SMTP helo=<outgoing.example.com>
Dec  8 23:20:56 mailgate postfix/smtpd[6238]: disconnect from outgoing.example.com[198.51.100.97] helo=1 mail=1 rcpt=1 data=1 quit=1 commands=5

Dec  8 23:36:16 mailgate postfix/smtpd[6341]: connect from outgoing.example.com[198.51.100.97]
Dec  8 23:36:16 mailgate postfix/smtpd[6341]: NOQUEUE: client=outgoing.example.com[198.51.100.97]
Dec  8 23:36:18 mailgate postfix/smtpd[6356]: 9323020A5A: client=localhost.localdomain[127.0.0.1], orig_client=outgoing.example.com[198.51.100.97]
Dec  8 23:36:18 mailgate postfix/smtpd[6341]: proxy-accept: END-OF-MESSAGE: 250 2.5.0 OK (20A505DED7AE1020D9); from=<user5@example.com> to=<userB@mydomain.test> proto=SMTP helo=<outgoing.example.com>
Dec  8 23:36:18 mailgate postfix/smtpd[6341]: disconnect from outgoing.example.com[198.51.100.97] helo=1 mail=1 rcpt=1 data=1 quit=1 commands=5
 
which parts do you want? what time-frame? which services?
i already attached an anoymized 2-line excerpt to the original question
I think that the units 'pmg-smtp-filter' and 'postfix@-' should be enough
roughly 10 minutes before the problem arises and 10 minutes after it's gone
e.g.
Code:
journalctl -u pmg-smtp-filter -u postfix@- --since '2019-12-08 13:50:00' --until '2019-12-08 14:15:00'

i don't think so.
on one day i got a total of 9 connections from a given remote mailserver, the first one was rejected (**450**) for greylisting, when they tried again 5 minutes lates they were rejected (**451**) because of the proxy timeout, and the next time they tried was 8 minutes later (success with **250**).

here's the anonymized log (grep outgoing.example.com)
You could also try limiting the number of smptd processes on port 25 - use the number that was uses for the lmtp process in the after-queue filtering
configuration (this should be the number of pmg-smtp-filter children - it is logged if you restart pmg-smtp-filter)