Mail duplicated several times under heavy load

zerrac

Dec 19, 2022
Proxmox Mail Gateway 7.2-1 (no licence), installed on top of a bare Debian Bullseye.
My company develops and hosts websites, and we are considering using PMG as an outgoing mail gateway.

During our tests we noticed that many emails were received five times or more. This coincides with the following pattern in the Tracking Center:

[Screenshot: Capture d’écran du 2022-12-19 17-00-07.png]

The scenario is:
-> Postfix on the webserver sends a mail using the mail gateway as its relayhost
-> the mail is processed by pmg-smtp-filter, but due to heavy load this takes more than 100s (the default value of smtpd_proxy_timeout)
-> Postfix on the webserver receives an error:
Code:
Dec 15 17:25:24 ...-vm postfix/smtpd[20623]: 8E9AC1C354C: client=localhost[127.0.0.1]
Dec 15 17:25:24 ...-vm postfix/cleanup[16343]: 8E9AC1C354C: message-id=<167112152457.14933.8733748449407334134@xxxx>
Dec 15 17:25:24 ...-vm postfix/qmgr[7575]: 8E9AC1C354C: from=<xx@xxx.xx>, size=144776, nrcpt=1 (queue active)
Dec 15 17:29:53 ...-vm postfix/smtp[9291]: 8E9AC1C354C: hostxxxx said: 451 4.3.0 Error: queue file write error (in reply to end of DATA command)
-> pmg-smtp-filter continues processing in the background and ultimately injects the mail into the outgoing queue
Code:
Dec 15 17:30:46 mailgateway pmg-smtp-filter[1383526]: 2A1A49639B4B1D0F270: accept mail to <xxx@xxx.fr> (33F8B2809E3) (rule: default-accept)
Dec 15 17:30:46 mailgateway pmg-smtp-filter[1383526]: 2A1A49639B4B1D0F270: processing time: 153.39 seconds (152.261, 0.425, 0)
-> Postfix on the webserver retries the mail because it was deferred, and every step is repeated until the load decreases and pmg-smtp-filter is able to process the mail in less than 100s
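
The retry loop described in the steps above can be reproduced with a small simulation. This is only a sketch under stated assumptions: `simulate_deliveries` is a made-up name, only the first duration (153.39s) comes from the log, the other durations are invented, and the real Postfix retry schedule is not modelled.

```python
PROXY_TIMEOUT = 100.0  # seconds; default value of smtpd_proxy_timeout


def simulate_deliveries(processing_times, proxy_timeout=PROXY_TIMEOUT):
    """Return (attempts, copies) for one mail sent through the gateway.

    processing_times holds the hypothetical filter duration of each
    delivery attempt, in the order the webserver makes them.
    """
    copies = 0
    for attempt, seconds in enumerate(processing_times, start=1):
        # The filter finishes and injects the mail in every case, even
        # when the smtpd in front of it has already timed out.
        copies += 1
        if seconds <= proxy_timeout:
            # The webserver gets a success reply and stops retrying.
            return attempt, copies
        # Otherwise the webserver received "451 4.3.0" and defers the mail.
    return len(processing_times), copies


attempts, copies = simulate_deliveries([153.39, 140.0, 120.0, 110.0, 60.0])
print(f"{attempts} attempts, {copies} copies delivered")
```

Every attempt that runs past the timeout still produces one copy, so four slow attempts followed by one fast attempt give five copies of the same mail.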

I think pmg-smtp-filter should not inject a mail into the outgoing queue if the original delivery has already been deferred. I am pretty sure the bug won't happen without before-queue filtering, but for several reasons I have to use it.

As a workaround I raised smtpd_proxy_timeout to an absurdly long duration and added some CPUs to the VM.
 
Could you please share the complete log of that timeframe? This might help in getting a better overview.

A 152-second delay for spam checking is a bit odd. The last time I saw something like that, the reason was a broken DNS setup (make sure the first DNS server configured in /etc/resolv.conf works as expected).
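
A quick way to get a feel for resolver latency is to time a few lookups on the gateway itself. The following is a minimal sketch, not a PMG tool: `time_lookup` is a hypothetical helper, and it goes through the system resolver (`socket.getaddrinfo`), so it also honours /etc/hosts and does not test one specific nameserver from /etc/resolv.conf.

```python
import socket
import time


def time_lookup(hostname, resolver=socket.getaddrinfo, clock=time.monotonic):
    """Return (seconds, error) for one lookup through the system resolver."""
    start = clock()
    try:
        resolver(hostname, None)
        error = None
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        error = str(exc)
    return clock() - start, error
```

Calling `time_lookup("proxmox.com")` a few times should return well under a second per lookup on a healthy setup. Lookups that take several seconds or fail would point towards the DNS problem mentioned above.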
 
I am testing the mail gateway on a very small VM and sent hundreds of mails in less than a minute. The load was over 35 when the duplication happened. I know my VM is seriously undersized.

The point is: if pmg-smtp-filter processes a mail for more than 100s (for any reason) and ultimately accepts it and injects it into the outgoing queue, the mail can be sent several times. (At least with before-queue filtering, since in that case the mail is deferred and the resend is done by another Postfix.)
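
To see how often this happens, the filter's own log lines can be scanned for processing times above the timeout. A minimal sketch, assuming the log format is exactly the one shown in the excerpt earlier in this thread; `slow_mails` is a made-up helper name.

```python
import re

# Matches lines such as:
# ... pmg-smtp-filter[1383526]: 2A1A49639B4B1D0F270: processing time: 153.39 seconds (...)
PROCESSING_RE = re.compile(
    r"pmg-smtp-filter\[\d+\]: (?P<id>[0-9A-F]+): "
    r"processing time: (?P<secs>\d+(?:\.\d+)?) seconds"
)


def slow_mails(lines, limit=100.0):
    """Yield (filter_id, seconds) for entries that took longer than limit."""
    for line in lines:
        match = PROCESSING_RE.search(line)
        if match and float(match["secs"]) > limit:
            yield match["id"], float(match["secs"])


sample = [
    "Dec 15 17:30:46 mailgateway pmg-smtp-filter[1383526]: 2A1A49639B4B1D0F270: "
    "processing time: 153.39 seconds (152.261, 0.425, 0)",
]
print(list(slow_mails(sample)))
```

Each entry it reports is a mail that was probably deferred on the sending side while still being queued by the gateway.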

Maybe pmg-smtp-filter should have a timeout of its own, just a little under 100s, to prevent this scenario?
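
Roughly what I have in mind, as a sketch only. This is Python, not the actual pmg-smtp-filter code, and `filter_mail`, `analyze`, `inject` and `SAFETY_MARGIN` are hypothetical names used for illustration.

```python
import time

PROXY_TIMEOUT = 100.0  # seconds; Postfix default for smtpd_proxy_timeout
SAFETY_MARGIN = 5.0    # hypothetical margin, not a PMG setting


def filter_mail(analyze, inject, budget=PROXY_TIMEOUT - SAFETY_MARGIN,
                clock=time.monotonic):
    """Run analyze(), then inject() only if the time budget still holds."""
    start = clock()
    analyze()
    if clock() - start >= budget:
        # Too slow: answer with a temporary failure and queue nothing,
        # so that the sender's retry is the only copy.
        return "tempfail"
    inject()
    return "accepted"
```

The check sits between the analysis and the injection, so a mail that took too long is never queued and the sender's retry does not create a duplicate.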
 
It's an edge case I have not really encountered in the wild, but yes, you're right: with before-queue filtering it is possible to get such duplicate mails (which is at least a cosmetic issue).

If you want, you can open an enhancement request at https://bugzilla.proxmox.com for such an addition.

Still, the 100s seems odd for a single mail. Make sure your DNS setup is working.
 