451 4.3.0 Error: queue file write error;

poetry

Active Member
May 28, 2020
206
63
33
Code:
Sep 21 13:10:46 server pmg-smtp-filter[97459]: Killing "1" children
Sep 21 13:10:47 server systemd[1]: Stopping Proxmox SMTP Filter Daemon...
Sep 21 13:10:47 server pmg-smtp-filter[97459]: 2021/09/21-13:10:47 Server closing!
Sep 21 13:10:47 server systemd[1]: pmg-smtp-filter.service: Succeeded.
Sep 21 13:10:47 server systemd[1]: Stopped Proxmox SMTP Filter Daemon.
Sep 21 13:10:47 server systemd[1]: pmg-smtp-filter.service: Consumed 24min 4.472s CPU time.
Sep 21 13:10:47 server systemd[1]: Starting Proxmox SMTP Filter Daemon...
Sep 21 13:10:48 server postfix/smtpd[100617]: warning: connect to proxy filter 127.0.0.1:10024: Connection refused
Sep 21 13:10:48 server postfix/smtpd[100617]: proxy-reject: END-OF-MESSAGE: 451 4.3.0 Error: queue file write error; from=<sender@example.com> to=<reciver@example.com> proto=ESMTP helo=<outlook.com>
Sep 21 13:10:48 server pmg-smtp-filter[103908]: Process Backgrounded
Sep 21 13:10:48 server pmg-smtp-filter[103908]: 2021/09/21-13:10:48 main (type Net::Server::PreFork) starting! pid(103908)
Sep 21 13:10:48 server pmg-smtp-filter[103908]: Binding to TCP port 10023 on host 127.0.0.1 with IPv4
Sep 21 13:10:48 server pmg-smtp-filter[103908]: Binding to TCP port 10024 on host 127.0.0.1 with IPv4
Sep 21 13:10:48 server pmg-smtp-filter[103908]: Group Not Defined.  Defaulting to EGID '0'
Sep 21 13:10:48 server pmg-smtp-filter[103908]: User Not Defined.  Defaulting to EUID '0'
Sep 21 13:10:48 server postfix/cleanup[103151]: 1E0801219B7: message-id=<20210921111048.1E0801219B7@server.example.com>
Sep 21 13:10:48 server pmg-smtp-filter[103908]: Setting up serialization via flock
Sep 21 13:10:48 server pmg-smtp-filter[103908]: Filter daemon (re)started (max. 40 processes)
Sep 21 13:10:48 server systemd[1]: Started Proxmox SMTP Filter Daemon.

We get this email to postmaster@server.example.com from MAILER-DAEMON@server.example.com

Code:
Transcript of session follows.

 Out: 220 server.example.com ESMTP server.example.com
 In:  EHLO outlook.com
 Out: 250-server.example.com
 Out: 250-PIPELINING
 Out: 250-SIZE 157286400
 Out: 250-VRFY
 Out: 250-ETRN
 Out: 250-STARTTLS
 Out: 250-ENHANCEDSTATUSCODES
 Out: 250-8BITMIME
 Out: 250 CHUNKING
 In:  STARTTLS
 Out: 220 2.0.0 Ready to start TLS
 In:  EHLO outlook.com
 Out: 250-server.example.com
 Out: 250-PIPELINING
 Out: 250-SIZE 157286400
 Out: 250-VRFY
 Out: 250-ETRN
 Out: 250-ENHANCEDSTATUSCODES
 Out: 250-8BITMIME
 Out: 250 CHUNKING
 In:  MAIL FROM:<sender@example.com> SIZE=2063927
 Out: 250 2.1.0 Ok
 In:  RCPT TO:<replyto@example.com>
 Out: 250 2.1.5 Ok
 In:  RCPT TO:<replyto@example.com>
 Out: 250 2.1.5 Ok
 In:  BDAT 2046876 LAST
 Out: 451 4.3.0 Error: queue file write error
 In:  QUIT
 Out: 221 2.0.0 Bye


For other details, see the local mail logfile

I can provide more logs to staff if needed but can't post them in public as it hard to mask private information for large amount of logs.
 
Sep 21 13:10:48 server postfix/smtpd[100617]: warning: connect to proxy filter 127.0.0.1:10024: Connection refused
This line says, that postfix cannot pass the mail to pmg-smtp-filter

Sep 21 13:10:47 server systemd[1]: Stopping Proxmox SMTP Filter Daemon... Sep 21 13:10:47 server pmg-smtp-filter[97459]: 2021/09/21-13:10:47 Server closing! Sep 21 13:10:47 server systemd[1]: pmg-smtp-filter.service: Succeeded. Sep 21 13:10:47 server systemd[1]: Stopped Proxmox SMTP Filter Daemon.
these lines (which were printed before the connection failure) tell that pmg-smtp-filter was stopped (it seems by something/someone running `systemctl restart pmg-smtp-filter`)


question is what is restarting pmg-smtp-filter?
 
How can I find why postfix cannot pass the mail to pmg-smtp-filter?
This can be seen in this logline:
Sep 21 13:10:48 server postfix/smtpd[100617]: warning: connect to proxy filter 127.0.0.1:10024: Connection refused


can I see somewhere what is the reason for process restart?
IIRC this stopping by systemctl is not happening from PMG's side - do you have any own checks/automation/provisioning/cronjobs in place, which might be doing that?
 
I also encounter similar error today. Not sure it is due to system resources issue....

1632394387714.png

From Syslog
Code:
Sep 23 18:17:40 pmg pmgpolicy[1336]: starting policy database maintenance (greylist, rbl)
Sep 23 18:17:42 pmg pmgpolicy[1336]: found 1 expired mails in greylisting database
Sep 23 18:17:42 pmg pmgpolicy[1336]: end policy database maintenance (2272 ms, 162 ms)
Sep 23 18:18:10 pmg pmgdaemon[86129]: successful auth for user 'root@pam'
Sep 23 18:18:17 pmg pmgdaemon[89891]: successful auth for user 'root@pam'
Sep 23 18:18:27 pmg pmg-smtp-filter[101381]: starting database maintenance
Sep 23 18:18:27 pmg pmg-smtp-filter[101381]: end database maintenance (122 ms)
Sep 23 18:18:28 pmg pmgdaemon[86626]: marked quarantined mail 'C0R16281T200942690' as deleted (/var/spool/pmg/spam/2F/42579614C345362F2F)
Sep 23 18:18:29 pmg pmgdaemon[89891]: marked quarantined mail 'C0R16265T78035133' as deleted (/var/spool/pmg/spam/52/4257B614C28ED9D852)
Sep 23 18:18:30 pmg pmgdaemon[86129]: marked quarantined mail 'C0R16266T130884045' as deleted (/var/spool/pmg/spam/D9/4257D614C28EDB2ED9)
Sep 23 18:18:32 pmg pmgdaemon[86626]: marked quarantined mail 'C0R16255T26270923' as deleted (/var/spool/pmg/spam/2B/4256E614C23E286E2B)
Sep 23 18:18:34 pmg pmgdaemon[86626]: marked quarantined mail 'C0R16249T38854123' as deleted (/var/spool/pmg/spam/89/42565614C1E1CBEE89)
Sep 23 18:18:35 pmg pmgdaemon[89891]: marked quarantined mail 'C0R16247T243317363' as deleted (/var/spool/pmg/spam/53/41F09614C1D9E92753)
Sep 23 18:18:35 pmg pmgdaemon[86626]: marked quarantined mail 'C0R16245T132945257' as deleted (/var/spool/pmg/spam/43/42569614C1D1FB2843)
Sep 23 18:18:36 pmg pmgdaemon[89891]: marked quarantined mail 'C0R16242T247132372' as deleted (/var/spool/pmg/spam/AC/42542614C1A407ADAC)
Sep 23 18:18:37 pmg pmgdaemon[86129]: marked quarantined mail 'C0R16236T89185069' as deleted (/var/spool/pmg/spam/A0/42500614C171E5A2A0)
Sep 23 18:18:38 pmg postfix/smtpd[113635]: warning: timeout talking to proxy 127.0.0.1:10023
Sep 23 18:18:38 pmg postfix/smtpd[113635]: proxy-reject: END-OF-MESSAGE: 451 4.3.0 Error: queue file write error; from=<user1@mydomain.com> to=<user2@senderdomain.com> proto=ESMTP helo=<mail.mydomain.com>
Sep 23 18:18:38 pmg pmgdaemon[86626]: marked quarantined mail 'C0R16205T223066751' as deleted (/var/spool/pmg/spam/E1/42521614C03EDE20E1)
Sep 23 18:18:38 pmg postfix/cleanup[113698]: 969BC402B0: message-id=<20210923101838.969BC402B0@pmg.mydomain.com>
Sep 23 18:18:38 pmg postfix/smtpd[113635]: disconnect from remote.mydomain.com[192.168.40.230] ehlo=1 mail=1 rcpt=1 data=0/1 quit=1 commands=4/5

From tracking center
Code:
Sep 23 18:16:57 pmg postfix/smtpd[113635]: connect from remote.mydomain.com[192.168.40.230]
Sep 23 18:16:57 pmg postfix/smtpd[113635]: NOQUEUE: client=remote.mydomain.com[192.168.40.230]
Sep 23 18:18:38 pmg postfix/smtpd[113635]: warning: timeout talking to proxy 127.0.0.1:10023
Sep 23 18:18:38 pmg postfix/smtpd[113635]: proxy-reject: END-OF-MESSAGE: 451 4.3.0 Error: queue file write error; from=<user1@mydomain.com> to=<user2@senderdomain.com> proto=ESMTP helo=<mail.mydomain.com>
Sep 23 18:18:38 pmg postfix/smtpd[113635]: disconnect from remote.mydomain.com[192.168.40.230] ehlo=1 mail=1 rcpt=1 data=0/1 quit=1 commands=4/5

Code:
Transcript of session follows.

 Out: 220 pmg.mydomain.com ESMTP Proxmox
 In:  EHLO mail.mydomain.com
 Out: 250-pmg.mydomain.com
 Out: 250-PIPELINING
 Out: 250-SIZE 157286400
 Out: 250-VRFY
 Out: 250-ETRN
 Out: 250-ENHANCEDSTATUSCODES
 Out: 250-8BITMIME
 Out: 250-DSN
 Out: 250-SMTPUTF8
 Out: 250 CHUNKING
 In:  MAIL FROM:<user1@mydomain.com> SIZE=478040 RET=HDRS
     ENVID=xxx.614C5416.BA2:c1:983e:N
 Out: 250 2.1.0 Ok
 In:  RCPT TO:<user2@senderdomain.com> NOTIFY=SUCCESS,FAILURE
     ORCPT=rfc822;xxx-user2@senderdomain.com:1:1
 Out: 250 2.1.5 Ok
 In:  DATA
 Out: 354 End data with <CR><LF>.<CR><LF>
 Out: 451 4.3.0 Error: queue file write error
 In:  QUIT
 Out: 221 2.0.0 Bye


For other details, see the local mail logfile
 
Last edited:
Sep 23 18:18:38 pmg postfix/smtpd[113635]: warning: timeout talking to proxy 127.0.0.1:10023
seems pmg-smtp-filter took to long to analyze that mail (but please check the logs (of pmg-smtp-filter) before the ones you pasted and afterwards - there might be some hints what took so long)

since it looks as if it's an internal mail from the ports (10023 is used for outbound mail) - maybe you can find that particular mail and send it again - to see if it's reproducible

I hope this helps!
 
seems pmg-smtp-filter took to long to analyze that mail (but please check the logs (of pmg-smtp-filter) before the ones you pasted and afterwards - there might be some hints what took so long)
Pls refer the attached log files.
 

Attachments

  • mail.err.txt
    2.6 KB · Views: 2
  • mail.log
    18.7 KB · Views: 3
  • mail.warn.txt
    3.3 KB · Views: 2
the mail that caused the error needed more than 2.5 minutes to get scanned by clamav (which explains the timeout you saw):
Code:
Sep 23 18:19:34 pmg pmg-smtp-filter[113636]: 40154614C5419F341B: processing time: 156.345 seconds (2.808, 152.221, 0)
 
Hope I am not hijacking this post but I encounter few more similar case today at around 5:20PM.

Notice there is high IO delay around that time but I am not sure what causing the high usage. I am using the same spec all the while since PMG 6.x and now only notice this error after migrate to PMG 7.x.

I am running PMG on VMware 6.7 with Dell EQL PS4110 SAN storage with 10K SAS disk in RAID6.

Btw, will try out below postfix modification later.

https://lxadm.com/“Error:_queue_file_write_error”_with_Postfix_and_Amavis
 

Attachments

  • 1632479016197.png
    1632479016197.png
    50.8 KB · Views: 6
  • 1632479064411.png
    1632479064411.png
    74.4 KB · Views: 7
  • 1632479120714.png
    1632479120714.png
    16.2 KB · Views: 6
  • 1632479218271.png
    1632479218271.png
    60.1 KB · Views: 7
I am experiencing the same error again. The message that triggered this error was 8MB message with around 90 excel attachments I am guessing that clamav could not process the message and now we are getting delivery errors every 5 minutes. Tried restart, changing the values of clamav to higher numbers and lower and defaults it's all the same...

I have checked all the queues and flush them all but somehow the message is still trying to be delivered. Please help it's filling up mailboxes on our server...

This is where it started
Code:
Dec 6 09:01:35 postfix/smtpd[117713]: connect from server.example.com[1.2.3.4]
Dec 6 09:01:35 postfix/smtpd[117713]: Anonymous TLS connection established from server.example.com[1.2.3.4]: TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)
Dec 6 09:01:35 postfix/smtpd[117713]: NOQUEUE: client=server.example.com[1.2.3.4]
Dec 6 09:03:18 postfix/smtpd[117713]: warning: timeout talking to proxy 127.0.0.1:10024
Dec 6 09:03:18 postfix/smtpd[117713]: proxy-reject: END-OF-MESSAGE: 451 4.3.0 Error: queue file write error; from=<sender@example.com> to=<reciver@example.com> proto=ESMTP helo=<outlook.com>
Dec 6 09:03:18 postfix/smtpd[117713]: disconnect from server.example.com[1.2.3.4] ehlo=2 starttls=1 mail=1 rcpt=2 data=0/1 quit=1 commands=7/8

Dec 6 09:18:21 postfix/smtpd[118700]: connect from server.example.com[1.2.3.4]
Dec 6 09:18:21 postfix/smtpd[118700]: Anonymous TLS connection established from server.example.com[1.2.3.4]: TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)
Dec 6 09:18:21 postfix/smtpd[118700]: NOQUEUE: client=server.example.com[1.2.3.4]
Dec 6 09:20:02 postfix/smtpd[118700]: warning: timeout talking to proxy 127.0.0.1:10024
Dec 6 09:20:02 postfix/smtpd[118700]: proxy-reject: END-OF-MESSAGE: 451 4.3.0 Error: queue file write error; from=<sender@example.com> to=<reciver@example.com> proto=ESMTP helo=<outlook.com>
Dec 6 09:20:02 postfix/smtpd[118700]: disconnect from server.example.com[1.2.3.4] ehlo=2 starttls=1 mail=1 rcpt=2 data=0/1 quit=1 commands=7/8

Dec 6 09:35:02 postfix/smtpd[121966]: connect from server.example.com[1.2.3.4]
Dec 6 09:35:02 postfix/smtpd[121966]: Anonymous TLS connection established from server.example.com[1.2.3.4]: TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)
Dec 6 09:35:03 postfix/smtpd[121966]: NOQUEUE: client=server.example.com[1.2.3.4]
Dec 6 09:36:45 postfix/smtpd[121966]: warning: timeout talking to proxy 127.0.0.1:10024
Dec 6 09:36:45 postfix/smtpd[121966]: proxy-reject: END-OF-MESSAGE: 451 4.3.0 Error: queue file write error; from=<sender@example.com> to=<reciver@example.com> proto=ESMTP helo=<outlook.com>
Dec 6 09:36:45 postfix/smtpd[121966]: disconnect from server.example.com[1.2.3.4] ehlo=2 starttls=1 mail=1 rcpt=2 data=0/1 quit=1 commands=7/8
 

Attachments

  • proxmox1.png
    proxmox1.png
    33.2 KB · Views: 9
Last edited:
I have checked all the queues and flush them all but somehow the message is still trying to be delivered. Please help it's filling up mailboxes on our server...
from the logs it seems that the server trying to relay the message to your PMG has the mail in it's queue (there's always a connect from outside in the logs you posted)

if it's indeed clamav you can disable scanning with clamav through /etc/pmg/pmg.conf
set 'clamav 0' in the admin section and restart pmg-smtp-filter

you should probably also get some more hints from the journal where this problem originates

I hope this helps!
 
  • Like
Reactions: poetry
from the logs it seems that the server trying to relay the message to your PMG has the mail in it's queue (there's always a connect from outside in the logs you posted)

if it's indeed clamav you can disable scanning with clamav through /etc/pmg/pmg.conf
set 'clamav 0' in the admin section and restart pmg-smtp-filter

you should probably also get some more hints from the journal where this problem originates

I hope this helps!

@Stoiko Ivanov thank you. I have temporary disabled clamav and I see that message is now processed OK and finally delivered. Will enable clamav back again.

Can we increase the processing timeout for clamav and pmg-smtp-filter? I have already tried the https://lxadm.com/“Error:_queue_file_write_error”_with_Postfix_and_Amavis tip from @hata_ph but that did not help...

I see this now in the logs on that message that was problematic even with disabled clamav it took so much time to process the message:
Code:
Dec 9 16:10:51 pmg-smtp-filter[1742]: 120CE061B21BDE09BD1: processing time: 156.951 seconds (155.621, 0, 0)
 
Last edited:
the 155.621 seconds are spent in SpamAssassin (and not spamassassin)- do you maybe have custom spamassassin rules (in my experience these do cause this problem sometimes) - else does DNS work?

From the looks of it the setting from @hata_ph should be the right one for before-queue filtering (which I guess you have enabled)

where did you set smtpd_proxy_timeout ? (it probably should be fine to set it in the main.cf.in template (and of course run pmgconfig sync afterwards and restart postfix)

I hope this helps!
 
the 155.621 seconds are spent in SpamAssassin (and not spamassassin)- do you maybe have custom spamassassin rules (in my experience these do cause this problem sometimes) - else does DNS work?

From the looks of it the setting from @hata_ph should be the right one for before-queue filtering (which I guess you have enabled)

where did you set smtpd_proxy_timeout ? (it probably should be fine to set it in the main.cf.in template (and of course run pmgconfig sync afterwards and restart postfix)

I hope this helps!
I do have custom spamassassin rules that I configured via pmg gui. I need them without them the detection is not good enough. DNS does work I might even setup dedicated DNS servers just for this server. Something to do in the future.

I set smtpd_proxy_timeout in the main.cf.in template and run pmgconfig sync afterwards and restart postfix as you said.

Will see the next time I get the same error.
 
I do have custom spamassassin rules that I configured via pmg gui.
custom rules (with custom matches on the body for example) cannot be configured via GUI - you can set the score of certain rules higher (or lower) - but unless the score was 0 (which disables a rule) before you changed it this should not change the run time ...

If you like you could send me the mail (as zipped eml ) and I could try to see if I can either reproduce it or if it's something local in your environment

would it be possible to share your custom rules here? (maybe I can see something which might be improved)

I set smtpd_proxy_timeout in the main.cf.in template and run pmgconfig sync afterwards and restart postfix as you said.

Will see the next time I get the same error.
yes please - keep us posted!
Thanks!
 
custom rules (with custom matches on the body for example) cannot be configured via GUI - you can set the score of certain rules higher (or lower) - but unless the score was 0 (which disables a rule) before you changed it this should not change the run time ...

If you like you could send me the mail (as zipped eml ) and I could try to see if I can either reproduce it or if it's something local in your environment

would it be possible to share your custom rules here? (maybe I can see something which might be improved)


yes please - keep us posted!
Thanks!
@Stoiko Ivanov when I try to send you a message I get this "You may not start a conversation with the following recipients: Stoiko Ivanov.". Are the conversations actually private? I would send you my pmg configuration and the message that started this error if the messages are private.

I mixed Custom Rules with Custom Scores. I only have Custom Scores for spamassassin. I try not to do any kind of modifications that would alter default configuration out of what the GUI can do for exactly that reason (you really don't know the consistences they will bring).
Thanks again.
 
hen I try to send you a message I get this "You may not start a conversation with the following recipients: Stoiko Ivanov.". Are the conversations actually private?
sorry for causing confusion - I meant via mail (s.ivanov _at_ proxmox.com) (as zip file)
 
  • Like
Reactions: poetry
Thanks for the data!

Hope you can reproduce the issue. Your hard work is appreciated!
not reproducible directly:
* took the mail and sent it through the default config - came through in less than 1-2 seconds
* same with your custom-score file

however when I set the spamsize to your value (which is > 100M ) - then scanning took 90 seconds (the difference of 70 seconds could be attributed to the different CPU (and my system not having anything else to do) )

Not sure if there is anything too robust we can do about that - large mails being run through the tons of regular expressions of spamassassin take their time (and answering anything else than 4xx would also be wrong, since the mail might get through during the next try)

* As a mitigation you could consider setting the spam-size lower (this might break DKIM detection in SpamAssassin - without the complete mail-body the body-hash will not match))

I'll think if there's something we could improve here (apart from aborting the scanning before we run into a timeout, which might provide a bit better diagnostics)
 

About

The Proxmox community has been around for many years and offers help and support for Proxmox VE, Proxmox Backup Server, and Proxmox Mail Gateway.
We think our community is one of the best thanks to people like you!

Get your subscription!

The Proxmox team works very hard to make sure you are running the best software and getting stable updates and security enhancements, as well as quick enterprise support. Tens of thousands of happy customers have a Proxmox subscription. Get yours easily in our online shop.

Buy now!