Different response from members of the same cluster (Invalid EHLO/HELO domain)

jrendon

Jan 11, 2023
I have a two-node cluster and a problem with certain IPs from which I need to receive email: one of the nodes always rejects mail from those IPs with an error message, while the master node accepts it without issue.

Even after adding these IPs to the whitelist, the error still occurs.
I had the problem in version 6 and did a clean installation of version 7, restoring the backup, but the problem continues.

Below are the logs of two tests run against both the main node and the secondary node, first with the IP outside the whitelist and then with it included in the whitelist.


Telnet main node
Code:
telnet pmg.domain.com 25
Trying x.x.x.x...
Connected to pmg.domain.com.
Escape character is '^]'.
220-pmg.domain.com ESMTP Company
220 pmg.domain.com ESMTP Company
EHLO pve1611.corp.company2.com
250-pmg.domain.com
250-SIZE 52428800
250-VRFY
250-ETRN
250-STARTTLS
250-ENHANCEDSTATUSCODES
250-8BITMIME
250-DSN
250-SMTPUTF8
250 CHUNKING
MAIL FROM:<root@pve1611.corp.company2.com>
250 2.1.0 Ok

Bash:
Jan 11 13:25:23 pmg postfix/postscreen[1278508]: CONNECT from [x.x.x.x]:60194 to [x.x.x.x]:25
Jan 11 13:25:23 pmg postfix/dnsblog[1279768]: addr x.x.x.x listed by domain b.barracudacentral.org as 127.0.0.2
Jan 11 13:25:29 pmg postfix/postscreen[1278508]: DNSBL rank 1 for [x.x.x.x]:60194


Telnet secondary node
Code:
telnet pmg1.domain.com 25
Trying x.x.x.x...
Connected to pmg1.domain.com.
Escape character is '^]'.
220 pmg1.domain.com ESMTP Company
EHLO pve1611.corp.company2.com
250-pmg1.domain.com
250-SIZE 20971520
250-VRFY
250-ETRN
250-STARTTLS
250-ENHANCEDSTATUSCODES
250-8BITMIME
250-DSN
250 SMTPUTF8
MAIL FROM:<root@pve1611.corp.company2.com>
550 5.5.0 Invalid EHLO/HELO domain.

Bash:
Jan 11 13:29:54 pmg1 postfix/postscreen[163670]: CONNECT from [x.x.x.x]:32995 to [x.x.x.x]:25
Jan 11 13:29:54 pmg1 postfix/dnsblog[163714]: addr x.x.x.x listed by domain b.barracudacentral.org as 127.0.0.2
Jan 11 13:30:00 pmg1 postfix/postscreen[163670]: DNSBL rank 1 for [x.x.x.x]:32995


Telnet main node with whitelisted IP

Code:
telnet pmg.domain.com 25
Trying x.x.x.x...
Connected to pmg.domain.com.
Escape character is '^]'.
220 pmg.domain.com ESMTP Company
EHLO pve1611.corp.company2.com
250-pmg.domain.com
250-PIPELINING
250-SIZE 52428800
250-VRFY
250-ETRN
250-STARTTLS
250-ENHANCEDSTATUSCODES
250-8BITMIME
250-SMTPUTF8
250 CHUNKING
MAIL FROM:<root@pve1611.corp.company2.com>
250 2.1.0 Ok

Bash:
Jan 11 13:37:16 pmg postfix/postscreen[1296377]: CONNECT from [x.x.x.x]:60196 to [x.x.x.x]:25
Jan 11 13:37:16 pmg postfix/postscreen[1296377]: WHITELISTED [x.x.x.x]:60196
Jan 11 13:37:16 pmg postfix/smtpd[1296381]: warning: hostname xxxxx-xxxxxx-x-xx-xx.xxx.xxx does not resolve to address x.x.x.x
Jan 11 13:37:16 pmg postfix/smtpd[1296381]: connect from unknown[x.x.x.x]

Telnet secondary node with whitelisted IP
Code:
telnet pmg1.domain.com 25
Trying x.x.x.x...
Connected to pmg1.domain.com.
Escape character is '^]'.
220 pmg1.domain.com ESMTP Company
EHLO pve1611.corp.company2.com
250-pmg1.domain.com
250-SIZE 20971520
250-VRFY
250-ETRN
250-STARTTLS
250-ENHANCEDSTATUSCODES
250-8BITMIME
250 SMTPUTF8
MAIL FROM:<root@pve1611.corp.company2.com>
550 5.5.0 Invalid EHLO/HELO domain.

Bash:
Jan 11 13:43:27 pmg1 postfix/postscreen[188555]: CONNECT from [x.x.x.x]:27551 to [x.x.x.x]:25
Jan 11 13:43:27 pmg1 postfix/postscreen[188555]: WHITELISTED [x.x.x.x]:27551
Jan 11 13:43:27 pmg1 postfix/smtpd[188569]: warning: hostname xxxxx-xxxxxx-x-xx-xx.xxx.xxx does not resolve to address x.x.x.x
Jan 11 13:43:27 pmg1 postfix/smtpd[188569]: connect from unknown[x.x.x.x]
Jan 11 13:44:17 pmg1 postfix/smtpd[188569]: lost connection after EHLO from unknown[x.x.x.x]
Jan 11 13:44:17 pmg1 postfix/smtpd[188569]: disconnect from unknown[x.x.x.x] ehlo=1 commands=1


I compared the postfix configuration files and found no evident difference between the nodes. I also cleared the postscreen cache, but the problem continues.
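Because the transcripts above were captured by hand, a small script can make differences between the two EHLO replies easier to spot. The following is a minimal sketch and was not part of the original tests: the function names `ehlo_extensions` and `compare_extensions` are made up for illustration, and the two strings are the EHLO exchanges copied from the non-whitelisted transcripts above. It only parses pasted text and does not contact any server.

```python
def ehlo_extensions(transcript):
    """Parse a pasted SMTP session and return {KEYWORD: parameters} from the EHLO reply."""
    reply = []
    for raw in transcript.splitlines():
        line = raw.strip()
        if line.startswith("250-"):
            reply.append(line[4:])
        elif line.startswith("250 "):
            reply.append(line[4:])
            break  # "250 " with a space marks the last line of the EHLO reply
    extensions = {}
    # reply[0] is the greeting carrying the server's hostname, not an extension.
    for item in reply[1:]:
        parts = item.split(None, 1)
        if parts:
            extensions[parts[0].upper()] = parts[1] if len(parts) > 1 else ""
    return extensions


def compare_extensions(first, second):
    """Summarise keywords missing on one side and keywords whose parameters differ."""
    return {
        "only_first": sorted(first.keys() - second.keys()),
        "only_second": sorted(second.keys() - first.keys()),
        "differs": sorted(k for k in first.keys() & second.keys() if first[k] != second[k]),
    }


# EHLO replies copied from the non-whitelisted tests in this post.
MAIN_NODE = """
250-pmg.domain.com
250-SIZE 52428800
250-VRFY
250-ETRN
250-STARTTLS
250-ENHANCEDSTATUSCODES
250-8BITMIME
250-DSN
250-SMTPUTF8
250 CHUNKING
"""

SECONDARY_NODE = """
250-pmg1.domain.com
250-SIZE 20971520
250-VRFY
250-ETRN
250-STARTTLS
250-ENHANCEDSTATUSCODES
250-8BITMIME
250-DSN
250 SMTPUTF8
"""

result = compare_extensions(ehlo_extensions(MAIN_NODE), ehlo_extensions(SECONDARY_NODE))
print(result)
```

On these two transcripts it reports CHUNKING as advertised only by the main node and SIZE as differing (52428800 vs 20971520). Whether that is related to the rejection is not established here, but it may indicate that the two listeners are not configured identically.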

Thank you for any help you can provide.
 
Jan 11 13:37:16 pmg postfix/smtpd[1296381]: warning: hostname xxxxx-xxxxxx-x-xx-xx.xxx.xxx does not resolve to address x.x.x.x
On a hunch: does DNS resolution work on the second node at all? And if so, does it produce the same responses as on the working node?

Especially regarding the reverse lookup for the IP.
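One way to run that comparison on both nodes is a forward-confirmed reverse DNS check: look up the PTR name for the IP, then resolve that name back and see whether the IP is among the results. This is only a minimal sketch, assuming IPv4 and Python's standard `socket` module. The function name `reverse_lookup_matches` is hypothetical, and the resolver functions are passed in as parameters only so the logic can be exercised without real DNS.

```python
import socket


def reverse_lookup_matches(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """Return (ptr_name, confirmed) for an IPv4 address.

    confirmed is True only if the PTR name resolves back to the same IP.
    """
    try:
        hostname = reverse(ip)[0]          # PTR lookup
    except OSError:                        # herror/gaierror are OSError subclasses
        return None, False
    try:
        addresses = forward(hostname)[2]   # A lookup for the PTR name
    except OSError:
        return hostname, False
    return hostname, ip in addresses


# Run on each node with the sender's address and compare the results, e.g.:
#   print(reverse_lookup_matches("192.0.2.10"))   # 192.0.2.10 is a placeholder
```

If both nodes return the same tuple for the affected IP, DNS is probably not what distinguishes them.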

On a side note: please use code tags to paste command-line output and logs; it's a bit easier to read.

I hope this helps!
 
Hello Stoiko,

Both nodes have the same DNS servers configured and the resolution works correctly.

As the logs show, both nodes generate the warning that the hostname does not resolve to the connecting IP, so that would not be the reason for the rejection.

Bash:
Jan 11 13:37:16 pmg postfix/smtpd[1296381]: warning: hostname xxxxx-xxxxxx-x-xx-xx.xxx.xxx does not resolve to address x.x.x.x
Jan 11 13:43:27 pmg1 postfix/smtpd[188569]: warning: hostname xxxxx-xxxxxx-x-xx-xx.xxx.xxx does not resolve to address x.x.x.x

Thanks
 
Hmm, a few things to check:
* compare the postfix config of both systems (main.cf and master.cf)
* does the cluster sync work as expected? (check the journal)
* are there other messages in the log around the time the message got rejected that might indicate what's not working?
* do you have any modifications to the postfix config templates on this cluster?
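For the first point, a parameter-level comparison can be easier to read than a raw line diff, since it ignores comments, ordering, and line wrapping. The following is a minimal sketch under stated assumptions: the function names `parse_main_cf` and `config_differences` are hypothetical, the two sample configs contain made-up values (they are not taken from the poster's files), and `myhostname` is treated as an expected per-node difference. The same parser should also accept `postconf -n` output, which uses the same `name = value` layout.

```python
def parse_main_cf(text):
    """Parse main.cf-style text into {parameter: value}, joining continuation lines."""
    params = {}
    current = None
    for raw in text.splitlines():
        # Skip blank lines and lines whose first non-whitespace character is '#'.
        if not raw.strip() or raw.lstrip().startswith("#"):
            continue
        # A line starting with whitespace continues the previous logical line.
        if raw[0].isspace() and current is not None:
            params[current] = (params[current] + " " + raw.strip()).strip()
            continue
        name, sep, value = raw.partition("=")
        if not sep:
            continue
        current = name.strip()
        params[current] = value.strip()
    return params


def config_differences(first, second, expected=("myhostname",)):
    """Return {parameter: (first_value, second_value)} for unexpected differences."""
    keys = (first.keys() | second.keys()) - set(expected)
    return {k: (first.get(k), second.get(k)) for k in sorted(keys) if first.get(k) != second.get(k)}


# Made-up sample configs for illustration only.
NODE_A = """
# sample only
myhostname = pmg.domain.com
message_size_limit = 10240000
smtpd_helo_restrictions =
    permit_mynetworks,
    reject_invalid_helo_hostname
"""

NODE_B = """
myhostname = pmg1.domain.com
message_size_limit = 20480000
smtpd_helo_restrictions = permit_mynetworks, reject_invalid_helo_hostname
"""

differences = config_differences(parse_main_cf(NODE_A), parse_main_cf(NODE_B))
print(differences)
```

Note that this covers main.cf only. Per-service overrides in master.cf (the `-o name=value` lines) are not parsed here and would still need a separate comparison.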
 
As you can see, there are no differences in the configuration files and no errors in the synchronization either. I have not identified any additional log messages related to that connection, and we have no modifications to the templates.

Bash:
diff main_pmg.cf main_pmg1.cf
23c23
< myhostname = pmg.domain.com
---
> myhostname = pmg1.domain.com

Bash:
diff master_pmg.cf master_pmg1.cf
93c93
<   -o mynetworks=127.0.0.0/8, x.x.x.x
---
>   -o mynetworks=127.0.0.0/8, x.x.x.x

Bash:
Jan 12 09:58:18 pmg pmgmirror[901879]: starting cluster synchronization
Jan 12 09:58:19 pmg pmgmirror[901879]: cluster synchronization finished  (0 errors, 0.58 seconds (files 0.36, database 0.22, config 0.00))

Bash:
Jan 12 09:59:12 pmg1 pmgmirror[26917]: starting cluster synchronization
Jan 12 09:59:17 pmg1 pmgmirror[26917]: cluster synchronization finished  (0 errors, 5.19 seconds (files 0.23, database 4.47, config 0.49))

Thanks.
 
Sorry, I don't see what could be off. If the postfix config is almost identical and your DNS setup is working, I don't see why the two nodes should behave differently...
 
Hi!
I encountered the same problem and searched for a solution for a long time... thinking it was a DNS resolution issue.
Ultimately, it was a cluster member configuration problem: emails were being delivered to the internal port of PMG and not the external port!
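To see which listener a sending host actually reaches, it can help to probe both ports of a node and compare the EHLO replies. This is a rough sketch using Python's standard `smtplib`. The function name `probe_listener` and the HELO name are placeholders, and ports 25 (external) and 26 (internal) are assumed here to be the PMG defaults, so adjust them if your setup differs.

```python
import smtplib


def probe_listener(host, port, helo_name="probe.example.test", timeout=10.0):
    """Connect to host:port, send EHLO, and return (reply_code, sorted extension names)."""
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        code, _ = smtp.ehlo(helo_name)
        # esmtp_features is filled in by ehlo(); its keys are lower-cased keywords.
        return code, sorted(smtp.esmtp_features)


# Example (host name taken from this thread, ports are assumptions):
#   for port in (25, 26):
#       print(port, probe_listener("pmg1.domain.com", port))
```

If the extension list seen by the sender matches the internal port rather than the external one, traffic is probably being forwarded to the wrong listener somewhere in front of the node.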
 
