[SOLVED] Some nodes do not send mail correctly

linkstat

Hello.

I have a 7-node cluster, with the same Postfix configuration (identical /etc/postfix/main.cf except for the mydestination line, where, among other things, the corresponding hostname is specified).

Mail for root@pam is correctly configured from the Web GUI, and I can verify that mail is configured from bash:
Bash:
cat /etc/pve/user.cfg | grep root@pam
user:root@pam:1:0:::my.address@some.mail.com:::
Furthermore, if on each node, I run:
Bash:
echo "Test from PVE Node: $(hostname)" | /usr/bin/pvemailforward
the mails are dispatched correctly.

But if instead, the command to execute is:
Bash:
echo "Another test from PVE Node: $(hostname)" | mail -s "A test from $(hostname)" root

Four of the seven nodes correctly send the mail to the destination, but the other three try to send it to root@nodename.localdomain, and in Gmail I get a Delivery Status Notification (Failure) from Mail Delivery Subsystem <mailer-daemon@googlemail.com>.

I thought it might be the Debian alternatives configuration, but on all nodes, the configuration is the same:
Bash:
ls -lh /usr/bin/mail
lrwxrwxrwx 1 root root 22 Jun 16 2015 /usr/bin/mail -> /etc/alternatives/mail

ls -lh /etc/alternatives/mail
lrwxrwxrwx 1 root root 18 Jun 16 2015 /etc/alternatives/mail -> /usr/bin/bsd-mailx

Finally, the .forward file exists in the root directory of all nodes, and the content is the same:
Bash:
cat /root/.forward
|/usr/bin/pvemailforward

Since I don't know what else to look at, I'm asking here in the hope that someone can give me a hand.

Thank you!!!
 
But if instead, the command to execute is:
Bash:
echo "Another test from PVE Node: $(hostname)" | mail -s "A test from $(hostname)" root
On a hunch the issue might be:
except for the mydestination line, where, among other things, the corresponding hostname is specified).
Do all nodes have correct entries in their /etc/hosts (pointing the name in mydestination to an IP on the host)?
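One way to run that check on each node, as a rough sketch: the helper below looks up a name in an /etc/hosts-style file and prints the first address mapped to it. The function name `hosts_ip_for` and the sample file contents are made up for illustration and are not part of Proxmox or Postfix.

```shell
# Sketch: print the first IP that an /etc/hosts-style file maps to a name.
# hosts_ip_for is a hypothetical helper name used only in this example.
hosts_ip_for() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v n="$name" '
        { sub(/#.*/, "") }     # drop trailing comments
        NF < 2 { next }        # skip blank or incomplete lines
        { for (i = 2; i <= NF; i++) if ($i == n) { print $1; exit } }
    ' "$file"
}

# Demo against an illustrative sample file (addresses are assumptions).
sample=$(mktemp)
cat > "$sample" <<'EOF'
127.0.0.1        localhost.localdomain      localhost
10.4.44.2        node-b.mydom.local       node-b        pvelocalhost
10.4.44.1        node-a.mydom.local   node-a
EOF

ip=$(hosts_ip_for node-b.mydom.local "$sample")
case $ip in
    "")    echo "node-b.mydom.local: no entry" ;;
    127.*) echo "node-b.mydom.local: loopback only ($ip)" ;;
    *)     echo "node-b.mydom.local: $ip" ;;
esac
rm -f "$sample"
```

On a real node, `hosts_ip_for "$(hostname -f)"` reads /etc/hosts directly, and `getent hosts "$(hostname -f)"` asks the system resolver the same question.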

to say more - we'd need the journal from one node where the forwarding works, and from one where it doesn't
(just starting with the first line from a postfix process)
 
Hi Stoiko Ivanov.
In /etc/postfix/main.cf, only the hostname changes from node to node, for example:

PVE Node 1 (sends mail correctly):
Code:
myhostname=node-a.mydom.local
mydestination = $myhostname, node-a.mydom.local, localhost.mydom.local, localhost

PVE Node 2 (does not send emails correctly):
Code:
myhostname=node-b.mydom.local
mydestination = $myhostname, node-b.mydom.local, localhost.mydom.local, localhost

The /etc/hosts files are practically the same on all nodes (except for the specific node name itself). For example:
PVE Node 1:
Code:
127.0.0.1        localhost.localdomain      localhost
10.4.44.1        node-a.mydom.local       node-a        pvelocalhost

# ProxmoxVE Cluster
10.4.44.2        node-b.mydom.local   node-b
10.4.44.3        node-c.mydom.local   node-c
...

PVE Node 2:
Code:
127.0.0.1        localhost.localdomain      localhost
10.4.44.2        node-b.mydom.local       node-b        pvelocalhost

# ProxmoxVE Cluster
10.4.44.1        node-a.mydom.local   node-a
10.4.44.3        node-c.mydom.local   node-c
...

and so on...

Regarding the mail logs, here they are.
First, I ran the mail command on each node:
Bash:
echo "Mail de prueba desde PVE Node: $(hostname)" | mail -s "PVE Mail Test desde nodo $(hostname)" root
and then looked at the mail log with cat /var/log/mail.log
(ignore the "Network is unreachable" lines on some hosts; they appear because Postfix first tries to connect via IPv6 instead of IPv4)

node-a (sends mails correctly):
Code:
Apr 21 08:19:14 node-a postfix/pickup[687772]: 268534C0817: uid=0 from=<root>
Apr 21 08:19:14 node-a postfix/cleanup[692401]: 268534C0817: message-id=<20220421111914.268534C0817@node-a.mydom.lan>
Apr 21 08:19:14 node-a postfix/qmgr[3339358]: 268534C0817: from=<root@node-a.mydom.lan>, size=469, nrcpt=1 (queue active)
Apr 21 08:19:14 node-a postfix/pickup[687772]: D3CF64C084F: uid=65534 from=<root>
Apr 21 08:19:14 node-a postfix/cleanup[692401]: D3CF64C084F: message-id=<20220421111914.268534C0817@node-a.mydom.lan>
Apr 21 08:19:14 node-a postfix/qmgr[3339358]: D3CF64C084F: from=<root@node-a.mydom.lan>, size=651, nrcpt=1 (queue active)
Apr 21 08:19:14 node-a postfix/local[692404]: 268534C0817: to=<root@node-a.mydom.lan>, orig_to=<root>, relay=local, delay=0.73, delays=0.03/0.01/0/0.69, dsn=2.0.0, status=sent (delivered to command: /usr/bin/pvemailforward)
Apr 21 08:19:14 node-a postfix/qmgr[3339358]: 268534C0817: removed
Apr 21 08:19:19 node-a postfix/smtp[692413]: D3CF64C084F: to=<my.address@gmail.com>, relay=smtp.gmail.com[64.233.190.109]:587, delay=4.2, delays=0/0.04/1.4/2.7, dsn=2.0.0, status=sent (250 2.0.0 OK  1650539958 t3-20020a4a7603000000b0033a53c11f82sm3704887ooc.20 - gsmtp)
Apr 21 08:19:19 node-a postfix/qmgr[3339358]: D3CF64C084F: removed



node-b (does not send mail correctly):
Code:
Apr 21 08:19:14 node-b postfix/pickup[2045673]: 214F1140B24: uid=0 from=<root>
Apr 21 08:19:14 node-b postfix/cleanup[2047938]: 214F1140B24: message-id=<20220421111914.214F1140B24@node-a.mydom.lan>
Apr 21 08:19:14 node-b postfix/qmgr[416655]: 214F1140B24: from=<root@node-b.mydom.lan>, size=485, nrcpt=1 (queue active)
Apr 21 08:19:18 node-b postfix/smtp[2047940]: 214F1140B24: to=<root@node-b.mydom.lan>, orig_to=<root>, relay=smtp.gmail.com[64.233.190.109]:587, delay=3.9, delays=0.03/0.02/2/1.8, dsn=2.0.0, status=sent (250 2.0.0 OK  1650539957 r35-20020a056870582300b000df0dc42ff5sm1005008oap.0 - gsmtp)
Apr 21 08:19:18 node-b postfix/qmgr[416655]: 214F1140B24: removed

node-c (does not send mail correctly):
Code:
Apr 21 08:19:14 node-c postfix/pickup[94338]: 45B5D9942F: uid=0 from=<root>
Apr 21 08:19:14 node-c postfix/cleanup[197254]: 45B5D9942F: message-id=<20220421111914.45B5D9942F@node-a.mydom.lan>
Apr 21 08:19:14 node-c postfix/qmgr[2807116]: 45B5D9942F: from=<root@node-c.mydom.lan>, size=483, nrcpt=1 (queue active)
Apr 21 08:19:16 node-c postfix/smtp[197257]: 45B5D9942F: to=<root@node-c.mydom.lan>, orig_to=<root>, relay=smtp.gmail.com[64.233.190.109]:587, delay=2.6, delays=0.04/0.01/1.6/1, dsn=2.0.0, status=sent (250 2.0.0 OK  1650539956 n66-20020acabd45000000b002ef6c6992e8sm7212994oif.42 - gsmtp)
Apr 21 08:19:16 node-c postfix/qmgr[2807116]: 45B5D9942F: removed

node-d (does not send mail correctly):
Code:
Apr 21 08:19:14 node-d postfix/pickup[1303050]: 2A0884C05F5: uid=0 from=<root>
Apr 21 08:19:14 node-d postfix/cleanup[1306301]: 2A0884C05F5: message-id=<20220421111914.2A0884C05F5@node-a.mydom.lan>
Apr 21 08:19:14 node-d postfix/qmgr[2990464]: 2A0884C05F5: from=<root@node-d.mydom.lan>, size=473, nrcpt=1 (queue active)
Apr 21 08:19:14 node-d postfix/smtp[1306303]: connect to smtp.gmail.com[2800:3f0:4003:c01::6d]:587: Network is unreachable
Apr 21 08:19:16 node-d postfix/smtp[1306303]: 2A0884C05F5: to=<root@node-d.mydom.lan>, orig_to=<root>, relay=smtp.gmail.com[64.233.190.109]:587, delay=2.1, delays=0.07/0.01/1.2/0.86, dsn=2.0.0, status=sent (250 2.0.0 OK  1650539956 u7-20020a4a85c7000000b0035c12c8be73sm549906ooh.29 - gsmtp)
Apr 21 08:19:16 node-d postfix/qmgr[2990464]: 2A0884C05F5: removed

node-e (sends mails correctly):
Code:
Apr 21 08:19:14 node-e postfix/pickup[758580]: 294F9320E25: uid=0 from=<root>
Apr 21 08:19:14 node-e postfix/cleanup[801867]: 294F9320E25: message-id=<20220421111914.294F9320E25@node-e.mydom.lan>
Apr 21 08:19:14 node-e postfix/qmgr[1606]: 294F9320E25: from=<root@node-e.mydom.lan>, size=481, nrcpt=1 (queue active)
Apr 21 08:19:14 node-e postfix/pickup[758580]: 6F393320E3A: uid=65534 from=<root>
Apr 21 08:19:14 node-e postfix/cleanup[801867]: 6F393320E3A: message-id=<20220421111914.294F9320E25@node-e.mydom.lan>
Apr 21 08:19:14 node-e postfix/qmgr[1606]: 6F393320E3A: from=<root@node-e.mydom.lan>, size=667, nrcpt=1 (queue active)
Apr 21 08:19:14 node-e postfix/local[801869]: 294F9320E25: to=<root@node-e.mydom.lan>, orig_to=<root>, relay=local, delay=0.3, delays=0.01/0/0/0.28, dsn=2.0.0, status=sent (delivered to command: /usr/bin/pvemailforward)
Apr 21 08:19:14 node-e postfix/qmgr[1606]: 294F9320E25: removed
Apr 21 08:19:14 node-e postfix/smtp[801890]: connect to smtp.gmail.com[2800:3f0:4003:c01::6c]:587: Network is unreachable
Apr 21 08:19:17 node-e postfix/smtp[801890]: 6F393320E3A: to=<my.address@gmail.com>, relay=smtp.gmail.com[64.233.190.109]:587, delay=2.7, delays=0/0.03/1.5/1.1, dsn=2.0.0, status=sent (250 2.0.0 OK  1650539957 ga18-20020a056870ee1200b000e602e45cf8sm955005oab.42 - gsmtp)
Apr 21 08:19:17 node-e postfix/qmgr[1606]: 6F393320E3A: removed

node-f (sends mails correctly):
Code:
Apr 21 08:19:14 node-f postfix/pickup[2410053]: 2DEC5240A40: uid=0 from=<root>
Apr 21 08:19:14 node-f postfix/cleanup[2430213]: 2DEC5240A40: message-id=<20220421111914.2DEC5240A40@node-f.mydom.lan>
Apr 21 08:19:14 node-f postfix/qmgr[1707]: 2DEC5240A40: from=<root@node-f.mydom.lan>, size=493, nrcpt=1 (queue active)
Apr 21 08:19:14 node-f postfix/pickup[2410053]: 893AC240AA5: uid=65534 from=<root>
Apr 21 08:19:14 node-f postfix/cleanup[2430213]: 893AC240AA5: message-id=<20220421111914.2DEC5240A40@node-f.mydom.lan>
Apr 21 08:19:14 node-f postfix/qmgr[1707]: 893AC240AA5: from=<root@node-f.mydom.lan>, size=683, nrcpt=1 (queue active)
Apr 21 08:19:14 node-f postfix/local[2430215]: 2DEC5240A40: to=<root@node-f.mydom.lan>, orig_to=<root>, relay=local, delay=0.4, delays=0.03/0.01/0/0.36, dsn=2.0.0, status=sent (delivered to command: /usr/bin/pvemailforward)
Apr 21 08:19:14 node-f postfix/qmgr[1707]: 2DEC5240A40: removed
Apr 21 08:19:14 node-f postfix/smtp[2430219]: connect to smtp.gmail.com[2800:3f0:4003:c01::6c]:587: Network is unreachable
Apr 21 08:19:17 node-f postfix/smtp[2430219]: 893AC240AA5: to=<my.address@gmail.com>, relay=smtp.gmail.com[64.233.190.109]:587, delay=3.2, delays=0.01/0.03/1.5/1.6, dsn=2.0.0, status=sent (250 2.0.0 OK  1650539957 g8-20020a056830160800b0060548e5f69csm5132865otr.2 - gsmtp)
Apr 21 08:19:17 node-f postfix/qmgr[1707]: 893AC240AA5: removed

node-g (sends mails correctly):
Code:
Apr 21 08:19:14 node-g postfix/pickup[1687848]: 2F22B21014: uid=0 from=<root>
Apr 21 08:19:14 node-g postfix/cleanup[1706426]: 2F22B21014: message-id=<20220421111914.2F22B21014@node-g.mydom.lan>
Apr 21 08:19:14 node-g postfix/qmgr[1704]: 2F22B21014: from=<root@node-g.mydom.lan>, size=479, nrcpt=1 (queue active)
Apr 21 08:19:14 node-g postfix/pickup[1687848]: 8165A21029: uid=65534 from=<root>
Apr 21 08:19:14 node-g postfix/cleanup[1706426]: 8165A21029: message-id=<20220421111914.2F22B21014@node-g.mydom.lan>
Apr 21 08:19:14 node-g postfix/qmgr[1704]: 8165A21029: from=<root@node-g.mydom.lan>, size=664, nrcpt=1 (queue active)
Apr 21 08:19:14 node-g postfix/local[1706428]: 2F22B21014: to=<root@node-g.mydom.lan>, orig_to=<root>, relay=local, delay=0.37, delays=0.03/0.01/0/0.33, dsn=2.0.0, status=sent (delivered to command: /usr/bin/pvemailforward)
Apr 21 08:19:14 node-g postfix/qmgr[1704]: 2F22B21014: removed
Apr 21 08:19:14 node-g postfix/smtp[1706432]: connect to smtp.gmail.com[2800:3f0:4003:c01::6c]:587: Network is unreachable
Apr 21 08:19:19 node-g postfix/smtp[1706432]: 8165A21029: to=<my.address@gmail.com>, relay=smtp.gmail.com[64.233.190.109]:587, delay=4.8, delays=0/0.03/1.9/2.9, dsn=2.0.0, status=sent (250 2.0.0 OK  1650539959 t22-20020a4a8256000000b003332a0402f5sm7745240oog.23 - gsmtp)
Apr 21 08:19:19 node-g postfix/qmgr[1704]: 8165A21029: removed

I notice that every node that sends mail correctly logs a line saying delivered to command: /usr/bin/pvemailforward. The nodes that do not forward mail correctly never mention pvemailforward, and their message-id refers to node-a instead of their own hostname.
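The same two observations can be pulled out of the logs mechanically. This is only a sketch: the helper names `classify_root_delivery` and `message_id_mismatch` are invented for this example, and the demo input is three lines copied from the logs above.

```shell
# Sketch: summarise how mail addressed to plain "root" was delivered.
# Both function names are hypothetical helpers, not existing tools.

# Reads mail.log lines on stdin; prints "<host> local" or "<host> relayed".
classify_root_delivery() {
    awk '/orig_to=<root>/ {
        host = $4                      # syslog hostname field
        if ($0 ~ /relay=local,/) print host, "local"
        else print host, "relayed"
    }'
}

# Prints hosts whose message-id was generated under another host name.
message_id_mismatch() {
    awk '/message-id=</ {
        host = $4
        id = $0
        sub(/.*message-id=<[^@]*@/, "", id)   # keep the part after the @
        sub(/>.*/, "", id)                    # drop ">" and anything after it
        split(id, parts, ".")                 # parts[1] is the short host name
        if (parts[1] != host) print host, "message-id from", id
    }'
}

# Demo with lines taken from the node-a and node-b logs in this thread.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Apr 21 08:19:14 node-a postfix/local[692404]: 268534C0817: to=<root@node-a.mydom.lan>, orig_to=<root>, relay=local, delay=0.73, delays=0.03/0.01/0/0.69, dsn=2.0.0, status=sent (delivered to command: /usr/bin/pvemailforward)
Apr 21 08:19:14 node-b postfix/cleanup[2047938]: 214F1140B24: message-id=<20220421111914.214F1140B24@node-a.mydom.lan>
Apr 21 08:19:18 node-b postfix/smtp[2047940]: 214F1140B24: to=<root@node-b.mydom.lan>, orig_to=<root>, relay=smtp.gmail.com[64.233.190.109]:587, delay=3.9, delays=0.03/0.02/2/1.8, dsn=2.0.0, status=sent (250 2.0.0 OK  1650539957 r35-20020a056870582300b000df0dc42ff5sm1005008oap.0 - gsmtp)
EOF
classify_root_delivery < "$sample"
message_id_mismatch < "$sample"
rm -f "$sample"
```

On a node, both helpers can read the real log, for example `classify_root_delivery < /var/log/mail.log`.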

I think that's where the issue lies
Thanks for any help you can give me.
 
I think that's where the issue lies
sounds sensible
could you please diff the postfix configs (/etc/postfix/main.cf and /etc/postfix/master.cf) between 2 nodes (one where it works, one where it does not work)
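A possible way to do that comparison without comment noise, as a sketch: the helpers below strip full-line comments and blank lines before diffing. `strip_cf` and `diff_cf` are hypothetical names, and the demo files are invented stand-ins for copies of each node's config.

```shell
# Sketch: diff two Postfix config files, ignoring comments and blank lines.
# strip_cf and diff_cf are hypothetical helper names for this example.
strip_cf() {
    # Drop lines that are empty or contain only a comment.
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

diff_cf() {
    a=$(mktemp) b=$(mktemp)
    strip_cf "$1" > "$a"
    strip_cf "$2" > "$b"
    diff "$a" "$b"
    rc=$?                  # 0 = identical, 1 = differences found
    rm -f "$a" "$b"
    return $rc
}

# Demo with two invented files standing in for copies from two nodes.
work=$(mktemp -d)
printf '# node one\nmyhostname=node-a.mydom.local\n' > "$work/one.cf"
printf 'myhostname=node-e.mydom.local\n\n' > "$work/two.cf"
diff_cf "$work/one.cf" "$work/two.cf" || echo "configs differ"
rm -rf "$work"
```

If the files from each node are collected into per-node directories, the call would look like `diff_cf node-a/main.cf node-b/main.cf`. On a live node, `postconf -n` lists the main.cf parameters that are explicitly set, which is another convenient input for a diff.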

Finally, the .forward file exists in the root directory of all nodes, and the content is the same:
I assume this is still the case and also that the files have the same permissions/owner/etc.?
 
Ok.
I see that /etc/postfix/main.cf is identical on the first four nodes (of those, only node-a sends mail correctly; the other three do not). The last three nodes differ from node-a:
Code:
diff node-a/main.cf node-b/main.cf
(nothing)

diff node-a/main.cf node-c/main.cf
(nothing)

diff node-a/main.cf node-d/main.cf
(nothing)

diff node-a/main.cf node-e/main.cf
3c3
< myhostname=node-a.urgencias.local
---
> myhostname=node-e.urgencias.local
16c16
< mydestination = $myhostname, node-a.urgencias.local, localhost.urgencias.local, localhost
---
> mydestination = $myhostname, node-e.urgencias.local, localhost.urgencias.local, localhost


The /etc/postfix/master.cf diffs:
Code:
diff node-a/master.cf node-b/master.cf
11d10
< smtp inet n - - - - smtpd
29,30c28,30
< pickup fifo n - - 60 1 pickup
< cleanup unix n - - - 0 cleanup
---
> smtp inet n - y - - smtpd
> pickup fifo n - y 60 1 pickup
> cleanup unix n - y - 0 cleanup
33,39c33,39
< tlsmgr unix - - - 1000? 1 tlsmgr
< rewrite unix - - - - - trivial-rewrite
< bounce unix - - - - 0 bounce
< defer unix - - - - 0 bounce
< trace unix - - - - 0 bounce
< verify unix - - - - 1 verify
< flush unix n - - 1000? 0 flush
---
> tlsmgr unix - - y 1000? 1 tlsmgr
> rewrite unix - - y - - trivial-rewrite
> bounce unix - - y - 0 bounce
> defer unix - - y - 0 bounce
> trace unix - - y - 0 bounce
> verify unix - - y - 1 verify
> flush unix n - y 1000? 0 flush
42,43c42
< smtp unix - - - - - smtp
< relay unix - - - - - smtp
---
> smtp unix - - y - - smtp
45,48c44,48
< showq unix n - - - - showq
< error unix - - - - - error
< retry unix - - - - - error
< discard unix - - - - - discard
---
> relay unix - - y - - smtp
> showq unix n - y - - showq
> error unix - - y - - error
> retry unix - - y - - error
> discard unix - - y - - discard
51,53c51,52
< lmtp unix - - - - - lmtp
< anvil unix - - - - 1 anvil
< scache unix - - - - 1 scache
---
> lmtp unix - - y - - lmtp
> anvil unix - - y - 1 anvil
66a66
> scache unix - - y - 1 scache

On all nodes:
Code:
cat /root/.forward
|/usr/bin/pvemailforward


In the end I was able to solve the problem by correcting the myhostname=... line on the three nodes that were not sending mail.
This is one of those problems for which we have an expression in Spanish: "the turtle escaped me".
I still have a question, though: shouldn't the $myhostname variable "cover" the hostname of the corresponding node?

Anyway, thank you very much for the help, Stoiko Ivanov.
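On the closing question: as far as the Postfix documentation describes it, $myhostname inside mydestination expands to the value of the myhostname parameter in main.cf, and Postfix only falls back to the system hostname when that parameter is not set. A node whose main.cf carries another node's myhostname therefore does not list its own name in mydestination, which would be consistent with the failing nodes above relaying root@their-own-name to Gmail and generating message-ids ending in node-a. The sketch below mimics that expansion. `is_local_domain` is a hypothetical helper, and real Postfix also accepts file names and lookup tables in mydestination, which this ignores.

```shell
# Sketch: expand "$myhostname" in a mydestination value, then check whether
# a recipient domain would be treated as local. Illustration only.
# is_local_domain is a hypothetical helper, not a Postfix command.
is_local_domain() {
    domain=$1 myhostname=$2 mydestination=$3
    # Replace the literal text $myhostname, then split on commas and spaces.
    expanded=$(printf '%s\n' "$mydestination" |
        sed 's/\$myhostname/'"$myhostname"'/g' | tr ', ' '\n\n')
    # Exact, whole-line match against the expanded list.
    printf '%s\n' "$expanded" | grep -Fxq -- "$domain"
}

# mydestination as it would look in a main.cf copied unchanged from node-a.
dest='$myhostname, node-a.mydom.local, localhost.mydom.local, localhost'

# Assumed situation: node-b still has myhostname=node-a.mydom.local.
if is_local_domain node-b.mydom.local node-a.mydom.local "$dest"; then
    echo "node-b.mydom.local: local delivery"
else
    echo "node-b.mydom.local: not local, handed to the relay"
fi
```

To see the values Postfix actually uses on a node, `postconf -h myhostname` and `postconf -h mydestination` print them.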
 