Mail Gateway cluster setup

Nov 25, 2020
7
0
1
Thank you.
I have PMG running as a single VM.
My firewall routes all mail to the PMG for scanning.
Could you please explain how load balancing can be done with a Mail Gateway cluster?
 

Stoiko Ivanov

Proxmox Staff Member
Staff member
May 2, 2018
5,812
794
148
my firewall routes all the mails to the PMG for scanning
In that case you need to configure your firewall to route the mail to all PMG cluster nodes (either via round-robin or as a fallback).
How to do that depends on your firewall, so you will need to check the firewall manual.

In general, setting two DNS A records for the same hostname could work.

I hope this helps!
 
You will need two public IPs, one routed to each PMG. Set up an A record for each PMG so that each server can be reached by its own hostname, then set up two MX records, one pointing to each of those hostnames. Both MX records must use the same priority; this is what causes the sending servers to alternate between the two servers. This is what I do and it works pretty well.
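As a rough illustration of the equal-priority MX setup described above, the records might look like the commented zone lines below, and the small helper checks that every MX answer carries the same priority. The domain `example.com`, the host names `pmg1` and `pmg2`, the addresses, and the function name `mx_same_priority` are all placeholders and are not taken from this thread.

```shell
#!/bin/sh
# Illustrative zone records (placeholder names and documentation addresses):
#   pmg1.example.com.  IN A   192.0.2.10
#   pmg2.example.com.  IN A   198.51.100.10
#   example.com.       IN MX  10 pmg1.example.com.
#   example.com.       IN MX  10 pmg2.example.com.

# Reads "<priority> <host>" lines (the format printed by `dig +short MX`)
# on stdin and succeeds only if at least one record is present and all
# records share the same priority.
mx_same_priority() {
    awk 'NF >= 2 { seen[$1] = 1 }
         END { n = 0; for (p in seen) n++; exit (n == 1 ? 0 : 1) }'
}

# Typical use (needs network access and the dig tool):
#   dig +short MX example.com | mx_same_priority && echo "MX priorities match"
```

If the priorities differ, sending servers will normally prefer the lower-numbered record, which gives a fallback arrangement rather than load balancing.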
 

rupertchandler

New Member
Nov 25, 2020
3
0
1
57
Hi,

I'm trying to build a cluster across two PMGs, both running 6.4.4. The master is an established, fully set-up gateway that is in use; the second PMG is a fresh install.

When trying to join the new machine, I get:

~# pmgcm join x.x.x.x
Enter password: ************
cluster join failed: 500 Can't connect to x.x.x.x:8006 (hostname verification failed)

What is the verification looking for, so that I can add or adjust it accordingly?

Cheers

Rup
 

rupertchandler

Is this related to my gateways not having subscriptions, as I'm using them in 'free' mode?
 

rupertchandler

Check your /etc/hosts settings.

no.
Hi Tom,

/etc/hosts:

Master:

x.x.x.x gateway.foo.bar gateway

Slave:

x.x.x.x slave.foo.bar slave

Just a simple, normal hosts file mapping each machine's IP to its full and short names.

Is this correct?

Rup
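To see what a node's hosts file actually maps a given name to, a small lookup like the following could be run on both machines. It is only a sketch: the function name `hosts_lookup` is made up for illustration, and this thread does not establish which names the join actually verifies.

```shell
#!/bin/sh
# Prints the IP address that a hosts-format file maps NAME to (first match),
# ignoring comments. Returns non-zero if NAME is not listed.
# Usage: hosts_lookup NAME [FILE]   (FILE defaults to /etc/hosts)
hosts_lookup() {
    awk -v name="$1" '
        { sub(/#.*/, "") }                       # strip trailing comments
        { for (i = 2; i <= NF; i++)              # fields after the IP are names
              if ($i == name) { print $1; found = 1; exit } }
        END { exit (found ? 0 : 1) }
    ' "${2:-/etc/hosts}"
}
```

For example, `hosts_lookup gateway.foo.bar` on the slave shows whether the slave's own hosts file knows the master's name at all; `getent hosts gateway.foo.bar` answers the same question through the full resolver, DNS included.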
 

Stoiko Ivanov

What's the output when you run the following?
Code:
openssl s_client -connect ip.address.of.second:8006
 

philippt

Member
Nov 21, 2018
10
2
8
71
Same problem here with 6.4-4. I logged in to the GUI, and both nodes (master/slave) reported:
"master ERROR: fingerprint '54:03:74:39:BD:CF:A4:C7:XXXXXX' not verified, abort!"
I deleted the slave from the master config and removed /etc/pmg/cluster.conf to re-join the cluster, but to no avail:

Code:
root@slave:/etc/pmg$ pmgcm join 136.243.xx.xx
Enter password: *************
cluster join failed: 500 Can't connect to 136.243.xx.xx:8006 (hostname verification failed)

root@slave:/etc/pmg$ cat /etc/hosts
195.201.xx.xx slave.domain.xx slave
127.0.0.1 localhost
136.243.xx.xx master.domain.xx domain

root@master:~# cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
136.243.xx.xx master.domain.xx master
195.201.xx.xx slave.domain.xx slave

root@slave:/etc/pmg$ openssl s_client -connect 136.243.xx.xx:8006
CONNECTED(00000003)
Can't use SSL_get_servername
depth=2 C = US, O = Internet Security Research Group, CN = ISRG Root X1
verify return:1
depth=1 C = US, O = Let's Encrypt, CN = R3
verify return:1
depth=0 CN = master.domain.xx
verify return:1
---
Certificate chain
 0 s:CN = master.domain.xx
   i:C = US, O = Let's Encrypt, CN = R3
 1 s:C = US, O = Let's Encrypt, CN = R3
   i:C = US, O = Internet Security Research Group, CN = ISRG Root X1
 2 s:C = US, O = Internet Security Research Group, CN = ISRG Root X1
   i:O = Digital Signature Trust Co., CN = DST Root CA X3
---
Server certificate
-----BEGIN CERTIFICATE-----
MIIGJTCC...
-----END CERTIFICATE-----
subject=CN = master.domain.xx

issuer=C = US, O = Let's Encrypt, CN = R3

---
No client certificate CA names sent
Peer signing digest: SHA256
Peer signature type: RSA-PSS
Server Temp Key: X25519, 253 bits
---
SSL handshake has read 5085 bytes and written 363 bytes
Verification: OK
---
New, TLSv1.3, Cipher is TLS_AES_256_GCM_SHA384
Server public key is 4096 bit
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)
---
---
Post-Handshake New Session Ticket arrived:
SSL-Session:
    Protocol  : TLSv1.3
    Cipher    : TLS_AES_256_GCM_SHA384
    Session-ID: 008A6AC3085518C83D804202CE6B1243F03FFFD27A5EB64A028F8C0FB11792A9
    Session-ID-ctx:
    Resumption PSK: FDBC41356D933548A5A0B452343A25CEDB6B2A707F810050048AC5478F1AE68618D529436E7E117DDDA1071C94E39879
    PSK identity: None
    PSK identity hint: None
    SRP username: None
    TLS session ticket lifetime hint: 7200 (seconds)
    TLS session ticket:
    0000 - 22 01 04 1a 44 60 6f 95-6e 9b 8c d4 af 27 c0 46   "...D`o.n....'.F
    0010 - 86 bd 57 ac 8a 13 f2 2c-15 e9 78 b4 72 a6 25 cf   ..W....,..x.r.%.
    0020 - 18 76 cb e8 87 e1 12 16-bd 74 1a 6d 71 60 b4 ec   .v.......t.mq`..
    0030 - 1b 61 51 6c e8 5a 05 05-12 98 b2 61 52 57 9f 99   .aQl.Z.....aRW..
    0040 - f5 48 6c 76 08 e3 85 ed-e4 c1 38 d1 5a 69 a8 26   .Hlv......8.Zi.&
    0050 - 6f b4 b2 c9 74 46 28 36-56 8b 45 e3 72 19 08 e6   o...tF(6V.E.r...
    0060 - 40 53 16 73 8c e2 3b b5-4f ed a6 0c f4 65 3b 43   @S.s..;.O....e;C
    0070 - 43 df 39 b8 74 51 08 cd-d0 01 73 bf ed ab 6e 69   C.9.tQ....s...ni
    0080 - 4d dd 09 2e a5 b5 6a 52-bd 88 5f 7c 61 b7 f5 99   M.....jR.._|a...
    0090 - b6 a6 0a a3 17 af 4f d7-cd 3f ae 6b 1e 7b b3 78   ......O..?.k.{.x
    00a0 - 09 23 ad d9 eb 18 b7 15-c4 97 6c d9 f8 69 1c e7   .#........l..i..
    00b0 - 54 80 52 fb 8e 4e 2e 36-55 02 28 a4 fb 82 83 ec   T.R..N.6U.(.....

    Start Time: 1624301936
    Timeout   : 7200 (sec)
    Verify return code: 0 (ok)
    Extended master secret: no
    Max Early Data: 0
---
read R BLOCK
---
Post-Handshake New Session Ticket arrived:
SSL-Session:
    Protocol  : TLSv1.3
    Cipher    : TLS_AES_256_GCM_SHA384
    Session-ID: B37C389C30BCECA339E2BD0C4562D15F9A9A533D25B55B95E1706265E9BD5E24
    Session-ID-ctx:
    Resumption PSK: E0979A89CFB12E0EF597C982DC229EAA3E2FF09F987FBD583B5B62A543EE822004D41E0C751FACD6487F8FC7B22C8CE6
    PSK identity: None
    PSK identity hint: None
    SRP username: None
    TLS session ticket lifetime hint: 7200 (seconds)
    TLS session ticket:
    0000 - 22 01 04 1a 44 60 6f 95-6e 9b 8c d4 af 27 c0 46   "...D`o.n....'.F
    0010 - 74 5d a3 7b 95 3b 6b 9a-55 1d 45 3d 9f 85 b2 ff   t].{.;k.U.E=....
    0020 - f9 eb 74 e9 da f1 67 ff-80 0c 9c c4 f4 b5 e4 e6   ..t...g.........
    0030 - a1 60 3b f9 80 a7 12 25-93 1c fb 42 5b df bf 75   .`;....%...B[..u
    0040 - 7e f4 de 72 35 4c 33 33-de 89 90 1a 7e 3e f6 74   ~..r5L33....~>.t
    0050 - 6e 18 34 7a d6 06 b8 c8-1b d4 f9 96 e6 9e 13 bb   n.4z............
    0060 - c4 fd 39 c4 71 02 2a 6d-bd e5 f6 20 cb a1 5a ea   ..9.q.*m... ..Z.
    0070 - ea 8a 55 b3 9f ce 10 33-85 d0 32 43 08 c9 99 9a   ..U....3..2C....
    0080 - 18 d6 ac 16 84 17 1c a7-78 c3 49 b8 48 78 48 52   ........x.I.HxHR
    0090 - 75 df 48 1f 32 39 b0 1e-af cb b8 23 2e fd 8e 9e   u.H.29.....#....
    00a0 - c6 b5 c4 bd 94 42 25 6e-01 33 43 e8 59 b5 75 31   .....B%n.3C.Y.u1
    00b0 - 62 d4 b6 f4 71 ee 74 39-bc dd fb 7e 57 38 ce 60   b...q.t9...~W8.`

    Start Time: 1624301936
    Timeout   : 7200 (sec)
    Verify return code: 0 (ok)
    Extended master secret: no
    Max Early Data: 0
---
read R BLOCK
^C
root@slave:/etc/pmg$

root@slave:/etc/pmg$ ssh 136.243.xx.xx
Linux janine 5.4.106-1-pve #1 SMP PVE 5.4.106-1 (Fri, 19 Mar 2021 11:08:47 +0100) x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Mon Jun 21 20:54:18 2021 from 195.201.xx.xx
root@master:~#
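
In the output above, the certificate subject is master.domain.xx, while the join command was given an IP address. Whether that difference is what triggers "hostname verification failed" is not established in this thread, but comparing the two is cheap. A minimal sketch follows; `cert_subject_cn` is a hypothetical helper name, and it only looks at the subject CN, not at any subjectAltName entries.

```shell
#!/bin/sh
# Reads `openssl s_client` output on stdin and prints the CN from the
# server certificate's "subject=" line (first match only).
cert_subject_cn() {
    sed -n 's/^subject=.*CN *= *\([^,/]*\).*$/\1/p' | head -n 1
}

# Typical use on the joining node (the address is a placeholder):
#   openssl s_client -connect 192.0.2.10:8006 </dev/null 2>/dev/null | cert_subject_cn
```

The printed name can then be compared with the name or address that was passed to `pmgcm join`.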
 

philippt

Completely removing the cluster (i.e. removing /etc/pmg/cluster.conf from both master and node) and recreating the cluster from the UI solved the issue for me.
 

Dave.r

New Member
Dec 10, 2019
5
1
3
51
Hi philippt,
Thanks for sharing this. We had exactly the same problem today with two new servers and Mail Gateway 7.0.6. No matter what we did, server02 simply would not join masternode. We deleted all cluster config files, and only then, from the GUI, did we manage to join the second server to masternode. This must be a bug; otherwise there is no explanation for why it doesn't work out of the box!

# pmgcm join masternode

error -> cluster join failed: 500 Can't connect to masternode:8006 (hostname verification failed)
Thanks
Dave
 

Stoiko Ivanov

error - > cluster join failed: 500 Can't connect to masternode:8006 (hostname verification failed)
Have you tried simply pasting the output of `pmgcm join-cmd`? (Run `pmgcm join-cmd` on the master node and paste its output on the node you want to add.)
 

Dave.r

Dear Stoiko,
Many thanks for your help. As mentioned, we removed the cluster config files on both servers and everything worked. It is good to know that there is an alternative on the command line, which we will of course test the next time we encounter the issue.

Best wishes,
Dave
 

philippt

@Dave.r: In my case, the error happened in an already existing cluster. I faced the - presumably - same issue recently (with PMG 7.0-7) and found https://forum.proxmox.com/threads/cluster-join-error-with-le-certificates.41525/ - it seems that due to Let's Encrypt being used, the fingerprint changes frequently. After updating the fingerprints on all my cluster nodes, everything ran smoothly again.
I am still wondering if this really needs to be done manually, but maybe this works for you as well.

Best
Philipp
 

Stoiko Ivanov

I am still wondering if this really needs to be done manually, but maybe this works for you as well.
PMG 6.4 has built-in support for ACME (and Let's Encrypt); the built-in support also updates the fingerprints of changed certificates automatically.
If you wish to keep your custom Let's Encrypt deployment, you can use `pmgcm update-fingerprints` to update the fingerprints in a cluster:
https://pmg.proxmox.com/pmg-docs/pmg-admin-guide.html#sysadmin_certificate_management (specifically section 4.6.8)

I hope this helps!
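
Before or after running `pmgcm update-fingerprints`, it can help to compare the fingerprint of the certificate a node currently serves with the one the cluster has on record. The following is a small sketch only: `normalize_fp` is a hypothetical helper, and both the certificate path and the assumption that the stored fingerprints can be read from /etc/pmg/cluster.conf should be checked against your own installation.

```shell
#!/bin/sh
# Reduces `openssl x509 -fingerprint` output such as
# "sha256 Fingerprint=ab:cd:..." to a bare upper-case "AB:CD:..." string,
# so it can be compared against another fingerprint.
normalize_fp() {
    sed 's/^.*=//' | tr 'abcdef' 'ABCDEF'
}

# Typical use on a node (the certificate path is an assumption about the
# default PMG layout):
#   openssl x509 -noout -fingerprint -sha256 -in /etc/pmg/pmg-api.pem | normalize_fp
#   grep -i fingerprint /etc/pmg/cluster.conf
```

If the two values differ after a certificate renewal, that would be consistent with the "fingerprint not verified" error reported earlier in the thread.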
 

philippt

Thanks for the hint. Actually, I *did* use the built-in support for ACME, so it seems something is going wrong with updating the fingerprints. I will investigate further the next time it happens.
Philipp
 

Stoiko Ivanov

Thanks for the hint, actually I *did* use the built-in support for ACME. Seems like something is going wrong then with updating the fingerprints. I will investigate further when it happens the next time.
Let us know if it happens again (and provide the logs of the update). Just open a new thread with the information.

Thanks!
 
