Cluster config sync not applying

Richard Goode

New Member
May 29, 2019
Hi,

I'm running the latest 6.0 (just did a dist-upgrade today) and I've noticed a problem: when I modify the config on node 1, the config synchronizes to node 2 but is not applied there until I restart pmg-smtp-filter.

The particular changes I've been making are to the Mail Filter rules (adding headers, changing the contents of Who objects).

Is this normal? And if not, how do I debug?

I'm not sure whether this problem existed before the upgrade. These 2 nodes were fresh 6.0 installs, not upgraded from 5.x.

After I make a change on node 1, I see this in the syslog of node 2 around 60 seconds later:

Nov 23 22:52:49 scrub2 pmgmirror[1014]: starting cluster syncronization
Nov 23 22:52:49 scrub2 pmgmirror[1014]: detected rule database changes - starting sync from '172.16.60.20'
Nov 23 22:52:49 scrub2 pmgmirror[1014]: finished rule database sync from host '172.16.60.20'
Nov 23 22:52:51 scrub2 pmgmirror[1014]: cluster syncronization finished (0 errors, 2.66 seconds (files 0.24, database 2.10, config 0.32))

But the old config continues to apply until I restart pmg-smtp-filter.
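For anyone debugging the same thing, the pmgmirror lines quoted above can be scanned with a short script to confirm that each sync run finished without errors and whether it included a rule database change. This is only a sketch: the function name `summarize_sync` is made up for illustration, and the patterns are taken solely from the four log lines in this post (including the "syncronization" spelling), so other versions may log differently.

```python
import re

# Pattern copied from the summary line in the syslog excerpt above.
# The misspelling "syncronization" is how the line appears in the log.
SYNC_DONE = re.compile(
    r"cluster syncronization finished \((?P<errors>\d+) errors, "
    r"(?P<total>[\d.]+) seconds \(files (?P<files>[\d.]+), "
    r"database (?P<database>[\d.]+), config (?P<config>[\d.]+)\)\)"
)
RULEDB_CHANGED = "detected rule database changes"


def summarize_sync(lines):
    """Return one dict per finished pmgmirror sync run in the given lines."""
    runs = []
    ruledb_changed = False
    for line in lines:
        if "pmgmirror[" not in line:
            continue  # ignore lines from other daemons
        if RULEDB_CHANGED in line:
            ruledb_changed = True
        match = SYNC_DONE.search(line)
        if match:
            runs.append({
                "errors": int(match["errors"]),
                "seconds": float(match["total"]),
                "ruledb_changed": ruledb_changed,
            })
            ruledb_changed = False  # reset for the next sync run
    return runs
```

Feeding it the lines of the syslog file (for example via `open(path)`) gives a quick list of sync runs that changed the rule database, which can then be compared against when the filter actually started applying the new rules.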

Thanks,
Richard
 
TL;DR: It seems pmg-smtp-filter starts picking up synced changes reliably once I manually send it a SIGUSR1 signal, but stops again after I restart pmg-smtp-filter. Bug?


I've been doing a lot of testing and it seems this issue is intermittent. Throughout all my earlier tests (before I wrote the above), node 2 wasn't reloading its config. Then later that night it started to work most of the time, and changes were applied immediately after the cluster sync.

Before I went to bed, I restarted pmg-smtp-filter on node 2 and made some changes on node 1 (the master). By morning (10+ hours later), the change still hadn't been applied on node 2, so it doesn't appear to be time-related.

I had a look through the code and I see there are references to sending a SIGUSR1 signal to pmg-smtp-filter after the ruleDB is updated. So I tried manually sending SIGUSR1 to pmg-smtp-filter, and as expected, this caused node 2 to reload its config.

And indeed I see upon the next receipt of email at node 2:

Nov 24 12:15:12 scrub2 pmgpolicy[23918]: reloading configuration Proxmox_ruledb
Nov 24 12:15:13 scrub2 pmg-smtp-filter[21453]: reloading configuration Proxmox_ruledb
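The manual signal described above can be sketched in a few lines of Python. Two things here are assumptions and not confirmed anywhere in this thread: the PID file path `/run/pmg-smtp-filter.pid` (check where the daemon actually writes its PID on your system) and the helper name `send_reload_signal`, which is made up for illustration. It needs to run as root.

```python
import os
import signal
from pathlib import Path

# Assumption: the filter daemon writes its main PID to this file.
# Verify the path on your own installation before relying on it.
PID_FILE = Path("/run/pmg-smtp-filter.pid")


def send_reload_signal(pid_file=PID_FILE):
    """Send SIGUSR1 to the PID stored in pid_file and return that PID."""
    content = pid_file.read_text().strip()
    if not content.isdigit():
        raise ValueError(f"unexpected PID file content: {content!r}")
    pid = int(content)
    # SIGUSR1 is signal 10 on Linux x86/amd64.
    os.kill(pid, signal.SIGUSR1)
    return pid
```

Calling `send_reload_signal()` on node 2 does the same thing as sending the signal by hand with `kill`, and raises an error instead of signalling anything if the PID file is missing or malformed.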

Then, from that point onwards, node 2 successfully reloads the ruleDB every time I make a change on the cluster. It seems as though once I manually send a SIGUSR1 to pmg-smtp-filter, it works fine from then on. Note: in subsequent tests I didn't consistently see the log entry "reloading configuration Proxmox_ruledb". It was sporadic, but the reload was working.

I retested this. I restarted pmg-smtp-filter and found that it did not reload the ruleDB upon a change, despite multiple attempts. Then I sent pmg-smtp-filter signal 10 (SIGUSR1), and from that point forward it reloaded the ruleDB upon every change. Once I restart pmg-smtp-filter, it stops working again and the process repeats.

As a workaround, I could set up a cron job that sends SIGUSR1 to pmg-smtp-filter every minute, but I feel this is a bug.

Also worth noting: if I make a change locally on node 2, node 2 still behaves the same, i.e. it doesn't reload the ruleDB. This may need more testing, though, as I wasn't thorough in diagnosing this scenario.

Rich
 
Hmm, could you please post the output of `pmgversion -v`?
We had a bug in that area, which should be fixed in pmg-api >= 6.0-6.

I hope this helps!
 
Hi.

Here's the output from my 2 nodes:

Node 1:

root@scrub1:~# pmgversion -v
proxmox-mailgateway: 6.0-1 (API: 6.0-4/c6dd64ec, running kernel: 5.0.21-1-pve)
pmg-api: 6.0-4
pmg-gui: 2.0-4
pve-kernel-5.0: 6.0-7
pve-kernel-helper: 6.0-7
pve-kernel-5.0.21-1-pve: 5.0.21-1
libarchive-perl: 3.3.3-1
libjs-extjs: 6.0.1-10
libjs-framework7: 4.4.7-1
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-4
libpve-http-server-perl: 3.0-2
libxdgmime-perl: 0.01-5
lvm2: 2.03.02-3
pmg-docs: 6.0-3
proxmox-mini-journalreader: 1.1-1
proxmox-spamassassin: 3.4.2-11
proxmox-widget-toolkit: 2.0-7
pve-firmware: 3.0-2
pve-xtermjs: 3.13.2-1
zfsutils-linux: 0.8.1-pve2


Node 2:

proxmox-mailgateway: 6.0-1 (API: 6.0-4/c6dd64ec, running kernel: 5.0.21-1-pve)
pmg-api: 6.0-4
pmg-gui: 2.0-4
pve-kernel-5.0: 6.0-7
pve-kernel-helper: 6.0-7
pve-kernel-5.0.21-1-pve: 5.0.21-1
libarchive-perl: 3.3.3-1
libjs-extjs: 6.0.1-10
libjs-framework7: 4.4.7-1
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-4
libpve-http-server-perl: 3.0-2
libxdgmime-perl: 0.01-5
lvm2: 2.03.02-3
pmg-docs: 6.0-3
proxmox-mini-journalreader: 1.1-1
proxmox-spamassassin: 3.4.2-11
proxmox-widget-toolkit: 2.0-7
pve-firmware: 3.0-2
pve-xtermjs: 3.13.2-1
zfsutils-linux: 0.8.1-pve2

It seems I'm running API 6.0-4. Should this have been upgraded by a dist-upgrade? I ran one ~2 days ago.
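To make the comparison against the 6.0-6 mentioned in the reply explicit, here is a rough sketch. It is a simplification that only compares the numeric parts of the version string, not the full Debian version ordering (on the box itself, `dpkg --compare-versions` is the authoritative tool). The names `version_key` and `has_fix` are made up for illustration.

```python
import re

# Version in which the reload bug should be fixed, per the reply above.
FIXED_IN = "6.0-6"


def version_key(version):
    """Turn '6.0-4' into (6, 0, 4). Simplified: numeric parts only."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def has_fix(installed, fixed_in=FIXED_IN):
    """Return True if the installed pmg-api version is at least fixed_in."""
    return version_key(installed) >= version_key(fixed_in)
```

With the `pmg-api: 6.0-4` reported on both nodes, `has_fix("6.0-4")` returns `False`, so neither node has the fixed package yet.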

Rich