Hello,
In the pas few days out of nowhere PMG CPU just started to be overwhelmed, 100% most of the times. First I thought it was low on resources as I searched through the forum and somebody mentioned to check that. Server had 2 CPU and 4GB of RAM, so I double it. And after that it was fine for a short time, then it started again. I was again searching through forum before posting in order to find if somebody had similar issue.
What I noticed is that service pmg-smtp-filter is just spamming itself, child processes. At this moment systemctl status pmg-smtp-filter have 17 childs and htop says those are the ones that eats all the memory. If I restart it and check status again I see the following:
I tried to search in syslog for issues so I did
I also checked postgresql logs:
I then found some of errors from the logs are fixed in some of the previous version of the PMG, so I upgraded to the latest PMG 6.4-4 and issue again was solved for a short time, few hourse maybe. This morning when I came it was again for hourse 100% CPU and again pmg-smtp-filter.
Here is results of
EDIT:
I also did this morning
If there is something else I can check or provide please do tell me.
In the pas few days out of nowhere PMG CPU just started to be overwhelmed, 100% most of the times. First I thought it was low on resources as I searched through the forum and somebody mentioned to check that. Server had 2 CPU and 4GB of RAM, so I double it. And after that it was fine for a short time, then it started again. I was again searching through forum before posting in order to find if somebody had similar issue.
What I noticed is that service pmg-smtp-filter is just spamming itself, child processes. At this moment systemctl status pmg-smtp-filter have 17 childs and htop says those are the ones that eats all the memory. If I restart it and check status again I see the following:
Code:
May 20 09:50:05 pmg-server systemd[1]: Starting Proxmox SMTP Filter Daemon...
May 20 09:50:06 pmg-server pmg-smtp-filter[17063]: Process Backgrounded
May 20 09:50:06 pmg-server pmg-smtp-filter[17063]: 2021/05/20-09:50:06 main (type Net::Server::PreFork) starting! pid(17063)
May 20 09:50:06 pmg-server pmg-smtp-filter[17063]: Binding to TCP port 10023 on host 127.0.0.1 with IPv4
May 20 09:50:06 pmg-server pmg-smtp-filter[17063]: Binding to TCP port 10024 on host 127.0.0.1 with IPv4
May 20 09:50:06 pmg-server pmg-smtp-filter[17063]: Group Not Defined. Defaulting to EGID '0'
May 20 09:50:06 pmg-server pmg-smtp-filter[17063]: User Not Defined. Defaulting to EUID '0'
May 20 09:50:06 pmg-server pmg-smtp-filter[17063]: Setting up serialization via flock
May 20 09:50:06 pmg-server pmg-smtp-filter[17063]: Filter daemon (re)started (max. 40 processes)
May 20 09:50:06 pmg-server systemd[1]: Started Proxmox SMTP Filter Daemon
I tried to search in syslog for issues so I did
cat /var/log/syslog | grep WARNING
and find couple of interesting messages but not sure what to do with them or if they are related to the issue:
Code:
pmgpolicy[13441]: WARNING: Pid_file created by this same process. Doing nothing.
pmg-smtp-filter[5916]: WARNING: Use of uninitialized value $score in concatenation (.) or string at /usr/share/perl5/PMG/RuleDB/Spam.pm line 432.
pmg-smtp-filter[5916]: WARNING: Use of uninitialized value in sprintf at /usr/bin/pmg-smtp-filter line 210.
pmg-smtp-filter[5756]: WARNING: rules: failed to run PYZOR_CHECK test, skipping:#012#011(refcnt: fd -1 < 0#012)
I also checked postgresql logs:
2021-05-19 13:04:59.724 CEST [30795] LOG: received fast shutdown request
2021-05-19 13:04:59.760 CEST [30795] LOG: aborting any active transactions
2021-05-19 13:04:59.760 CEST [1836] root@Proxmox_ruledb FATAL: terminating connection due to administrator command
2021-05-19 13:04:59.788 CEST [30795] LOG: background worker "logical replication launcher" (PID 30802) exited with exit code 1
2021-05-19 13:04:59.827 CEST [30797] LOG: shutting down
2021-05-19 13:05:00.099 CEST [30795] LOG: database system is shut down
2021-05-19 13:07:20.793 CEST [580] LOG: listening on IPv4 address "127.0.0.1", port 5432
2021-05-19 13:07:20.810 CEST [580] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2021-05-19 13:07:21.114 CEST [596] LOG: database system was shut down at 2021-05-19 13:05:00 CEST
2021-05-19 13:07:21.162 CEST [580] LOG: database system is ready to accept connections
2021-05-19 13:07:21.713 CEST [648] [unknown]@[unknown] LOG: incomplete startup packet
2021-05-19 13:07:24.909 CEST [679] root@Proxmox_ruledb ERROR: constraint "localstat_time_key" of relation "localstat" does not exist
2021-05-19 13:07:24.909 CEST [679] root@Proxmox_ruledb STATEMENT: ALTER TABLE LocalStat DROP CONSTRAINT localstat_time_key
2021-05-19 17:04:29.700 CEST [580] LOG: received fast shutdown request
2021-05-19 17:04:29.713 CEST [580] LOG: aborting any active transactions
2021-05-19 17:04:29.719 CEST [580] LOG: background worker "logical replication launcher" (PID 627) exited with exit code 1
2021-05-19 17:04:29.719 CEST [622] LOG: shutting down
2021-05-19 17:04:29.914 CEST [580] LOG: database system is shut down
2021-05-19 17:06:44.639 CEST [598] LOG: listening on IPv4 address "127.0.0.1", port 5432
2021-05-19 17:06:44.651 CEST [598] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2021-05-19 17:06:45.094 CEST [662] LOG: database system was shut down at 2021-05-19 17:04:29 CEST
2021-05-19 17:06:45.145 CEST [598] LOG: database system is ready to accept connections
2021-05-19 17:06:45.693 CEST [677] [unknown]@[unknown] LOG: incomplete startup packet
2021-05-19 17:06:50.338 CEST [701] root@Proxmox_ruledb ERROR: constraint "localstat_time_key" of relation "localstat" does not exist
2021-05-19 17:06:50.338 CEST [701] root@Proxmox_ruledb STATEMENT: ALTER TABLE LocalStat DROP CONSTRAINT localstat_time_key
2021-05-19 17:10:07.017 CEST [598] LOG: received fast shutdown request
2021-05-19 17:10:07.029 CEST [598] LOG: aborting any active transactions
2021-05-19 17:10:07.038 CEST [598] LOG: background worker "logical replication launcher" (PID 668) exited with exit code 1
2021-05-19 17:10:07.038 CEST [663] LOG: shutting down
2021-05-19 17:10:07.447 CEST [598] LOG: database system is shut down
2021-05-19 17:10:24.756 CEST [593] LOG: listening on IPv4 address "127.0.0.1", port 5432
2021-05-19 17:10:24.849 CEST [593] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2021-05-19 17:10:24.973 CEST [611] LOG: database system was shut down at 2021-05-19 17:10:07 CEST
2021-05-19 17:10:25.123 CEST [593] LOG: database system is ready to accept connections
2021-05-19 17:10:25.676 CEST [669] [unknown]@[unknown] LOG: incomplete startup packet
2021-05-19 17:10:28.498 CEST [698] root@Proxmox_ruledb ERROR: constraint "localstat_time_key" of relation "localstat" does not exist
2021-05-19 17:10:28.498 CEST [698] root@Proxmox_ruledb STATEMENT: ALTER TABLE LocalStat DROP CONSTRAINT localstat_time_key
2021-05-19 17:17:42.813 CEST [11799] root@Proxmox_ruledb ERROR: constraint "localstat_time_key" of relation "localstat" does not exist
2021-05-19 17:17:42.813 CEST [11799] root@Proxmox_ruledb STATEMENT: ALTER TABLE LocalStat DROP CONSTRAINT localstat_time_key
2021-05-19 17:18:50.364 CEST [13235] root@Proxmox_ruledb ERROR: constraint "localstat_time_key" of relation "localstat" does not exist
2021-05-19 17:18:50.364 CEST [13235] root@Proxmox_ruledb STATEMENT: ALTER TABLE LocalStat DROP CONSTRAINT localstat_time_key
2021-05-19 13:04:59.760 CEST [30795] LOG: aborting any active transactions
2021-05-19 13:04:59.760 CEST [1836] root@Proxmox_ruledb FATAL: terminating connection due to administrator command
2021-05-19 13:04:59.788 CEST [30795] LOG: background worker "logical replication launcher" (PID 30802) exited with exit code 1
2021-05-19 13:04:59.827 CEST [30797] LOG: shutting down
2021-05-19 13:05:00.099 CEST [30795] LOG: database system is shut down
2021-05-19 13:07:20.793 CEST [580] LOG: listening on IPv4 address "127.0.0.1", port 5432
2021-05-19 13:07:20.810 CEST [580] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2021-05-19 13:07:21.114 CEST [596] LOG: database system was shut down at 2021-05-19 13:05:00 CEST
2021-05-19 13:07:21.162 CEST [580] LOG: database system is ready to accept connections
2021-05-19 13:07:21.713 CEST [648] [unknown]@[unknown] LOG: incomplete startup packet
2021-05-19 13:07:24.909 CEST [679] root@Proxmox_ruledb ERROR: constraint "localstat_time_key" of relation "localstat" does not exist
2021-05-19 13:07:24.909 CEST [679] root@Proxmox_ruledb STATEMENT: ALTER TABLE LocalStat DROP CONSTRAINT localstat_time_key
2021-05-19 17:04:29.700 CEST [580] LOG: received fast shutdown request
2021-05-19 17:04:29.713 CEST [580] LOG: aborting any active transactions
2021-05-19 17:04:29.719 CEST [580] LOG: background worker "logical replication launcher" (PID 627) exited with exit code 1
2021-05-19 17:04:29.719 CEST [622] LOG: shutting down
2021-05-19 17:04:29.914 CEST [580] LOG: database system is shut down
2021-05-19 17:06:44.639 CEST [598] LOG: listening on IPv4 address "127.0.0.1", port 5432
2021-05-19 17:06:44.651 CEST [598] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2021-05-19 17:06:45.094 CEST [662] LOG: database system was shut down at 2021-05-19 17:04:29 CEST
2021-05-19 17:06:45.145 CEST [598] LOG: database system is ready to accept connections
2021-05-19 17:06:45.693 CEST [677] [unknown]@[unknown] LOG: incomplete startup packet
2021-05-19 17:06:50.338 CEST [701] root@Proxmox_ruledb ERROR: constraint "localstat_time_key" of relation "localstat" does not exist
2021-05-19 17:06:50.338 CEST [701] root@Proxmox_ruledb STATEMENT: ALTER TABLE LocalStat DROP CONSTRAINT localstat_time_key
2021-05-19 17:10:07.017 CEST [598] LOG: received fast shutdown request
2021-05-19 17:10:07.029 CEST [598] LOG: aborting any active transactions
2021-05-19 17:10:07.038 CEST [598] LOG: background worker "logical replication launcher" (PID 668) exited with exit code 1
2021-05-19 17:10:07.038 CEST [663] LOG: shutting down
2021-05-19 17:10:07.447 CEST [598] LOG: database system is shut down
2021-05-19 17:10:24.756 CEST [593] LOG: listening on IPv4 address "127.0.0.1", port 5432
2021-05-19 17:10:24.849 CEST [593] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2021-05-19 17:10:24.973 CEST [611] LOG: database system was shut down at 2021-05-19 17:10:07 CEST
2021-05-19 17:10:25.123 CEST [593] LOG: database system is ready to accept connections
2021-05-19 17:10:25.676 CEST [669] [unknown]@[unknown] LOG: incomplete startup packet
2021-05-19 17:10:28.498 CEST [698] root@Proxmox_ruledb ERROR: constraint "localstat_time_key" of relation "localstat" does not exist
2021-05-19 17:10:28.498 CEST [698] root@Proxmox_ruledb STATEMENT: ALTER TABLE LocalStat DROP CONSTRAINT localstat_time_key
2021-05-19 17:17:42.813 CEST [11799] root@Proxmox_ruledb ERROR: constraint "localstat_time_key" of relation "localstat" does not exist
2021-05-19 17:17:42.813 CEST [11799] root@Proxmox_ruledb STATEMENT: ALTER TABLE LocalStat DROP CONSTRAINT localstat_time_key
2021-05-19 17:18:50.364 CEST [13235] root@Proxmox_ruledb ERROR: constraint "localstat_time_key" of relation "localstat" does not exist
2021-05-19 17:18:50.364 CEST [13235] root@Proxmox_ruledb STATEMENT: ALTER TABLE LocalStat DROP CONSTRAINT localstat_time_key
I then found some of errors from the logs are fixed in some of the previous version of the PMG, so I upgraded to the latest PMG 6.4-4 and issue again was solved for a short time, few hourse maybe. This morning when I came it was again for hourse 100% CPU and again pmg-smtp-filter.
Here is results of
pmgversion -v
Code:
proxmox-mailgateway: 6.4-1 (API: 6.4-4/1c8a73c7, running kernel: 5.4.73-1-pve)
pmg-api: 6.4-4
pmg-gui: 2.4-2
pve-kernel-5.4: 6.3-8
pve-kernel-helper: 6.3-8
pve-kernel-5.4.106-1-pve: 5.4.106-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
clamav-daemon: 0.103.2+dfsg-0+deb10u1
libarchive-perl: 3.3.3-1
libjs-extjs: 6.0.1-10
libjs-framework7: 4.4.7-1
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-5
libpve-http-server-perl: 3.1-1
libxdgmime-perl: 0.01-5
lvm2: 2.03.02-3
pmg-docs: 6.4-1
pmg-log-tracker: 2.1.9-1
postgresql-11: 11.12-0+deb10u1
proxmox-mini-journalreader: 1.1-1
proxmox-spamassassin: 3.4.6-1
proxmox-widget-toolkit: 2.5-1
pve-firmware: 3.2-2
pve-xtermjs: 4.7.0-3
zfsutils-linux: 2.0.4-pve1
EDIT:
I also did this morning
pmgconfig sync --restart 1
and systemctl restart postfix
and for now it looks stable, but that could be temporerally only. And forgot to mention that we didn't made any specific changes that should cause this in the past days.If there is something else I can check or provide please do tell me.
Last edited: