Proxmox Mail Gateway on LXC memory usage

Jul 6, 2020
The OOM killer kills the ClamAV daemon (clamd) on the Proxmox Mail Gateway.

Bash:
root@antispam1:~# dmesg -T | egrep -i 'killed process'
[Mon Jun 15 11:02:06 2020] Memory cgroup out of memory: Killed process 195913 (clamd) total-vm:1939452kB, anon-rss:1387168kB, file-rss:0kB, shmem-rss:0kB, UID:100108 pgtables:3328kB oom_score_adj:0
[Mon Jun 15 12:37:25 2020] Memory cgroup out of memory: Killed process 3915135 (clamd) total-vm:1652460kB, anon-rss:1224264kB, file-rss:0kB, shmem-rss:0kB, UID:100108 pgtables:2680kB oom_score_adj:0
[Mon Jun 15 13:16:24 2020] Memory cgroup out of memory: Killed process 780253 (clamd) total-vm:1504944kB, anon-rss:1204088kB, file-rss:0kB, shmem-rss:0kB, UID:100108 pgtables:2592kB oom_score_adj:0

The LXC container has 8 GB of RAM, which is twice the recommended amount. My problem is that Proxmox VE insists there is no memory shortage:

[image.png: Proxmox VE memory usage graph for the container, showing no shortage]

What causes the Proxmox Mail Gateway to use so much memory that the OOM killer is invoked? And why does Proxmox VE not properly reflect the memory demand of the LXC container in its monitoring?
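
The dmesg lines above say "Memory cgroup out of memory", which points at the container's own cgroup limit being hit rather than the host running out of RAM. A quick way to cross-check this from the PVE host is to read the container's memory cgroup directly; this is only a sketch, assuming cgroup v1 as used by PVE 6.x and a hypothetical CTID of 108:

Bash:
# Configured memory limit of the container's cgroup (should match the 8 GB)
cat /sys/fs/cgroup/memory/lxc/108/memory.limit_in_bytes

# Peak usage recorded for the cgroup and how often the limit was hit
cat /sys/fs/cgroup/memory/lxc/108/memory.max_usage_in_bytes
cat /sys/fs/cgroup/memory/lxc/108/memory.failcnt

# OOM events seen by this cgroup
cat /sys/fs/cgroup/memory/lxc/108/memory.oom_control

If memory.max_usage_in_bytes sits at the limit while the graph stays around 4 GB, the spike fell between two sampling points of the graph.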
 
Anything relevant in the logs? Usually PMG is quite happy with 2.5 GB (1.4 GB of that used by clamd) - so this seems odd if the container indeed has 8 GB of RAM.

clamd is chosen by the OOM killer because it is the single process using the most memory.
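
For context, the kernel ranks candidate victims by an OOM "badness" score that is mostly driven by resident memory, so the biggest consumer in the cgroup gets picked. A small sketch (standard Linux/systemd mechanics, not anything PMG-specific) to inspect the score and, if you really want, bias the killer away from clamd - note this only changes which process gets killed, it does not remove the memory pressure:

Bash:
# Current OOM ranking of the running clamd (higher score = more likely victim)
pid=$(pidof clamd)
cat /proc/$pid/oom_score /proc/$pid/oom_score_adj

# Optional: systemd drop-in so clamd starts with a lower oom_score_adj
mkdir -p /etc/systemd/system/clamav-daemon.service.d
cat > /etc/systemd/system/clamav-daemon.service.d/oom.conf <<'EOF'
[Service]
OOMScoreAdjust=-500
EOF
systemctl daemon-reload
systemctl restart clamav-daemon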
 
Define "relevant" ;). Here is the syslog:


Code:
Jun 15 10:58:45 antispam1 pmgmirror[430]: starting cluster syncronization
Jun 15 10:58:45 antispam1 pmgmirror[430]: cluster syncronization finished  (0 errors, 0.14 seconds (files 0.11, database 0.03, config 0.00))
Jun 15 10:59:11 antispam1 clamd[357]: LibClamAV Warning: cli_tnef: file truncated, returning CLEAN
Jun 15 11:00:03 antispam1 systemd[1]: Starting Hourly Proxmox Mail Gateway activities...
Jun 15 11:00:03 antispam1 systemd[1]: Started Session 24446 of user root.
Jun 15 11:00:03 antispam1 systemd[1]: session-24446.scope: Succeeded.
Jun 15 11:00:03 antispam1 systemd[1]: Started Session 24447 of user root.
Jun 15 11:00:03 antispam1 systemd[1]: session-24447.scope: Succeeded.
Jun 15 11:00:03 antispam1 systemd[1]: Reloading Proxmox Mail Gateway Policy Daemon.
Jun 15 11:00:03 antispam1 systemd[1]: Reloaded Proxmox Mail Gateway Policy Daemon.
Jun 15 11:00:06 antispam1 systemd[1]: pmg-hourly.service: Succeeded.
Jun 15 11:00:06 antispam1 systemd[1]: Started Hourly Proxmox Mail Gateway activities.
Jun 15 11:00:37 antispam1 freshclam[351]: Received signal: wake up
Jun 15 11:00:37 antispam1 freshclam[351]: ClamAV update process started at Mon Jun 15 11:00:37 2020
Jun 15 11:00:37 antispam1 freshclam[351]: Received signal: wake up
Jun 15 11:00:37 antispam1 freshclam[351]: ClamAV update process started at Mon Jun 15 11:00:37 2020
Jun 15 11:00:37 antispam1 freshclam[351]: WARNING: Your ClamAV installation is OUTDATED!
Jun 15 11:00:37 antispam1 freshclam[351]: WARNING: Local version: 0.102.2 Recommended version: 0.102.3
Jun 15 11:00:37 antispam1 freshclam[351]: DON'T PANIC! Read https://www.clamav.net/documents/upgrading-clamav
Jun 15 11:00:37 antispam1 freshclam[351]: Your ClamAV installation is OUTDATED!
Jun 15 11:00:37 antispam1 freshclam[351]: daily.cvd database is up to date (version: 25843, sigs: 2618912, f-level: 63, builder: raynman)
Jun 15 11:00:37 antispam1 freshclam[351]: main.cvd database is up to date (version: 59, sigs: 4564902, f-level: 60, builder: sigmgr)
Jun 15 11:00:37 antispam1 freshclam[351]: bytecode.cvd database is up to date (version: 331, sigs: 94, f-level: 63, builder: anvilleg)
Jun 15 11:00:37 antispam1 freshclam[351]: Local version: 0.102.2 Recommended version: 0.102.3
Jun 15 11:00:37 antispam1 freshclam[351]: safebrowsing.cvd database is up to date (version: 49191, sigs: 2213119, f-level: 63, builder: google)
Jun 15 11:00:37 antispam1 freshclam[351]: DON'T PANIC! Read https://www.clamav.net/documents/upgrading-clamav
Jun 15 11:00:37 antispam1 freshclam[351]: daily.cvd database is up to date (version: 25843, sigs: 2618912, f-level: 63, builder: raynman)
Jun 15 11:00:45 antispam1 pmgmirror[430]: starting cluster syncronization
Jun 15 11:00:45 antispam1 pmgmirror[430]: cluster syncronization finished  (0 errors, 0.15 seconds (files 0.11, database 0.03, config 0.00))
Jun 15 11:01:54 antispam1 pmgdaemon[484]: successful auth for user 'support@pmg'
Jun 15 11:02:03 antispam1 systemd[1]: Started Session 24448 of user root.
Jun 15 11:02:03 antispam1 systemd[1]: session-24448.scope: Succeeded.
Jun 15 11:02:03 antispam1 systemd[1]: Started Session 24449 of user root.
Jun 15 11:02:03 antispam1 systemd[1]: session-24449.scope: Succeeded.
Jun 15 11:02:05 antispam1 systemd[1]: clamav-daemon.service: Main process exited, code=killed, status=9/KILL
Jun 15 11:02:05 antispam1 systemd[1]: clamav-daemon.service: Failed with result 'signal'.
Jun 15 11:02:45 antispam1 pmgmirror[430]: starting cluster syncronization

Here is the current memory usage (the LXC container now has 12 GB of memory):
Code:
top - 13:59:38 up 20 days, 23:17,  1 user,  load average: 3.97, 4.66, 4.70
Tasks: 139 total,   2 running, 137 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  12288.0 total,   2203.0 free,   5118.2 used,   4966.8 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.   7169.8 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    339 clamav    20   0 1978296   1.5g   9096 S   0.7  12.6 374:15.73 clamd
    479 root      20   0 1544376   1.4g  11704 S   0.0  11.4   1:46.52 pmgdaemon worke
    475 root      20   0  686692 575792  11540 S   0.0   4.6   1:12.56 pmgdaemon worke
     56 root      20   0  457228 382604 242788 S   0.0   3.0  31:50.85 systemd-journal
 954248 root      20   0  341716 262152  12212 S   0.0   2.1   0:04.95 pmg-smtp-filter
 954241 root      20   0  340804 261408  12240 S   8.6   2.1   0:06.14 pmg-smtp-filter
 954307 root      20   0  340232 260688  12352 S   0.0   2.1   0:03.41 pmg-smtp-filter
 954314 root      20   0  340200 260596  12208 S   0.0   2.1   0:03.41 pmg-smtp-filter
    433 root      20   0  327760 246996  10776 S   0.0   2.0   6:22.07 pmg-smtp-filter
 263571 www-data  20   0  285632 202352  12216 S   0.0   1.6   0:24.13 pmgproxy worker
 267742 www-data  20   0  241132 159252  12252 S   0.0   1.3   0:19.80 pmgproxy worker
    477 root      20   0  200616 118780  11340 S   0.0   0.9   0:57.65 pmgdaemon worke
    563 www-data  20   0  177048 111544  17552 S   0.0   0.9   0:21.92 pmgproxy
 824202 www-data  20   0  189376 108212  11904 S   0.0   0.9   0:02.85 pmgproxy worker
 
Hmm - a pmgdaemon worker using 1.4 GB is odd (at least I haven't seen anything like it until now) - is there anything specific you do with the REST API?
How is the memory usage if you restart the pmgdaemon service (does it always climb back to 1.4 GB)?
 
Bash:
root@antispam1:~# ps aux | grep pmgdaemon
root         474  0.0  0.7 176728 95156 ?        Ss   Jun15   0:22 pmgdaemon
root         475  0.0  4.5 686692 575956 ?       S    Jun15   1:18 pmgdaemon worker
root         477  0.0  0.9 200616 118780 ?       S    Jun15   1:02 pmgdaemon worker
root         479  0.0 11.4 1544376 1439520 ?     S    Jun15   1:51 pmgdaemon worker
root@antispam1:~# systemctl restart pmgdaemon
root@antispam1:~# ps aux | grep pmgdaemon
root     1499289  0.0  0.7 176740 96940 ?        Ss   16:07   0:00 pmgdaemon
root     1499290  0.0  0.7 177008 97576 ?        S    16:07   0:00 pmgdaemon worker
root     1499291  0.0  0.7 177008 97576 ?        S    16:07   0:00 pmgdaemon worker
root     1499292  0.0  0.7 177008 97576 ?        S    16:07   0:00 pmgdaemon worker
We don't do anything with the REST API. Restarting the daemon does free the memory. I'll keep an eye on pmgdaemon over the next few days. Any thoughts on why Proxmox VE does not pick up the memory demand of the LXC container?
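
For reference, this is roughly how I plan to watch it - a minimal sketch; the script path, log file and cron schedule are just examples:

Bash:
#!/bin/sh
# /usr/local/sbin/pmgdaemon-memlog.sh (example path): append a timestamped
# snapshot of pmgdaemon memory usage, run e.g. every 10 minutes via /etc/cron.d:
#   */10 * * * * root /usr/local/sbin/pmgdaemon-memlog.sh
{
    date '+%Y-%m-%d %H:%M:%S'
    ps aux | grep '[p]mgdaemon'
    echo
} >> /var/log/pmgdaemon-mem.log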
 
Code:
root@antispam1:~# ps aux | grep pmgdaemon
root     1499289  0.0  0.8 176796 107320 ?       Ss   Jul08   0:05 pmgdaemon
root     1687392  0.0  0.8 189548 107652 ?       S    Jul09   0:10 pmgdaemon worker
root     1687393  0.0  0.8 189384 107752 ?       S    Jul09   0:07 pmgdaemon worker
root     1687394  0.0  0.8 178024 102696 ?       S    Jul09   0:08 pmgdaemon worker
 
So the memory usage of pmgdaemon remains sane - did you get another OOM kill?
See the graph in the first post. PVE says the RAM usage of the LXC container is 4 GB; why does the OOM killer get invoked then?
It can happen that something inside the container starts using much more memory in a very short timeframe, so that the spike does not get picked up by the graph (the usage is only sampled at intervals)...

If the OOM kills happen again, a bit more context from the logs would be helpful (i.e. more than just the OOM lines, and also the journal from inside the container).
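
Something along these lines would capture enough context next time (the timestamps below are placeholders - use the time of the actual OOM event):

Bash:
# Inside the container: full journal around the kill, not only the OOM lines
journalctl --since "2020-06-15 10:55" --until "2020-06-15 11:05"

# On the PVE host: the complete OOM report - the lines before "Killed process"
# list every task in the cgroup together with its RSS at that moment
dmesg -T | grep -i -B 40 'killed process'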
 
"It can happen that something inside the container starts using much more memory in a very short timeframe, so that the spike does not get picked up by the graph..."
And what would that be? A big email? "It can happen that something..." does not help me build trust in this product.
"A bit more context from the logs would be helpful (i.e. more than just the OOM lines, and also the journal from inside the container)."
I posted a complete log; the Postfix log messages are not relevant, nor suitable for posting online.
 
