[LXC Stability] Cached memory leaks (killing processes (wtf!))

jinjer

Renowned Member
Oct 4, 2010
204
7
83
I am having stability issues with LXC containers after migration from OpenVZ

What happens is that when the memory is all used in a container, the OOM kicks in and kills processes (see attached example).

If I try to restart a killed process it will usually fail. A reboot of the container is necessary (as if there is a memory leak somewhere and memory is never freed).

This is a huge issue.

Code:
[1878424.274466] df_inode invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
[1878424.274471] df_inode cpuset=15004 mems_allowed=0-1
[1878424.274477] CPU: 6 PID: 15634 Comm: df_inode Tainted: P           O    4.2.6-1-pve #1
[1878424.274479] Hardware name: Supermicro X8DTT-H/X8DTT-H, BIOS 2.1b       10/28/2011
[1878424.274481]  0000000000000000 000000001156d152 ffff880060a2fcb8 ffffffff818013d8
[1878424.274483]  0000000000000000 ffff8806c18c5780 ffff880060a2fd38 ffffffff817ff8fa
[1878424.274485]  ffff880328ce4378 ffff8806e685bc30 0000000000000015 0000000000000000
[1878424.274488] Call Trace:
[1878424.274498]  [<ffffffff818013d8>] dump_stack+0x45/0x57
[1878424.274500]  [<ffffffff817ff8fa>] dump_header+0xaf/0x238
[1878424.274505]  [<ffffffff81185713>] oom_kill_process+0x1e3/0x3c0
[1878424.274509]  [<ffffffff811f09f1>] mem_cgroup_oom_synchronize+0x531/0x600
[1878424.274515]  [<ffffffff811ecbe0>] ? mem_cgroup_css_online+0x250/0x250
[1878424.274517]  [<ffffffff81185ee3>] pagefault_out_of_memory+0x13/0x80
[1878424.274522]  [<ffffffff8106735f>] mm_fault_error+0x7f/0x160
[1878424.274524]  [<ffffffff81067823>] __do_page_fault+0x3e3/0x410
[1878424.274527]  [<ffffffff81067872>] do_page_fault+0x22/0x30
[1878424.274530]  [<ffffffff8180a2c8>] page_fault+0x28/0x30
[1878424.274532] Task in /lxc/15004 killed as a result of limit of /lxc/15004
[1878424.274536] memory: usage 655360kB, limit 655360kB, failcnt 28963
[1878424.274537] memory+swap: usage 917424kB, limit 917504kB, failcnt 646
[1878424.274538] kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
[1878424.274539] Memory cgroup stats for /lxc/15004: cache:568488KB rss:86872KB rss_huge:0KB mapped_file:24756KB dirty:0KB writeback:0KB swap:262064KB inactive_anon:354640KB active_anon:300608KB inactive_file:0KB active_file:0KB unevictable:0KB
[1878424.274551] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
[1878424.274724] [13751]     0 13751     1311      704       5       2       61             0 systemd
[1878424.274726] [14338]     0 14338     2024      746       7       2       89         -1000 sshd
[1878424.274728] [14371]     0 14371      600      223       5       2       23             0 inetd
[1878424.274730] [14388]     0 14388      852      352       5       2       39             0 cron
[1878424.274732] [14421]     0 14421      940      437       5       2       33             0 systemd-logind
[1878424.274734] [14486]     0 14486     3790      481       7       2       57             0 monit
[1878424.274736] [14488]   105 14488     1309      427       6       2       75          -900 dbus-daemon
[1878424.274738] [14505]     0 14505     8003      594      11       2       87             0 rsyslogd
[1878424.274740] [14573]     0 14573      606      283       5       2        2             0 agetty
[1878424.274742] [14578]     0 14578      606      300       5       2        1             0 agetty
[1878424.274744] [14583]     0 14583      606      297       5       2        1             0 agetty
[1878424.274746] [14596]     0 14596     3041     1966      11       2      404             0 munin-node
[1878424.274794] [ 8519]     0  8519    13326    10593      28       2        0             0 systemd-journal
[1878424.274797] [10977]   104 10977    61849    19992      72       2        0             0 named
[1878424.274866] [15564]     0 15564     3041     1894      11       2      390             0 /usr/sbin/munin
[1878424.274871] [15634] 65534 15634     1487      990       7       2        0             0 df_inode
[1878424.274873] Memory cgroup out of memory: Kill process 10977 (named) score 87 or sacrifice child
[1878424.274954] Killed process 10977 (named) total-vm:247396kB, anon-rss:75876kB, file-rss:4092kB
 
Dietmar. I agree, when working properly.

However, something is leaking memory with LXC containers and they become memory starved. This never occured on the *SAME* containers when they were running under OpenVZ. They would run for excess of 6 months at a time with some of then not being reboot for over a year.

With this in mind, after migration to LXC, I have to reboot containers daily to get them to run for the next 24 hrs. This might be due to LXC, or not, but is surely a problem that might bite your commercial customers.

I am attaching a sample of a container, wich is only a DNS resolver. It only has bind running and the monitoring processes. This has 512MB ram + 128MB swap allocated. Currently there's 655M used with *no* running processes. I'm attaching a "ps faxl" and "/proc/meminfo" output.

Tell me if you need more info to diagnose before I reboot it.

Code:
dnsd:~# ps faxl
F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
4     0     1     0  20   0   5244  2940 ep_pol Ss   ?          0:14 /sbin/init
4     0   316     1  20   0   8096  2984 poll_s Ss   ?          0:03 /usr/sbin/sshd -D
4     0  1432   316  20   0  11468  4008 -      Ss   ?          0:00  \_ sshd: root@pts/2
4     0  1453  1432  20   0   3896  2328 wait   Ss   pts/2      0:00      \_ -bash
0     0  1459  1453  20   0   2932   552 -      R+   pts/2      0:00          \_ ps faxl
4     0   333     1  20   0   2400   892 sigsus Ss   ?          0:00 /usr/sbin/inetd -i
4     0   342     1  20   0   3408  1408 hrtime Ss   ?          0:01 /usr/sbin/cron -f
4     0   360     1  20   0   3760  1792 ep_pol Ss   ?          0:02 /lib/systemd/systemd-logind
1     0   395     1  20   0  15160  1924 hrtime Sl   ?          0:58 /usr/bin/monit -c /etc/monit/monitrc
4   105   397     1  20   0   5236  1820 ep_pol Ss   ?          0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
4     0   406     1  20   0  32012  2376 poll_s Ssl  ?          2:04 /usr/sbin/rsyslogd -n
4     0   443     1  20   0   2424  1132 wait_w Ss+  pts/1      0:00 /sbin/agetty --noclear --keep-baud pts/1 115200 38400 9600 vt102
4     0   446     1  20   0   2424  1200 wait_w Ss+  pts/0      0:00 /sbin/agetty --noclear --keep-baud pts/0 115200 38400 9600 vt102
4     0   449     1  20   0   2424  1188 wait_w Ss+  console    0:00 /sbin/agetty --noclear --keep-baud console 115200 38400 9600 vt102
4     0  1435     1  20   0   4604  2312 ep_pol Ss   ?          0:00 /lib/systemd/systemd --user
5     0  1443  1435  20   0   6092   900 sigtim S    ?          0:00  \_ (sd-pam)

Code:
# cat /proc/meminfo
MemTotal:         655360 kB
MemFree:            1504 kB
MemAvailable:       1504 kB
Buffers:               0 kB
Cached:           648760 kB
SwapCached:            0 kB
Active:          4613056 kB
Inactive:        1186108 kB
Active(anon):    3609572 kB
Inactive(anon):  1037672 kB
Active(file):    1003484 kB
Inactive(file):   148436 kB
Unevictable:       82720 kB
Mlocked:           82720 kB
SwapTotal:        262144 kB
SwapFree:       18446744073709551588 kB
Dirty:               488 kB
Writeback:             0 kB
AnonPages:       3565108 kB
Mapped:           358736 kB
Shmem:           1121648 kB
Slab:            6454604 kB
SReclaimable:    1645480 kB
SUnreclaim:      4809124 kB
KernelStack:       23376 kB
PageTables:        62036 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    22276320 kB
Committed_AS:   11110476 kB
VmallocTotal:   34359738367 kB
VmallocUsed:     7780912 kB
VmallocChunk:   34332457992 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:      333184 kB
DirectMap2M:    27969536 kB

ADD: Looking at TOP you can clearly see that the processes are not using any memory. All memory is consumed by "cached mem". This is an issue if cached memory is not freed for allocating it to processes.

Code:
.
top - 09:13:44 up 11 days, 17:57,  1 user,  load average: 2.51, 2.27, 2.16
Tasks:  16 total,   1 running,  15 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.8 us,  0.8 sy,  0.0 ni, 87.0 id, 10.3 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:    655360 total,   653936 used,     1424 free,        0 buffers
KiB Swap:   262144 total,   262172 used, 42949672+free.   648868 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
    1 root      20   0    5228   2940   2200 S   0.0  0.4   0:14.90 systemd
  316 root      20   0    8096   2984   2876 S   0.0  0.5   0:03.21 sshd
  333 root      20   0    2400    892    888 S   0.0  0.1   0:00.00 inetd
  342 root      20   0    3408   1408   1340 S   0.0  0.2   0:01.84 cron
  360 root      20   0    3760   1792   1640 S   0.0  0.3   0:02.27 systemd-logind
  395 root      20   0   15160   1928   1768 S   0.0  0.3   0:58.83 monit
  397 message+  20   0    5236   1820   1688 S   0.0  0.3   0:00.12 dbus-daemon
  406 root      20   0   32012   2376   1528 S   0.0  0.4   2:04.10 rsyslogd
  443 root      20   0    2424   1132   1024 S   0.0  0.2   0:00.00 agetty
  446 root      20   0    2424   1200   1092 S   0.0  0.2   0:00.00 agetty
  449 root      20   0    2424   1188   1084 S   0.0  0.2   0:00.00 agetty
 1826 root      20   0   11332   4028   3324 S   0.0  0.6   0:00.02 sshd
 1829 root      20   0    4604   2432   2112 S   0.0  0.4   0:00.00 systemd
 1837 root      20   0    6076    900      0 S   0.0  0.1   0:00.00 (sd-pam)
 1847 root      20   0    3896   2324   1928 S   0.0  0.4   0:00.00 bash
 1852 root      20   0    3520   1680   1340 R   0.0  0.3   0:00.00 top
 
I believe I am running into the same issue with a Ubuntu 16.04 LXC container (built from the Proxmox provided ubuntu-16.04-standard_16.04-1_amd64.tar.gz template) on ZFS storage. The container runs just postfix and nagios-nrpe-server and the issue occurred after ~60 days uptime.

Code:
# free -m
              total        used        free      shared  buff/cache   available
Mem:            512          19           2        1012         490           2
Swap:           512         512           0

Code:
# cat /proc/meminfo
MemTotal:         524288 kB
MemFree:            2240 kB
MemAvailable:       2240 kB
Buffers:               0 kB
Cached:           502232 kB
SwapCached:            0 kB
Active:           247972 kB
Inactive:         262044 kB
Active(anon):     240500 kB
Inactive(anon):   260532 kB
Active(file):       7472 kB
Inactive(file):     1512 kB
Unevictable:           0 kB
Mlocked:           54024 kB
SwapTotal:        524288 kB
SwapFree:              0 kB
Dirty:               176 kB
Writeback:             0 kB
AnonPages:      11997020 kB
Mapped:           439708 kB
Shmem:           1036736 kB
Slab:               0 kB
SReclaimable:          0 kB
SUnreclaim:            0 kB
KernelStack:       20176 kB
PageTables:        56724 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    82656524 kB
Committed_AS:   20897624 kB
VmallocTotal:   34359738367 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
HardwareCorrupted:     0 kB
AnonHugePages:   8589312 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:     2962852 kB
DirectMap2M:    148019200 kB

Code:
top - 16:15:15 up 60 days, 14:21,  1 user,  load average: 2.47, 1.67, 1.37
Tasks:  18 total,   1 running,  17 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.4 us,  1.7 sy,  0.0 ni, 95.6 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
KiB Mem :   524288 total,     1880 free,    20208 used,   502200 buff/cache
KiB Swap:   524288 total,        0 free,   524288 used.     1880 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                               
30628 root      20   0   35276   4356   4084 S   0.0  0.8   0:00.04 systemd-journal                                                                                                                       
    1 root      20   0   37340   2428   1372 S   0.0  0.5   0:20.36 systemd                                                                                                                               
30716 root      20   0   38756   2240   1712 R   0.0  0.4   0:00.00 top                                                                                                                                   
30437 root      20   0   18528   2032   1264 S   0.0  0.4   0:00.05 bash                                                                                                                                 
30653 nagios    20   0   23844   1760   1260 S   0.0  0.3   0:00.00 nrpe                                                                                                                                 
30412 support   20   0   19140   1388      8 S   0.0  0.3   0:00.04 bash                                                                                                                                 
30411 support   20   0   95496   1020      0 S   0.0  0.2   0:00.13 sshd                                                                                                                                 
30390 root      20   0   95496    968     32 S   0.0  0.2   0:00.01 sshd                                                                                                                                 
  193 root      20   0   65520    744    640 S   0.0  0.1   0:08.36 sshd                                                                                                                                 
   65 message+  20   0   42992    588    388 S   0.0  0.1   0:00.24 dbus-daemon                                                                                                                           
30435 root      20   0   50004    580     20 S   0.0  0.1   0:00.00 sudo                                                                                                                                 
30630 syslog    20   0  256420    508     28 S   0.0  0.1   0:00.03 rsyslogd                                                                                                                             
   72 root      20   0   28548    464    312 S   0.0  0.1   0:04.37 systemd-logind                                                                                                                       
30436 root      20   0   49344    404      0 S   0.0  0.1   0:00.00 su                                                                                                                                   
   75 root      20   0   26232    148     72 S   0.0  0.0   0:07.86 cron                                                                                                                                 
  198 root      20   0   12844      0      0 S   0.0  0.0   0:00.00 agetty                                                                                                                               
  199 root      20   0   12844      0      0 S   0.0  0.0   0:00.00 agetty                                                                                                                               
  200 root      20   0   12844      0      0 S   0.0  0.0   0:00.00 agetty

Code:
$ ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.2  37340  1168 ?        Ss   Jul16   0:20 /sbin/init
message+    65  0.0  0.0  42992   188 ?        Ss   Jul16   0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root        72  0.0  0.0  28548   128 ?        Ss   Jul16   0:04 /lib/systemd/systemd-logind
root        75  0.0  0.0  26232    76 ?        Ss   Jul16   0:07 /usr/sbin/cron -f
root       193  0.0  0.0  65520   304 ?        Ss   Jul16   0:08 /usr/sbin/sshd -D
root       198  0.0  0.0  12844     0 pts/1    Ss+  Jul16   0:00 /sbin/agetty --noclear --keep-baud pts/1 115200 38400 9600 vt220
root       199  0.0  0.0  12844     0 pts/0    Ss+  Jul16   0:00 /sbin/agetty --noclear --keep-baud pts/0 115200 38400 9600 vt220
root       200  0.0  0.0  12844     0 pts/1    Ss+  Jul16   0:00 /sbin/agetty --noclear --keep-baud console 115200 38400 9600 vt220
root     30381  0.0  0.8  35276  4456 ?        Ss   15:57   0:00 /lib/systemd/systemd-journald
syslog   30382  0.0  0.0 256420   424 ?        Ssl  15:57   0:00 /usr/sbin/rsyslogd -n
root     30390  0.0  0.1  95496   968 ?        Ss   16:04   0:00 sshd: support [priv]
support  30392  0.0  0.1  44984   532 ?        Ss   16:04   0:00 /lib/systemd/systemd --user
support  30393  0.0  0.2  60792  1232 ?        S    16:04   0:00 (sd-pam)
support  30411  0.0  0.1  95496   992 ?        S    16:04   0:00 sshd: support@pts/2
support  30412  0.0  0.5  19140  2812 pts/2    Ss   16:04   0:00 -bash
support  30433  0.0  0.3  34424  1756 pts/2    R+   16:05   0:00 ps aux

Code:
# pveversion -v
proxmox-ve: 5.0-21 (running kernel: 4.10.15-1-pve)
pve-manager: 5.0-31 (running version: 5.0-31/27769b1f)
pve-kernel-4.10.17-2-pve: 4.10.17-20
pve-kernel-4.10.15-1-pve: 4.10.15-15
pve-kernel-4.10.17-3-pve: 4.10.17-21
pve-kernel-4.10.11-1-pve: 4.10.11-9
pve-kernel-4.10.17-1-pve: 4.10.17-18
libpve-http-server-perl: 2.0-6
lvm2: 2.02.168-pve3
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-12
qemu-server: 5.0-15
pve-firmware: 2.0-2
libpve-common-perl: 5.0-16
libpve-guest-common-perl: 2.0-11
libpve-access-control: 5.0-6
libpve-storage-perl: 5.0-14
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.0-9
pve-qemu-kvm: 2.9.0-5
pve-container: 2.0-15
pve-firewall: 3.0-2
pve-ha-manager: 2.0-2
ksm-control-daemon: 1.2-2
glusterfs-client: 3.8.8-1
lxc-pve: 2.0.8-3
lxcfs: 2.0.7-pve4
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
zfsutils-linux: 0.6.5.11-pve17~bpo90

Code:
# cat /var/lib/lxc/105/config
lxc.arch = amd64
lxc.include = /usr/share/lxc/config/ubuntu.common.conf
lxc.monitor.unshare = 1
lxc.tty = 2
lxc.environment = TERM=linux
lxc.utsname = prod-smtp-01
lxc.cgroup.memory.limit_in_bytes = 536870912
lxc.cgroup.memory.memsw.limit_in_bytes = 1073741824
lxc.cgroup.cpu.shares = 1024
lxc.rootfs = /var/lib/lxc/105/rootfs
lxc.network.type = veth
lxc.network.veth.pair = veth105i0
lxc.network.hwaddr = F6:F8:56:xx:xx:xx
lxc.network.name = eth0
lxc.cgroup.cpuset.cpus = 14

Please let me know if there is any other information I can provide. I have since rebooted the container and things are back to normal again
 
We might be suffering from the same, and this might be also related:
https://bugzilla.proxmox.com/show_bug.cgi?id=1597
Interesting...I never thought to run a 'df' when the container was exhibiting the problem to check tmpfs usage. I've since increased the memory limit of the container from 512MB to 1GB and so far it hasn't seemed to run out of memory, although maybe it will just take twice as long to happen.
 
Interesting...I never thought to run a 'df' when the container was exhibiting the problem to check tmpfs usage. I've since increased the memory limit of the container from 512MB to 1GB and so far it hasn't seemed to run out of memory, although maybe it will just take twice as long to happen.

Yep I think @mailinglists figured it out. The issue happened with one of my containers with a small RAM allocation (512M RAM, 512M swap) again. The /run tmpfs mount was using 945M and swap was nearly full:
Code:
tmpfs                          71G  945M   70G   2% /run
Code:
Swap:           512         494          17
'systemd-journal' was using the most memory. I set the following in '/etc/systemd/journald.conf':
Code:
RuntimeMaxFileSize=10M
RuntimeMaxFiles=5
And ran:
Code:
systemctl kill --kill-who=main --signal=SIGUSR2 systemd-journald.service
to force a log file rotation.

So we'll see if that resolves it.
 
Last edited:
Also using the Ubuntu 16.04 template and same issue but tempfs doesn't seems full:

Code:
Filesystem                    Size  Used Avail Use% Mounted on
rpool/data/subvol-121-disk-1   10G  1.1G  9.0G  11% /
none                          492K     0  492K   0% /dev
tmpfs                          63G     0   63G   0% /dev/shm
tmpfs                          63G  785M   63G   2% /run
tmpfs                         5.0M     0  5.0M   0% /run/lock
tmpfs                          63G     0   63G   0% /sys/fs/cgroup
tmpfs                          13G     0   13G   0% /run/user/1005
tmpfs                          13G     0   13G   0% /run/user/1000

Code:
              total        used        free      shared  buff/cache   available
Mem:           1024         209          21       17416         792          21
Swap:             0           0           0

Here's lxc-info from the host:
Code:
Name:           121
State:          RUNNING
PID:            9240
IP:             192.168.1.240
CPU use:        23594.28 seconds
BlkIO use:      0 bytes
Memory use:     1017.24 MiB
KMem use:       27.16 MiB
Link:           veth121i0
 TX bytes:      506.83 MiB
 RX bytes:      28.17 GiB
 Total bytes:   28.66 GiB

webui report 20% ram usage. Restarting the containers is the only temporary fix as it will fill up again and the OOM will be called
 
Also using the Ubuntu 16.04 template and same issue but tempfs doesn't seems full:

Code:
Filesystem                    Size  Used Avail Use% Mounted on
tmpfs                          63G  785M   63G   2% /run

Code:
              total        used        free      shared  buff/cache   available
Mem:           1024         209          21       17416         792          21
Swap:             0           0           0

tmpfs is not full but using 785MB of your 1024MB RAM, which is a lot. Try deleting files under '/run/log/journal/$UID/' (you can leave the newest file named 'system.journal') and see if the 'available' number of RAM megabytes increases.
 
tmpfs is not full but using 785MB of your 1024MB RAM, which is a lot. Try deleting files under '/run/log/journal/$UID/' (you can leave the newest file named 'system.journal') and see if the 'available' number of RAM megabytes increases.

You are right. It did make a good difference.

Before:
Code:
             total        used        free      shared  buff/cache   available
Mem:           2048        1057           6       16652         983           6
Swap:             0           0           0

Filesystem                    Size  Used Avail Use% Mounted on
rpool/data/subvol-119-disk-1  8.0G  1.9G  6.2G  24% /
none                          492K     0  492K   0% /dev
tmpfs                          63G     0   63G   0% /dev/shm
tmpfs                          63G  885M   63G   2% /run
tmpfs                         5.0M     0  5.0M   0% /run/lock
tmpfs                          63G     0   63G   0% /sys/fs/cgroup
tmpfs                          13G     0   13G   0% /run/user/1001
tmpfs                          13G     0   13G   0% /run/user/1005i

After:
Code:
              total        used        free      shared  buff/cache   available
Mem:           2048        1056         775       15882         215         775
Swap:             0           0           0

Filesystem                    Size  Used Avail Use% Mounted on
rpool/data/subvol-119-disk-1  8.0G  1.9G  6.2G  24% /
none                          492K     0  492K   0% /dev
tmpfs                          63G     0   63G   0% /dev/shm
tmpfs                          63G  117M   63G   1% /run
tmpfs                         5.0M     0  5.0M   0% /run/lock
tmpfs                          63G     0   63G   0% /sys/fs/cgroup
tmpfs                          13G     0   13G   0% /run/user/1001
tmpfs                          13G     0   13G   0% /run/user/1005

So I'm assuming the workaround you posted in your other post would work then as it would prevent journald from caching too much ?
 
So I'm assuming the workaround you posted in your other post would work then as it would prevent journald from caching too much ?

Correct. Setting RuntimeMaxFileSize and RuntimeMaxFiles in journald.conf (see 'man journald.conf') will restrict how much space under /run/log/journal is used by journald.

From Manual page journald.conf(5):
Code:
The options prefixed with "Runtime" apply to the journal files when stored on a volatile in-memory file system, more specifically /run/log/journal.

Your journal logs should still be written to disk at /var/log/journal normally. I don't think it's necessary after modifying journald.conf, but you can force a journal log file rotation with:

Code:
systemctl kill --kill-who=main --signal=SIGUSR2 systemd-journald.service

I also updated bugs https://bugzilla.proxmox.com/show_bug.cgi?id=946 and https://bugzilla.proxmox.com/show_bug.cgi?id=1597 with this information in hopes it will help others.
 
journalctl(1):
Code:
--vacuum-size=, --vacuum-time=, --vacuum-files=
           Removes archived journal files until the disk space they use falls below the specified size

Which makes me think this option operates on '/var/log/journal' (on disk) rather than '/run/log/journal' (tmpfs). I need to keep my log history so I can't confirm this by testing.

This would probably work, although you might still want to manually clean up the files in '/run/log/journal':
Code:
systemctl force-reload systemd-journald
 
Which makes me think this option operates on '/var/log/journal' (on disk) rather than '/run/log/journal' (tmpfs). I need to keep my log history so I can't confirm this by testing.

You are right, it is worded so you might think it only operate on disk but after testing it does in fact cleanup /run/log/journal

So we could use the following:

First cleanup old entries:
Code:
journalctl --vacuum-size=50M

Then edit the config
Code:
vi /etc/systemd/journald.conf

At last, restart to apply the new config
Code:
systemctl restart systemd-journald
 
You are right, it is worded so you might think it only operate on disk but after testing it does in fact cleanup /run/log/journal

Ah, good to know.

At last, restart to apply the new config
Code:
systemctl restart systemd-journald

This is worth testing, but this thread says restarting the 'systemd-journald' process will result in logging stopping: https://unix.stackexchange.com/questions/379288/reloading-systemd-journald-config
I can confirm however that sending SIGUSR2 instead will not cause this issue.
 
Ah, good to know.



This is worth testing, but this thread says restarting the 'systemd-journald' process will result in logging stopping: https://unix.stackexchange.com/questions/379288/reloading-systemd-journald-config
I can confirm however that sending SIGUSR2 instead will not cause this issue.

This is weird as it worked fined for me and the logging kept working. The post mentions some changes (see below) in 2016 related to the daemon. Maybe this was included in Ubuntu 16.

The systemd people made a bodge to work around this in 2016. It involved adding the ability for processes to push arbitrary open file descriptors of their own choosing into process #1, and pull them later. systemd-journald does this with the server-end connections to the clients that it is logging, so that they remain open over a restart of the logging dæmon itself.
 
re: vi /etc/systemd/journald.conf ,

we are starting out with these settings. could you tell me what changes you've made ?
Code:
RuntimeMaxFileSize=50
RuntimeMaxFiles=100

PS: solving the lxc memory thing is huge.
 
more info on our systems

pve host as 126GB ram, 60GB swap and weekly pve memory graph shows there has never been more then 50% mem usage.

a cron check checks for oom on all linux systems
Code:
egrep -e oom-killer /var/log/kern.log

when there is a match i get an email

here are todays reports:
Code:
Feb 14 05:15:05 etherpad kernel: [2117716.125145] zabbix_agentd invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL),
nodemask=(null),  order=0, oom_score_adj=0
Feb 14 05:50:58 etherpad kernel: [2119869.396444] hpcups invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null),
order=0, oom_score_adj=0
 ***
               
        out of memory occurred. look into to why and try to prevent future occurrences.
 
 here /var/log/kern.log :
Feb 14 05:00:04 etherpad kernel: [2116814.939386] EXT4-fs (rbd10): mounted filesystem without journal. Opts: noload
Feb 14 05:00:08 etherpad kernel: [2116819.663722] EXT4-fs error (device rbd10): ext4_lookup:1580: inode #1056080: comm tar: deleted
inode referenced: 1054732
Feb 14 05:09:31 etherpad kernel: [2117381.779336] audit: type=1400 audit(1518602971.013:8255): apparmor="DENIED"
operation="file_lock" profile="lxc-container-default-cgns" pid=11787 comm="(ionclean)" family="unix" sock_type="dgram" protocol=0
addr=none
Feb 14 05:09:40 etherpad kernel: [2117391.077220] audit: type=1400 audit(1518602980.310:8258): apparmor="DENIED"
operation="file_lock" profile="lxc-container-default-cgns" pid=13248 comm="(ionclean)" family="unix" sock_type="dgram" protocol=0
addr=none
Feb 14 05:15:05 etherpad kernel: [2117716.125145] zabbix_agentd invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL),
nodemask=(null),  order=0, oom_score_adj=0
Feb 14 05:15:05 etherpad kernel: [2117716.125170]  oom_kill_process+0x208/0x410
Feb 14 05:15:05 etherpad kernel: [2117716.125185]  ? page_fault+0x36/0x60
Feb 14 05:15:05 etherpad kernel: [2117716.125195] Task in /lxc/129/ns killed as a result of limit of /lxc/129
Feb 14 05:15:05 etherpad kernel: [2117716.125684] [10399]   102 10399     8838        1      21       3      116             0
dnsmasq
Feb 14 05:15:05 etherpad kernel: [2117716.125697] [11678]     0 11678    26620       42      22       3      100             0
monit
Feb 14 05:15:05 etherpad kernel: [2117716.125777] [12513]   112 12513    18422       20      38       3      168             0
zabbix_agentd
Feb 14 05:15:05 etherpad kernel: [2117716.125899] [26613]     0 26613    13940      196      31       3        0             0 lpd
Feb 14 05:15:05 etherpad kernel: [2117716.125906] [26685]     7 26685     1085       23       6       3        0             0 sh
..
Feb 14 05:50:58 etherpad kernel: [2119869.396444] hpcups invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null),
order=0, oom_score_adj=0
Feb 14 05:50:58 etherpad kernel: [2119869.396462]  dump_stack+0x63/0x8b
Feb 14 05:50:58 etherpad kernel: [2119869.396479]  mm_fault_error+0x8f/0x190
Feb 14 05:50:58 etherpad kernel: [2119869.396491] RBP: 00007ffdd75cc930 R08: 00007fe651a5dfc8 R09: 00007fe651a61a60
Feb 14 05:50:58 etherpad kernel: [2119869.396517] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Feb 14 05:50:58 etherpad kernel: [2119869.396979] [10704]   100 10704     9571       23      23       3      115             0 qmgr
Feb 14 05:50:58 etherpad kernel: [2119869.397056] [12509]   112 12509    18422       69      38       3      135             0
zabbix_agentd
Feb 14 05:50:58 etherpad kernel: [2119869.397124] [13338]     7 13338    55285    14024     112       4    25430             0 dbus
Feb 14 05:50:58 etherpad kernel: [2119869.397187] [27441]     7 27441     4614      252      15       3        0             0
brother_lpdwrap
Feb 14 05:50:58 etherpad kernel: [2119869.397198] [27517]     7 27517     1581       23       6       2        0             0
rawtobr3
Feb 14 05:50:58 etherpad kernel: [2119869.398433] Killed process 11308 (dbus) total-vm:341392kB, anon-rss:56140kB, file-rss:0kB,
shmem-rss:0kB

               
                ***
               
 
 here /var/log/kern.log :
Feb 14 05:00:04 etherpad kernel: [2116814.939386] EXT4-fs (rbd10): mounted filesystem without journal. Opts: noload
Feb 14 05:00:08 etherpad kernel: [2116819.663722] EXT4-fs error (device rbd10): ext4_lookup:1580: inode #1056080: comm tar: deleted inode referenced: 1054732
Feb 14 05:09:31 etherpad kernel: [2117381.779336] audit: type=1400 audit(1518602971.013:8255): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=11787 comm="(ionclean)"
family="unix" sock_type="dgram" protocol=0 addr=none
Feb 14 05:09:40 etherpad kernel: [2117391.077220] audit: type=1400 audit(1518602980.310:8258): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=13248 comm="(ionclean)"
family="unix" sock_type="dgram" protocol=0 addr=none
Feb 14 05:15:05 etherpad kernel: [2117716.125145] zabbix_agentd invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null),  order=0, oom_score_adj=0
Feb 14 05:15:05 etherpad kernel: [2117716.125170]  oom_kill_process+0x208/0x410
Feb 14 05:15:05 etherpad kernel: [2117716.125185]  ? page_fault+0x36/0x60
Feb 14 05:15:05 etherpad kernel: [2117716.125195] Task in /lxc/129/ns killed as a result of limit of /lxc/129
Feb 14 05:15:05 etherpad kernel: [2117716.125684] [10399]   102 10399     8838        1      21       3      116             0 dnsmasq
Feb 14 05:15:05 etherpad kernel: [2117716.125697] [11678]     0 11678    26620       42      22       3      100             0 monit
Feb 14 05:15:05 etherpad kernel: [2117716.125777] [12513]   112 12513    18422       20      38       3      168             0 zabbix_agentd
Feb 14 05:15:05 etherpad kernel: [2117716.125899] [26613]     0 26613    13940      196      31       3        0             0 lpd
Feb 14 05:15:05 etherpad kernel: [2117716.125906] [26685]     7 26685     1085       23       6       3        0             0 sh
Feb 14 05:39:31 etherpad kernel: [2119181.832563] audit: type=1400 audit(1518604771.013:8263): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=32440 comm="(ionclean)"
family="unix" sock_type="dgram" protocol=0 addr=none
Feb 14 05:39:40 etherpad kernel: [2119191.129808] audit: type=1400 audit(1518604780.310:8266): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=1876 comm="(ionclean)"
family="unix" sock_type="dgram" protocol=0 addr=none
Feb 14 05:50:58 etherpad kernel: [2119869.396444] hpcups invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null),  order=0, oom_score_adj=0
Feb 14 05:50:58 etherpad kernel: [2119869.396462]  dump_stack+0x63/0x8b
Feb 14 05:50:58 etherpad kernel: [2119869.396479]  mm_fault_error+0x8f/0x190
Feb 14 05:50:58 etherpad kernel: [2119869.396491] RBP: 00007ffdd75cc930 R08: 00007fe651a5dfc8 R09: 00007fe651a61a60
Feb 14 05:50:58 etherpad kernel: [2119869.396517] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Feb 14 05:50:58 etherpad kernel: [2119869.396979] [10704]   100 10704     9571       23      23       3      115             0 qmgr
Feb 14 05:50:58 etherpad kernel: [2119869.397056] [12509]   112 12509    18422       69      38       3      135             0 zabbix_agentd
Feb 14 05:50:58 etherpad kernel: [2119869.397124] [13338]     7 13338    55285    14024     112       4    25430             0 dbus
Feb 14 05:50:58 etherpad kernel: [2119869.397187] [27441]     7 27441     4614      252      15       3        0             0 brother_lpdwrap
Feb 14 05:50:58 etherpad kernel: [2119869.397198] [27517]     7 27517     1581       23       6       2        0             0 rawtobr3
Feb 14 05:50:58 etherpad kernel: [2119869.398433] Killed process 11308 (dbus) total-vm:341392kB, anon-rss:56140kB, file-rss:0kB, shmem-rss:0kB


imap
Feb 14 06:28:21 imap kernel: [2122112.359857] http invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null),  order=0,
oom_score_adj=0
 ***
               
        out of memory occurred. look into to why and try to prevent future occurrances.  
               
                ***
               
 here /var/log/kern.log :
Feb 14 05:00:08 imap kernel: [2116819.666410] Buffer I/O error on dev rbd10, logical block 0, lost sync page write
Feb 14 05:09:31 imap kernel: [2117381.779333] audit: type=1400 audit(1518602971.013:8254): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=11787 comm="(ionclean)"
family="unix" sock_type="dgram" protocol=0 addr=none
Feb 14 05:09:40 imap kernel: [2117391.077222] audit: type=1400 audit(1518602980.310:8259): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=13248 comm="(ionclean)"
family="unix" sock_type="dgram" protocol=0 addr=none
Feb 14 05:15:05 imap kernel: [2117716.125154] Hardware name: Supermicro X9SRL-F/X9SRL-F, BIOS 3.2 01/16/2015
Feb 14 05:15:05 imap kernel: [2117716.125175]  mem_cgroup_oom_synchronize+0x31e/0x340
Feb 14 05:15:05 imap kernel: [2117716.125189] RSP: 002b:00007ffe610b3fd8 EFLAGS: 00010246
Feb 14 05:15:05 imap kernel: [2117716.125202] kmem: usage 18316kB, limit 9007199254740988kB, failcnt 0
Feb 14 05:15:05 imap kernel: [2117716.125688] [10467]     0 10467    17962       31      39       3      192             0 cups-browsed
Feb 14 05:15:05 imap kernel: [2117716.125769] [12444]     0 12444    52214    11963      72       3      791             0 cupsd
Feb 14 05:15:05 imap kernel: [2117716.125833] [11308]     7 11308    78616     7279     157       3    55488             0 dbus
Feb 14 05:15:05 imap kernel: [2117716.125867] [ 1148]   100  1148     9559       18      24       3      117             0 pickup
Feb 14 05:15:05 imap kernel: [2117716.127204] Killed process 8613 (dbus) total-vm:337432kB, anon-rss:29112kB, file-rss:0kB, shmem-rss:0kB
Feb 14 05:39:40 imap kernel: [2119191.129810] audit: type=1400 audit(1518604780.310:8267): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=1876 comm="(ionclean)"
family="unix" sock_type="dgram" protocol=0 addr=none
Feb 14 05:50:58 imap kernel: [2119869.396453] CPU: 2 PID: 27523 Comm: hpcups Tainted: P           O    4.13.13-5-pve #1
Feb 14 05:50:58 imap kernel: [2119869.396468]  ? mem_cgroup_scan_tasks+0xc4/0xf0
Feb 14 05:50:58 imap kernel: [2119869.396481]  do_page_fault+0x22/0x30
Feb 14 05:50:58 imap kernel: [2119869.396492] R13: 00007fe651a40000 R14: 0000000000000000 R15: 000000000000000e
Feb 14 05:50:58 imap kernel: [2119869.396969] [10297]     0 10297    64669       70      28       4      111             0 rsyslogd
Feb 14 05:50:58 imap kernel: [2119869.396981] [10741]     0 10741     7068       22      20       3       50             0 systemd-logind
Feb 14 05:50:58 imap kernel: [2119869.397058] [12512]   112 12512    18422       68      38       3      136             0 zabbix_agentd
Feb 14 05:50:58 imap kernel: [2119869.397134] [15430]     7 15430    42415    14023      87       3    12548             0 dbus
Feb 14 05:50:58 imap kernel: [2119869.397189] [27499]     7 27499    26687     2058      54       3        0             0 gs
Feb 14 05:50:58 imap kernel: [2119869.397201] [27522]     7 27522    17783      116      36       3        0             0 gstoraster
Feb 14 06:09:40 imap kernel: [2120991.182627] audit: type=1400 audit(1518606580.310:8275): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=4585 comm="(ionclean)"
family="unix" sock_type="dgram" protocol=0 addr=none
Feb 14 06:11:20 imap kernel: [2121091.700137] audit: type=1400 audit(1518606680.825:8277): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-container-default-cgns"
name="/" pid=25179 comm="(pachectl)" flags="rw, rslave"
Feb 14 06:28:21 imap kernel: [2122112.359857] http invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null),  order=0, oom_score_adj=0
Feb 14 06:28:21 imap kernel: [2122112.359884]  oom_kill_process+0x208/0x410
Feb 14 06:28:21 imap kernel: [2122112.359896]  do_page_fault+0x22/0x30
Feb 14 06:28:21 imap kernel: [2122112.359907] R13: cccccccccccccccd R14: 00007ffdcc5c2680 R15: 0000560c701c76f0
Feb 14 06:28:21 imap kernel: [2122112.360408] [10297]     0 10297    64669       70      28       4      111             0 rsyslogd
Feb 14 06:28:21 imap kernel: [2122112.360420] [10707]   110 10707    76901      132      52       3     1059             0 colord
Feb 14 06:28:21 imap kernel: [2122112.360490] [12510]   112 12510    18422       67      38       3      139             0 zabbix_agentd
Feb 14 06:28:21 imap kernel: [2122112.360551] [15430]     7 15430    49147    20740     100       3    12548             0 dbus
Feb 14 06:28:21 imap kernel: [2122112.360610] [17001]   100 17001     9559      135      24       3        0             0 pickup
Feb 14 06:28:21 imap kernel: [2122112.360657] [23472]     0 23472     8287      100      21       3        0             0 http
Feb 14 06:28:21 imap kernel: [2122112.360668] [23513]     7 23513    27475     2360      58       3        0             0 gs
Feb 14 09:09:40 imap kernel: [2131791.498624] audit: type=1400 audit(1518617380.310:8325): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=29981 comm="(ionclean)"
family="unix" sock_type="dgram" protocol=0 addr=none

that is grepping kernel.log inside the lxc .
those logs seem to be recording log lines from the pve host.