I am having stability issues with LXC containers after migrating from OpenVZ.
When a container has used all of its memory, the OOM killer kicks in and kills processes (see the example below).
If I try to restart a killed process, it usually fails. I have to reboot the container, as if there were a memory leak somewhere and the memory were never freed.
This is a huge issue.
Code:
[1878424.274466] df_inode invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
[1878424.274471] df_inode cpuset=15004 mems_allowed=0-1
[1878424.274477] CPU: 6 PID: 15634 Comm: df_inode Tainted: P O 4.2.6-1-pve #1
[1878424.274479] Hardware name: Supermicro X8DTT-H/X8DTT-H, BIOS 2.1b 10/28/2011
[1878424.274481] 0000000000000000 000000001156d152 ffff880060a2fcb8 ffffffff818013d8
[1878424.274483] 0000000000000000 ffff8806c18c5780 ffff880060a2fd38 ffffffff817ff8fa
[1878424.274485] ffff880328ce4378 ffff8806e685bc30 0000000000000015 0000000000000000
[1878424.274488] Call Trace:
[1878424.274498] [<ffffffff818013d8>] dump_stack+0x45/0x57
[1878424.274500] [<ffffffff817ff8fa>] dump_header+0xaf/0x238
[1878424.274505] [<ffffffff81185713>] oom_kill_process+0x1e3/0x3c0
[1878424.274509] [<ffffffff811f09f1>] mem_cgroup_oom_synchronize+0x531/0x600
[1878424.274515] [<ffffffff811ecbe0>] ? mem_cgroup_css_online+0x250/0x250
[1878424.274517] [<ffffffff81185ee3>] pagefault_out_of_memory+0x13/0x80
[1878424.274522] [<ffffffff8106735f>] mm_fault_error+0x7f/0x160
[1878424.274524] [<ffffffff81067823>] __do_page_fault+0x3e3/0x410
[1878424.274527] [<ffffffff81067872>] do_page_fault+0x22/0x30
[1878424.274530] [<ffffffff8180a2c8>] page_fault+0x28/0x30
[1878424.274532] Task in /lxc/15004 killed as a result of limit of /lxc/15004
[1878424.274536] memory: usage 655360kB, limit 655360kB, failcnt 28963
[1878424.274537] memory+swap: usage 917424kB, limit 917504kB, failcnt 646
[1878424.274538] kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
[1878424.274539] Memory cgroup stats for /lxc/15004: cache:568488KB rss:86872KB rss_huge:0KB mapped_file:24756KB dirty:0KB writeback:0KB swap:262064KB inactive_anon:354640KB active_anon:300608KB inactive_file:0KB active_file:0KB unevictable:0KB
[1878424.274551] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
[1878424.274724] [13751] 0 13751 1311 704 5 2 61 0 systemd
[1878424.274726] [14338] 0 14338 2024 746 7 2 89 -1000 sshd
[1878424.274728] [14371] 0 14371 600 223 5 2 23 0 inetd
[1878424.274730] [14388] 0 14388 852 352 5 2 39 0 cron
[1878424.274732] [14421] 0 14421 940 437 5 2 33 0 systemd-logind
[1878424.274734] [14486] 0 14486 3790 481 7 2 57 0 monit
[1878424.274736] [14488] 105 14488 1309 427 6 2 75 -900 dbus-daemon
[1878424.274738] [14505] 0 14505 8003 594 11 2 87 0 rsyslogd
[1878424.274740] [14573] 0 14573 606 283 5 2 2 0 agetty
[1878424.274742] [14578] 0 14578 606 300 5 2 1 0 agetty
[1878424.274744] [14583] 0 14583 606 297 5 2 1 0 agetty
[1878424.274746] [14596] 0 14596 3041 1966 11 2 404 0 munin-node
[1878424.274794] [ 8519] 0 8519 13326 10593 28 2 0 0 systemd-journal
[1878424.274797] [10977] 104 10977 61849 19992 72 2 0 0 named
[1878424.274866] [15564] 0 15564 3041 1894 11 2 390 0 /usr/sbin/munin
[1878424.274871] [15634] 65534 15634 1487 990 7 2 0 0 df_inode
[1878424.274873] Memory cgroup out of memory: Kill process 10977 (named) score 87 or sacrifice child
[1878424.274954] Killed process 10977 (named) total-vm:247396kB, anon-rss:75876kB, file-rss:4092kB
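
To make the "Memory cgroup stats" line above easier to read, here is a minimal sketch that splits it into named counters and adds up what is charged against the limit. The helper name `parse_cgroup_stats` and the constant `STATS_LINE` are my own, made up for illustration; they are not part of any tool. The only input is the stats line copied from the log, with the timestamp dropped.

```python
import re

# The "Memory cgroup stats" line copied from the log above (timestamp dropped).
STATS_LINE = (
    "Memory cgroup stats for /lxc/15004: cache:568488KB rss:86872KB "
    "rss_huge:0KB mapped_file:24756KB dirty:0KB writeback:0KB swap:262064KB "
    "inactive_anon:354640KB active_anon:300608KB inactive_file:0KB "
    "active_file:0KB unevictable:0KB"
)


def parse_cgroup_stats(line):
    """Return {counter_name: kilobytes} for every name:<number>KB pair in the line."""
    return {name: int(kb) for name, kb in re.findall(r"(\w+):(\d+)KB", line)}


stats = parse_cgroup_stats(STATS_LINE)

# Memory charged to the cgroup as page cache plus anonymous RSS, in KB.
charged_kb = stats["cache"] + stats["rss"]
# Pages sitting on the file LRU lists, in KB.
file_lru_kb = stats["inactive_file"] + stats["active_file"]

print("cache + rss :", charged_kb, "KB")
print("cache alone :", stats["cache"], "KB")
print("file LRU    :", file_lru_kb, "KB")
```

With the numbers from this log, cache plus rss comes to 655360 KB, which is exactly the reported limit, and most of that is accounted as cache (568488 KB) rather than rss (86872 KB), while both file LRU counters are 0 KB.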