I am having stability issues with LXC containers after migration from OpenVZ
When a container has used all of its memory, the OOM killer kicks in and kills processes (see the example log below).
If I try to restart a killed process, it usually fails. I have to reboot the container to recover, as if there were a memory leak somewhere and the memory were never freed.
This is a huge issue.

Code:

[1878424.274466] df_inode invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
[1878424.274471] df_inode cpuset=15004 mems_allowed=0-1
[1878424.274477] CPU: 6 PID: 15634 Comm: df_inode Tainted: P           O    4.2.6-1-pve #1
[1878424.274479] Hardware name: Supermicro X8DTT-H/X8DTT-H, BIOS 2.1b       10/28/2011
[1878424.274481]  0000000000000000 000000001156d152 ffff880060a2fcb8 ffffffff818013d8
[1878424.274483]  0000000000000000 ffff8806c18c5780 ffff880060a2fd38 ffffffff817ff8fa
[1878424.274485]  ffff880328ce4378 ffff8806e685bc30 0000000000000015 0000000000000000
[1878424.274488] Call Trace:
[1878424.274498]  [<ffffffff818013d8>] dump_stack+0x45/0x57
[1878424.274500]  [<ffffffff817ff8fa>] dump_header+0xaf/0x238
[1878424.274505]  [<ffffffff81185713>] oom_kill_process+0x1e3/0x3c0
[1878424.274509]  [<ffffffff811f09f1>] mem_cgroup_oom_synchronize+0x531/0x600
[1878424.274515]  [<ffffffff811ecbe0>] ? mem_cgroup_css_online+0x250/0x250
[1878424.274517]  [<ffffffff81185ee3>] pagefault_out_of_memory+0x13/0x80
[1878424.274522]  [<ffffffff8106735f>] mm_fault_error+0x7f/0x160
[1878424.274524]  [<ffffffff81067823>] __do_page_fault+0x3e3/0x410
[1878424.274527]  [<ffffffff81067872>] do_page_fault+0x22/0x30
[1878424.274530]  [<ffffffff8180a2c8>] page_fault+0x28/0x30
[1878424.274532] Task in /lxc/15004 killed as a result of limit of /lxc/15004
[1878424.274536] memory: usage 655360kB, limit 655360kB, failcnt 28963
[1878424.274537] memory+swap: usage 917424kB, limit 917504kB, failcnt 646
[1878424.274538] kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
[1878424.274539] Memory cgroup stats for /lxc/15004: cache:568488KB rss:86872KB rss_huge:0KB mapped_file:24756KB dirty:0KB writeback:0KB swap:262064KB inactive_anon:354640KB active_anon:300608KB inactive_file:0KB active_file:0KB unevictable:0KB
[1878424.274551] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
[1878424.274724] [13751]     0 13751     1311      704       5       2       61             0 systemd
[1878424.274726] [14338]     0 14338     2024      746       7       2       89         -1000 sshd
[1878424.274728] [14371]     0 14371      600      223       5       2       23             0 inetd
[1878424.274730] [14388]     0 14388      852      352       5       2       39             0 cron
[1878424.274732] [14421]     0 14421      940      437       5       2       33             0 systemd-logind
[1878424.274734] [14486]     0 14486     3790      481       7       2       57             0 monit
[1878424.274736] [14488]   105 14488     1309      427       6       2       75          -900 dbus-daemon
[1878424.274738] [14505]     0 14505     8003      594      11       2       87             0 rsyslogd
[1878424.274740] [14573]     0 14573      606      283       5       2        2             0 agetty
[1878424.274742] [14578]     0 14578      606      300       5       2        1             0 agetty
[1878424.274744] [14583]     0 14583      606      297       5       2        1             0 agetty
[1878424.274746] [14596]     0 14596     3041     1966      11       2      404             0 munin-node
[1878424.274794] [ 8519]     0  8519    13326    10593      28       2        0             0 systemd-journal
[1878424.274797] [10977]   104 10977    61849    19992      72       2        0             0 named
[1878424.274866] [15564]     0 15564     3041     1894      11       2      390             0 /usr/sbin/munin
[1878424.274871] [15634] 65534 15634     1487      990       7       2        0             0 df_inode
[1878424.274873] Memory cgroup out of memory: Kill process 10977 (named) score 87 or sacrifice child
[1878424.274954] Killed process 10977 (named) total-vm:247396kB, anon-rss:75876kB, file-rss:4092kB
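
To make the report above easier to read, here is a minimal Python sketch that pulls the counters out of the `memory:`, `memory+swap:` and `Memory cgroup stats` lines and compares them. The helper names (`parse_usage`, `parse_cgroup_stats`, `summarize`) are hypothetical and only for illustration; the input strings are copied from the log above with the timestamps stripped. It only does arithmetic on the numbers already in the log and is not a diagnosis.

```python
import re

# Lines copied from the OOM report above (timestamps stripped).
USAGE_LINE = "memory: usage 655360kB, limit 655360kB, failcnt 28963"
MEMSW_LINE = "memory+swap: usage 917424kB, limit 917504kB, failcnt 646"
STATS_LINE = (
    "Memory cgroup stats for /lxc/15004: cache:568488KB rss:86872KB "
    "rss_huge:0KB mapped_file:24756KB dirty:0KB writeback:0KB swap:262064KB "
    "inactive_anon:354640KB active_anon:300608KB inactive_file:0KB "
    "active_file:0KB unevictable:0KB"
)


def parse_usage(line):
    """Return (usage_kb, limit_kb) from a 'usage ...kB, limit ...kB' line."""
    match = re.search(r"usage (\d+)kB, limit (\d+)kB", line)
    if match is None:
        raise ValueError(f"not a usage line: {line!r}")
    return int(match.group(1)), int(match.group(2))


def parse_cgroup_stats(line):
    """Return {counter_name: kB} from a 'Memory cgroup stats for ...' line."""
    _, _, tail = line.partition("Memory cgroup stats for ")
    # Skip the cgroup path; the counters follow the first ': '.
    _, _, counters = tail.partition(": ")
    return {name: int(kb) for name, kb in re.findall(r"(\w+):(\d+)KB", counters)}


def summarize(usage_line, memsw_line, stats_line):
    """Combine the three lines into a few derived figures (all in kB)."""
    usage, limit = parse_usage(usage_line)
    memsw_usage, memsw_limit = parse_usage(memsw_line)
    stats = parse_cgroup_stats(stats_line)
    return {
        "ram_headroom_kb": limit - usage,
        "memsw_headroom_kb": memsw_limit - memsw_usage,
        "cache_kb": stats["cache"],
        "rss_kb": stats["rss"],
        "swap_kb": stats["swap"],
        "anon_lru_kb": stats["inactive_anon"] + stats["active_anon"],
        "file_lru_kb": stats["inactive_file"] + stats["active_file"],
    }


if __name__ == "__main__":
    for key, value in summarize(USAGE_LINE, MEMSW_LINE, STATS_LINE).items():
        print(f"{key:>18}: {value}")
```

On this log the sketch shows that RAM is exactly at the limit, memory+swap has only 80 kB left, and `cache` (568488 kB) accounts for far more of the usage than `rss` (86872 kB), while both file LRU counters are 0. One possible reading, which is an assumption and would need checking inside the container, is that most of the charged memory is tmpfs or shared memory rather than ordinary file cache, since such pages are counted as cache but cannot simply be dropped under pressure.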