(Very) weird RAM problem...

dosicalu

Oct 6, 2011
Hi,

We have a big problem with our Proxmox (1.8) installation.

We have (only) 8 active VMs on a server with 48 GB of RAM.

The OOM killer on the Proxmox host kills many processes, especially on one of the VMs (our LDAP server for 7300 users on the campus!).

Selected lines from grep:

Oct 6 10:47:37 proxmox1 kernel: OOM killed process apache2 (pid=18258, ve=0) exited, free=988536 gen=310.
Oct 6 10:47:37 proxmox1 kernel: OOM killed process apache2 (pid=18103, ve=6989) exited, free=989032 gen=311.
Oct 6 14:18:46 proxmox1 kernel: OOM killed process init-logger (pid=20237, ve=6939) exited, free=828057 gen=312.
Oct 6 14:18:46 proxmox1 kernel: OOM killed process portmap (pid=20424, ve=6939) exited, free=828026 gen=313.
Oct 6 14:18:47 proxmox1 kernel: OOM killed process rpc.statd (pid=20430, ve=6939) exited, free=827906 gen=314.
Oct 6 14:18:47 proxmox1 kernel: OOM killed process rsyslogd (pid=20538, ve=6939) exited, free=827906 gen=315.
Oct 6 14:18:48 proxmox1 kernel: OOM killed process slapd (pid=20588, ve=6939) exited, free=836371 gen=316.
Oct 6 14:18:48 proxmox1 kernel: OOM killed process apache2 (pid=31049, ve=0) exited, free=836743 gen=317.
Oct 6 14:18:48 proxmox1 kernel: OOM killed process apache2 (pid=30900, ve=6989) exited, free=840556 gen=318.
Oct 6 16:07:05 proxmox1 kernel: OOM killed process slapd (pid=2435, ve=6939) exited, free=842914 gen=319.
Oct 6 16:07:05 proxmox1 kernel: OOM killed process apache2 (pid=2698, ve=0) exited, free=846766 gen=320.
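To see which container is hit hardest, the kills can be grouped by the ve= field. The snippet below is only a rough sketch: it assumes the log lines keep exactly the format shown above, and the names OOM_RE and count_oom_kills are made up for this illustration. ve=0 is the host itself.

```python
import re
from collections import Counter

# Pattern for the "OOM killed process NAME (pid=N, ve=N)" part of a log line.
# Assumption: the kernel log format matches the excerpt quoted in this post.
OOM_RE = re.compile(r"OOM killed process (\S+) \(pid=(\d+), ve=(\d+)\)")


def count_oom_kills(lines):
    """Return a Counter mapping container ID (ve) to its number of OOM kills."""
    kills = Counter()
    for line in lines:
        match = OOM_RE.search(line)
        if match:
            kills[int(match.group(3))] += 1
    return kills


# A few of the lines from the excerpt above, used as sample input.
SAMPLE = """\
Oct 6 10:47:37 proxmox1 kernel: OOM killed process apache2 (pid=18258, ve=0) exited, free=988536 gen=310.
Oct 6 10:47:37 proxmox1 kernel: OOM killed process apache2 (pid=18103, ve=6989) exited, free=989032 gen=311.
Oct 6 14:18:48 proxmox1 kernel: OOM killed process slapd (pid=20588, ve=6939) exited, free=836371 gen=316.
Oct 6 16:07:05 proxmox1 kernel: OOM killed process slapd (pid=2435, ve=6939) exited, free=842914 gen=319.
"""

if __name__ == "__main__":
    for ve, count in count_oom_kills(SAMPLE.splitlines()).most_common():
        print(f"ve={ve}: {count} kill(s)")
```

On a real host the input would come from the kernel log file rather than from SAMPLE.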

When we look at the web UI, the memory used by the "biggest" VM is only 500 MB, so it seems we don't have a memory problem.

But on the Proxmox host, the output of "cat /proc/meminfo" is:

proxmox1:/var/log# cat /proc/meminfo
MemTotal: 49357864 kB
MemFree: 3397444 kB
Buffers: 4156648 kB
Cached: 39305536 kB
SwapCached: 0 kB
Active: 19845908 kB
Inactive: 24096256 kB
Active(anon): 342624 kB
Inactive(anon): 140004 kB
Active(file): 19503284 kB
Inactive(file): 23956252 kB
Unevictable: 2584 kB
Mlocked: 2584 kB
SwapTotal: 49283064 kB
SwapFree: 49283064 kB
Dirty: 148 kB
Writeback: 0 kB
AnonPages: 481696 kB
Mapped: 87624 kB
Shmem: 1980 kB
Slab: 1674304 kB
SReclaimable: 1552236 kB
SUnreclaim: 122068 kB
KernelStack: 3920 kB
PageTables: 26096 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 73961996 kB
Committed_AS: 4 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 551572 kB
VmallocChunk: 34333259272 kB
HardwareCorrupted: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 4580 kB
DirectMap2M: 2082816 kB
DirectMap1G: 48234496 kB

We wonder about the values in bold!
What does the 39 GB on the "Cached" line mean? Is that a usual value?
We have only 3 GB of free memory! Weird, isn't it?

If the problem were RAM-related, shouldn't the swap be in use?
But:

proxmox1:/var/log# free
             total       used       free     shared    buffers     cached
Mem:      49357864   45976724    3381140          0    4156648   39305888
-/+ buffers/cache:    2514188   46843676
Swap:     49283064          0   49283064
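For reference, the "-/+ buffers/cache" line is plain arithmetic on the "Mem:" line: buffers and cache are subtracted from "used" and added to "free". A small sketch with the numbers from the output above (the function names are invented for this example):

```python
def used_without_cache(used_kb, buffers_kb, cached_kb):
    """Memory used by processes, not counting buffers and page cache."""
    return used_kb - buffers_kb - cached_kb


def free_with_cache(free_kb, buffers_kb, cached_kb):
    """Free memory plus the buffers and page cache the kernel can reclaim."""
    return free_kb + buffers_kb + cached_kb


if __name__ == "__main__":
    # Values in kB, copied from the "Mem:" line of the free output above.
    used, free, buffers, cached = 45976724, 3381140, 4156648, 39305888

    really_used = used_without_cache(used, buffers, cached)
    really_free = free_with_cache(free, buffers, cached)

    print(f"used without cache: {really_used} kB ({really_used / 1024**2:.1f} GiB)")
    print(f"free with cache:    {really_free} kB ({really_free / 1024**2:.1f} GiB)")
```

Both results match the "-/+ buffers/cache" line (2514188 and 46843676), so only about 2.5 GB is actually held by processes on the host; the rest of "used" is cache.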

Any ideas... please... ? :(
 
Do you run KVM or OpenVZ? If you run OpenVZ, check the failcounts (user beancounters).
 
We run OpenVZ VMs, with 4 GB of RAM each.
Here are the failcounts:
proxmox1:/var/log# cat /proc/user_beancounters | grep -v 0$
Version: 2.5
   uid  resource          held    maxheld              barrier                limit     failcnt
 6973:  kmemsize       5362180   14376974             14372700             14790164       28338
        oomguarpages     15160      17850              3145728  9223372036854775807           5
 6994:  kmemsize       7546159   14372536             14372700             14790164          16
        oomguarpages     21012      31325                26112  9223372036854775807           5
 6989:  kmemsize       8420326   18881210             18874368             20971520       15276
        oomguarpages     11901     564475              4194304  9223372036854775807         194
        tcpsndbuf       286504    4203632              4194304              5242880  2221635258
        oomguarpages     18377     562178              3145728  9223372036854775807          38
        oomguarpages     26779     145714  9223372036854775807  9223372036854775807          81
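One caveat with "grep -v 0$": it drops every line that ends in 0, including the line that carries the container ID whenever that line's failcnt ends in 0. That is probably why the last two oomguarpages rows above cannot be matched to a container. A possible alternative, sketched here under the assumption that the file keeps the "uid: resource held maxheld barrier limit failcnt" layout shown above, is to parse the whole file and remember the current uid (failed_counters is a made-up name):

```python
def failed_counters(text):
    """Return (uid, resource, failcnt) tuples for every row with failcnt > 0.

    Assumes the layout of /proc/user_beancounters as quoted in this thread:
    the uid appears only on the first resource row of each container.
    """
    results = []
    uid = None
    for line in text.splitlines():
        fields = line.split()
        # A row starting with "<digits>:" opens a new container section.
        if fields and fields[0].endswith(":") and fields[0][:-1].isdigit():
            uid = int(fields[0][:-1])
            fields = fields[1:]
        # Data rows have exactly six fields ending in a numeric failcnt;
        # this skips the "Version:" line and the column header.
        if len(fields) != 6 or not fields[-1].isdigit():
            continue
        failcnt = int(fields[5])
        if failcnt > 0:
            results.append((uid, fields[0], failcnt))
    return results


# Rows copied from the output above (two containers only).
SAMPLE = """\
Version: 2.5
uid resource held maxheld barrier limit failcnt
6973: kmemsize 5362180 14376974 14372700 14790164 28338
oomguarpages 15160 17850 3145728 9223372036854775807 5
6994: kmemsize 7546159 14372536 14372700 14790164 16
oomguarpages 21012 31325 26112 9223372036854775807 5
"""

if __name__ == "__main__":
    for uid, resource, failcnt in failed_counters(SAMPLE):
        print(f"{uid}: {resource} failcnt={failcnt}")
```

On the host, the text would be read from /proc/user_beancounters (as root) instead of SAMPLE.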
 
The OOM killer kills processes inside your containers, so you need to assign more RAM to those containers.
 
OK, we set 8 GB instead of 4 GB, but we don't understand:
- why the web UI shows that the VM only uses 200 MB;
- why the value of MemFree remains the same after adding 4 GB;
- why the value of Cached is so big.

Thanks.
 
- why the web UI shows that the VM only uses 200 MB;
- why the value of MemFree remains the same after adding 4 GB;

Maybe it is just a short spike.

- why the value of Cached is so big.

Linux always tries to use all available memory for cache/buffers; that is not the problem.
 
With 8 GB the VM is still alive!

I did what you suggested: I added the 4 GB of RAM, and the result shown in the Proxmox GUI under Memory/Swap seems fine (swap + RAM).

ram+swap.png

I then used vzsplit (with -n 12; the host has 48 GB of RAM in total) to determine the right size for the swap. This is the result I get in the Proxmox GUI after applying these parameters:

strange.png

I don't understand the result: the RAM allocated is very low (the size of the swap looks correct). What looks incredible is the total memory available (29 GB!).

Thanks.