Proxmox 3.0 Cluster corosync running system out of memory

Odie1

New Member
Jun 24, 2013
Hello All,


We've been using Proxmox in standalone mode for a while and recently moved to a clustered setup, following various guides on the website. The setup went without any problems and the unified management looks good. After the move, however, we started hitting out-of-memory (OOM) conditions, with the OOM killer randomly taking down all the servers. After some digging we isolated the issue to corosync running the system out of memory. Searching the corosync lists and newsgroups turned up nothing, so I'm curious whether anyone here has tips or suggestions for dealing with this issue.


Because of the OOM problems, I've turned off corosync on all but 3 nodes so we can continue investigating.


Distro: Wheezy


Memory consumption after roughly 12 hours of uptime (ps output, VSZ and RSS in KiB):
pmox1:
USER PID %CPU %MEM VSZ RSS STAT ELAPSED COMMAND
root 218488 0.2 81.2 3412584 3273512 S<Lsl 11:34:40 corosync -f


pmox2:
USER PID %CPU %MEM VSZ RSS STAT ELAPSED COMMAND
root 204975 0.2 81.8 3437340 3298664 S<Lsl 11:39:36 corosync -f


pmox3:
USER PID %CPU %MEM VSZ RSS STAT ELAPSED COMMAND
root 358776 0.2 80.3 3435464 3296348 S<Lsl 11:38:54 corosync -f
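
For anyone who wants to put a number on the growth, here is a rough Python sketch that turns one of the ps lines above into an average growth rate. It assumes the column order shown in the header (RSS in the sixth column, ELAPSED in the eighth) and that corosync started out with negligible memory, so the result is only an average over the process lifetime. The function names are made up for illustration.

```python
def parse_elapsed(etime: str) -> int:
    """Convert a ps elapsed-time string ([[DD-]HH:]MM:SS) to seconds."""
    days = 0
    if "-" in etime:
        day_part, etime = etime.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in etime.split(":")]
    # Pad missing leading fields (e.g. "MM:SS") with zeros.
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return days * 86400 + hours * 3600 + minutes * 60 + seconds


def growth_kib_per_hour(ps_line: str) -> float:
    """Average RSS growth in KiB/hour, assuming RSS started near zero."""
    fields = ps_line.split()
    rss_kib = int(fields[5])            # RSS column
    elapsed_s = parse_elapsed(fields[7])  # ELAPSED column
    return rss_kib / elapsed_s * 3600


if __name__ == "__main__":
    line = "root 218488 0.2 81.2 3412584 3273512 S<Lsl 11:34:40 corosync -f"
    rate = growth_kib_per_hour(line)
    print(f"{rate / 1024:.0f} MiB/hour on average")
```

For the pmox1 line this works out to something on the order of 275 MiB per hour, which fits the boxes running out of memory in well under a day.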






Proxmox Version:
pve-manager: 3.0-23 (pve-manager/3.0/957f0862)
running kernel: 2.6.32-20-pve
proxmox-ve-2.6.32: 3.0-100
pve-kernel-2.6.32-20-pve: 2.6.32-100
lvm2: 2.02.95-pve3
clvm: 2.02.95-pve3
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-1
pve-cluster: 3.0-4
qemu-server: 3.0-20
pve-firmware: 1.0-22
libpve-common-perl: 3.0-4
libpve-access-control: 3.0-4
libpve-storage-perl: 3.0-8
vncterm: 1.1-4
vzctl: 4.0-1pve3
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-13
ksm-control-daemon: 1.1-1






cluster.conf:
<?xml version="1.0"?>
<cluster config_version="18" name="clrdev">
<cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>
<clusternodes>
<clusternode name="int-proxmox2" nodeid="1" votes="1"/>
<clusternode name="int-proxmox1" nodeid="2" votes="1"/>
<clusternode name="proxmox4" nodeid="3" votes="1"/>
<clusternode name="proxmox3" nodeid="4" votes="1"/>
<clusternode name="proxmox7" nodeid="5" votes="1"/>
<clusternode name="proxmox6" nodeid="6" votes="1"/>
</clusternodes>
<rm/>
</cluster>
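
One side note on the config above, since corosync is currently stopped on all but 3 nodes: a rough Python sketch of the vote arithmetic. It assumes cman's usual majority rule (quorum = total votes // 2 + 1), and the set of nodes still running is a hypothetical pick, since the hostnames in the ps output do not match the names in cluster.conf.

```python
import xml.etree.ElementTree as ET

CLUSTER_CONF = """<?xml version="1.0"?>
<cluster config_version="18" name="clrdev">
<cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>
<clusternodes>
<clusternode name="int-proxmox2" nodeid="1" votes="1"/>
<clusternode name="int-proxmox1" nodeid="2" votes="1"/>
<clusternode name="proxmox4" nodeid="3" votes="1"/>
<clusternode name="proxmox3" nodeid="4" votes="1"/>
<clusternode name="proxmox7" nodeid="5" votes="1"/>
<clusternode name="proxmox6" nodeid="6" votes="1"/>
</clusternodes>
<rm/>
</cluster>
"""


def vote_summary(xml_text, running):
    """Return (total_votes, quorum, votes_present) for the given config.

    Assumes a simple majority rule: quorum = total_votes // 2 + 1.
    """
    root = ET.fromstring(xml_text)
    votes = {
        node.get("name"): int(node.get("votes", "1"))
        for node in root.iter("clusternode")
    }
    total = sum(votes.values())
    quorum = total // 2 + 1
    present = sum(v for name, v in votes.items() if name in running)
    return total, quorum, present


if __name__ == "__main__":
    # Hypothetical: which three nodes still run corosync is not stated.
    running = {"int-proxmox1", "int-proxmox2", "proxmox3"}
    total, quorum, present = vote_summary(CLUSTER_CONF, running)
    print(f"total={total} quorum={quorum} present={present}")
```

If that majority rule applies here, 3 running nodes out of 6 votes would sit below quorum, which may be worth keeping in mind while reading the logs.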




Partial cluster/daemon.log
http://pastebin.com/zf13srf5

Thanks,

Omar
 
No double posts, please (some posts, especially from new users, are moderated).
 
