Hi. I set up what I thought was a fair system design, but it is proving to be very dumb. I have two Proxmox servers.
One is the main server and the other is the backup. The backup server only stays on for about three hours at night to rsync the first server.
These two servers are in a cluster, but the cluster is not functional at all. I originally did this to take advantage of centralized management of the virtual machines and of transferring machines with one click in the web UI.
However, the fact that the second server is not always on destroys the stability of the cluster. When I want to use that functionality, I need to restart the cman service and the pve-cluster service on both nodes.
That is not the real problem, though. Recently server 1, which stays on all day, began to fail approximately every two weeks because of corosync, which has a memory leak and at some point uses nearly 75% of the memory. At this point the users notice that the server is slow, they notify me, and I restart the corosync process. Everything is back to normal.
I'm on my way to moving away from this dumb cluster setup, but in the meantime... Corosync shouldn't behave like that, even under these circumstances.
Any idea how to debug the issue?
Proxmox VE on Debian Wheezy. Everything is up to date.
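
For reference, here is a minimal sketch of how the memory growth could be logged while debugging, so there is a timeline to correlate with the backup node going on and off. It assumes Python is installed on the node, that the process is named `corosync`, and that `pidof` is available. The function names and the 5-minute interval are illustrative choices, not part of Proxmox or corosync.

```python
#!/usr/bin/env python
"""Log the resident memory (VmRSS) of corosync at a fixed interval (sketch)."""
import re
import subprocess
import sys
import time


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def find_pid(name):
    """Return the first PID reported by `pidof name`, or None if not running."""
    try:
        out = subprocess.check_output(["pidof", name])
    except (subprocess.CalledProcessError, OSError):
        return None
    pids = out.decode().split()
    return int(pids[0]) if pids else None


def sample_rss_kb(name="corosync"):
    """Return the current RSS of the named process in kB, or None."""
    pid = find_pid(name)
    if pid is None:
        return None
    try:
        with open("/proc/%d/status" % pid) as fh:
            return parse_vmrss_kb(fh.read())
    except (IOError, OSError):
        # The process exited between the pidof call and the read.
        return None


if __name__ == "__main__":
    # Assumed interval: one sample every 5 minutes.
    while True:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print("%s corosync_rss_kb=%s" % (stamp, sample_rss_kb()))
        sys.stdout.flush()
        time.sleep(300)
```

Redirecting the output to a file and leaving it running for a few days should show whether the growth is steady or whether it jumps at particular times, such as when the second node joins or leaves.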