rgmanager running per cli but not pve

We've not seen the old issue.

rgmanager is not installed as a package on our cluster:
Code:
Package: rgmanager                       
New: yes
State: not installed
Version: 3.0.12-2
Priority: optional
Section: admin
Maintainer: Debian HA Maintainers <debian-ha-maintainers@lists.alioth.debian.org>
Uncompressed Size: 975 k
Depends: libc6 (>= 2.3.2), libccs3 (>= 3.0.12), libcman3 (>= 3.0.12), libdlm3 (>= 3.0.12), libldap-2.4-2 (>= 2.4.7),
         liblogthread3 (>= 3.0.12), libncurses5 (>= 5.7+20100313), libslang2 (>= 2.0.7-1), libxml2 (>= 2.7.4), cman (=
         3.0.12-2), iproute, iputils-arping, iputils-ping, nfs-kernel-server, nfs-common, perl, gawk, net-tools
Conflicts: nfs-user-server
Description: Red Hat cluster suite - clustered resource group manager
 This package is part of the Red Hat Cluster Suite, a complete high-availability solution. 
 
 Resource Group Manager provides high availability of critical server applications in the event of planned or
 unplanned system downtime.

yet it is running:
Code:
fbc241  ~ # ps -ef|grep rgmanager
root        2800       1  0 Aug18 ?        00:00:00 rgmanager
root        2802    2800  0 Aug18 ?        00:23:10 rgmanager
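
For reference: aptitude above is only describing the stock Debian rgmanager package; the binary that is actually running is presumably shipped by one of the Proxmox cluster packages (redhat-cluster-pve appears in the pveversion output later in this thread). A quick sketch of how to confirm which installed package owns it:
Code:
# pid 2800 is the rgmanager parent process from the ps output above
readlink -f /proc/2800/exe
dpkg -S "$(readlink -f /proc/2800/exe)"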
 
Maybe you can test again?

Bug 105 is not fixed; the issue persists.

Before breaking the network:
Code:
# clustat
Cluster Status for Inhouse @ Wed Sep 19 09:46:34 2012
Member Status: Quorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 vm1                                                                 1 Online, Local, rgmanager
 vm2                                                                 2 Online, rgmanager
 vm3                                                                 3 Online, rgmanager

 Service Name                                                     Owner (Last)                                                     State         
 ------- ----                                                     ----- ------                                                     -----         
 service:masterIP                                                 vm1                                                              started

After all nodes have been removed from the network:
Code:
# clustat
Timed out waiting for a response from Resource Group Manager
Cluster Status for Inhouse @ Wed Sep 19 09:50:15 2012
Member Status: Inquorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 vm1                                                                 1 Online, Local
 vm2                                                                 2 Offline
 vm3                                                                 3 Offline

Restore network communications; note that rgmanager is no longer shown:
Code:
# clustat
Timed out waiting for a response from Resource Group Manager
Cluster Status for Inhouse @ Wed Sep 19 09:52:07 2012
Member Status: Quorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 vm1                                                                 1 Online, Local
 vm2                                                                 2 Online
 vm3                                                                 3 Online

rgmanager is still running:
Code:
# ps ax|grep rgmanager
   1965 ?        S<Ls   0:00 rgmanager
   1967 ?        S<l    0:00 rgmanager

Some nodes log a kernel message after a period of time:
Code:
INFO: task rgmanager:3552 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
rgmanager     D ffff88043602d380     0  3552   1778    0 0x00000000
 ffff880433ea7c70 0000000000000086 00000000006315a0 ffff880433ea7bf8
 ffffffff8104976e ffff880433ea7c18 ffff8804335d73a0 00000000006315a0
 0000000000000001 ffff88043602d920 ffff880433ea7fd8 ffff880433ea7fd8
Call Trace:
 [<ffffffff8104976e>] ? flush_tlb_page+0x5e/0xa0
 [<ffffffff81155eae>] ? do_wp_page+0x4fe/0x9c0
 [<ffffffff81527cf5>] rwsem_down_failed_common+0x95/0x1d0
 [<ffffffff81527e86>] rwsem_down_read_failed+0x26/0x30
 [<ffffffff8104f02d>] ? check_preempt_curr+0x6d/0x90
 [<ffffffff8127c4f4>] call_rwsem_down_read_failed+0x14/0x30
 [<ffffffff81527374>] ? down_read+0x24/0x30
 [<ffffffff8100bd0e>] ? invalidate_interrupt0+0xe/0x20
 [<ffffffffa04775f7>] dlm_user_request+0x47/0x240 [dlm]
 [<ffffffff81180a19>] ? __kmalloc+0xf9/0x270
 [<ffffffff81180b4f>] ? __kmalloc+0x22f/0x270
 [<ffffffffa0484ed9>] device_write+0x5f9/0x7d0 [dlm]
 [<ffffffff81194b78>] vfs_write+0xb8/0x1a0
 [<ffffffff81195591>] sys_write+0x51/0x90
 [<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
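
The backtrace ends in dlm_user_request, i.e. rgmanager is blocked in a write to the DLM user device. A rough sketch of what could be inspected at that point (standard cluster3 tools; these are suggestions, not output from this thread):
Code:
dlm_tool ls              # list DLM lockspaces (rgmanager uses a lockspace named "rgmanager")
fence_tool ls            # fence domain state; a pending fence operation blocks DLM recovery
cat /proc/3552/stack     # kernel stack of the blocked task (pid taken from the message above)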

The only way to get rgmanager working again is to restart each node.
rgmanager cannot be stopped, restarted, or killed once it gets into this state.

Very simple to reproduce: just turn off the network switch that carries the cluster traffic, wait to lose quorum, and turn it back on.
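
A rough command-level sketch of the reproduction, assuming the 3-node setup from the clustat output above and the standard init script names (the symptoms are the ones described in this post):
Code:
clustat                        # all members Online with the rgmanager flag
# power off the switch carrying the cluster traffic and wait for quorum loss:
clustat                        # Member Status: Inquorate
# power the switch back on and wait for membership to reform:
cman_tool status               # cluster is quorate again
clustat                        # times out, rgmanager flag missing on every member
# attempts to recover rgmanager on an affected node hang:
/etc/init.d/rgmanager stop     # never returns
kill -9 <rgmanager-pid>        # process stays blocked in the kernel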
 
Very simple to reproduce: just turn off the network switch that carries the cluster traffic, wait to lose quorum, and turn it back on.

Besides, the error scenario is strange. An HA cluster works as long as at least one partition has quorum. If you lose quorum on all nodes you are likely to run into many problems, and most of the time you need to manually recover the cluster. That being said, you should avoid losing quorum on all nodes (use a redundant network).
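
For reference, a minimal sketch of a redundant cluster link on a PVE 2.x node with two NICs going to two switches (interface names and addresses are examples; see the Proxmox wiki for the full procedure):
Code:
# /etc/network/interfaces (fragment)
auto bond0
iface bond0 inet manual
        slaves eth0 eth1
        bond_miimon 100
        bond_mode active-backup

auto vmbr0
iface vmbr0 inet static
        address  192.168.10.11
        netmask  255.255.255.0
        gateway  192.168.10.1
        bridge_ports bond0
        bridge_stp off
        bridge_fd 0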
 
I agree that a redundant network is desirable.

If you lose quorum on all nodes it should not cause a problem. Nothing should change; no quorum = no changes.

So if nothing changes when losing quorum everywhere, why is it impossible to regain quorum and continue working?

When communications are restored, quorum is regained.
The only issue is that rgmanager deadlocks; it is even impossible to restart rgmanager manually.
How can I manually recover when the rgmanager daemon will not let me stop it?

Having to reboot EVERY node in the cluster to recover from this issue is very disruptive and not much of a solution.
 
Quorum is needed to decide which nodes get fenced.
With no nodes having quorum, nothing gets fenced.

Are you talking about the very special case where you lose quorum at 'exactly' the same time (within a few ms) on all nodes?
Besides, rgmanager stops all services when it loses quorum (see 'man rgmanager').
 
Are you talking about the very special case where you lose quorum at 'exactly' the same time (within a few ms) on all nodes?
Besides, rgmanager stops all services when it loses quorum (see 'man rgmanager').

In this instance rgmanager does not seem to stop anything; it locks up instead.
If it is supposed to stop services when quorum is lost, then that needs to be fixed; right now it just locks up and does not stop anything.

Also, it is not necessary to turn off the switch; that is just a simple way to trigger the issue.
Start pulling network cables one by one until you lose quorum.

Pulling two network cables out of a three-node cluster will result in the same lock-up condition; I know because I have done this.
I do not know if the same issue happens with two cables pulled from a 4-node cluster, or three from a 5-node cluster. Someone else will need to test this.
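
For reference, assuming one vote per node and the default cman quorum calculation, every one of those scenarios ends with all partitions inquorate (a rough sketch, not from this thread):
Code:
# quorum threshold = floor(expected_votes / 2) + 1
#   3 nodes: quorum = 2 -> two nodes isolated: largest partition has 1 vote,  all inquorate
#   4 nodes: quorum = 3 -> two nodes isolated: largest partition has 2 votes, all inquorate
#   5 nodes: quorum = 3 -> three nodes isolated: largest partition has 2 votes, all inquorate
# the live values can be checked with:
cman_tool status | grep -iE 'votes|quorum'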

I agree that redundant network switches/connections *should* be used; I already have multiple clusters set up this way.
But with small clusters and small budgets, people will use a single switch.
Someone will accidentally unplug the switch some day, or bring the network down while editing VLANs; people make mistakes and crap happens.

It would be beneficial to gracefully recover from this situation rather than having to restart every single node, disrupting every VM running in the entire cluster.
 
It would be beneficial to gracefully recover from this situation rather than having to restart every single node, disrupting every VM running in the entire cluster.

I will try to debug that when I have some spare time.
 
Hi all,

Exactly the same problem here.

Switches turned off.

rgmanager not killable, not stoppable.

root@px1:~# pveversion -v
pve-manager: 2.2-26 (pve-manager/2.2/c1614c8c)
running kernel: 2.6.32-16-pve
proxmox-ve-2.6.32: 2.2-80
pve-kernel-2.6.32-16-pve: 2.6.32-80
pve-kernel-2.6.32-14-pve: 2.6.32-74
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-1
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-28
qemu-server: 2.0-64
pve-firmware: 1.0-21
libpve-common-perl: 1.0-37
libpve-access-control: 1.0-25
libpve-storage-perl: 2.0-34
vncterm: 1.0-3
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.2-7
ksm-control-daemon: 1.1-1


Christophe.
 
Of course.
Each and every device is redundant: bonding, RAID 5 & BBU + spare, switches, redundant power supplies on the servers and the SAN, and so on.
But both switches are behind the same UPS: bad luck!
And rgmanager is dead: I hope it is not by design!

Christophe.
 
But both switches are behind the same UPS: bad luck!
And rgmanager is dead: I hope it is not by design!

Again, you need to make your network redundant.

If you lose quorum on all nodes, you need to manually restart the cluster.
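
For what it is worth, a sketch of what "manually restart the cluster" usually amounts to on a PVE 2.x node, assuming the standard init scripts (as reported above, the rgmanager stop may hang in this particular state, in which case rebooting the node remains the only option):
Code:
/etc/init.d/rgmanager stop     # stop the resource group manager first
/etc/init.d/cman stop          # then the cluster membership layer
/etc/init.d/cman start
/etc/init.d/rgmanager start
clustat                        # verify members and the rgmanager flag again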
 
