rgmanager running per cli but not pve

What is the output of

# fence_tool ls

I observe exactly that behavior when fencing is in progress and the fencing agent returns errors:

# grep fencing /var/log/syslog
 
Here are the results from 2 of the nodes:
Code:
fbc240 s009 ~ # fence_tool ls
fence domain
member count  4
victim count  0
victim now    0
master nodeid 1
wait state    none
members       1 3 4 5 

fbc240 s009 ~ # grep fencing /var/log/syslog 
fbc240 s009 ~ # 


fbc240 s009 ~ # fence_tool ls
fence domain
member count  4
victim count  0
victim now    0
master nodeid 1
wait state    none
members       1 3 4 5 

fbc240 s009 ~ # grep fencing /var/log/syslog 
fbc240 s009 ~ #
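
To compare this across nodes without reading each listing by eye, a small helper can check the two fields that indicate pending fencing. This is only a sketch: `fence_idle` is a made-up name, and it assumes the output has the same `victim count` and `wait state` lines as shown above.

```shell
#!/bin/sh
# Sketch only: reads `fence_tool ls` output on stdin and succeeds (exit 0)
# when "victim count" is 0 and "wait state" is "none".
# fence_idle is a hypothetical helper name, not part of any cluster tool.
fence_idle() {
    awk '
        BEGIN           { victims = -1; state = "" }
        /^victim count/ { victims = $3 + 0 }
        /^wait state/   { state = $3 }
        END             { exit !(victims == 0 && state == "none") }
    '
}
```

Usage on a node would be something like `fence_tool ls | fence_idle && echo "no fencing pending"`.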
 
Hi Dietmar
Does it look like the updated rgmanager will get to pvetest sometime in the next week or so? If not, I'll try shutting down all the hosts and restarting them to see if rgmanager works in pve.

I know there may be other priorities.
Thanks
Rob
 
This morning I tried turning off all nodes, then starting them one at a time.

That did not fix the issue.

rgmanager is running per cli, but not per pve > datacenter > summary.

Any suggestions to fix this so we can have high availability KVM guests?
 

Attachments

  • Proxmox Virtual Environment 2012-06-25 11-10-48.png
I added

expected_votes="2"

to cluster.conf and activated.

rgmanager is running according to the cli and the services tab of each node,

but it is not listed in the datacenter summary.

It would be nice to get the HA cluster working. We have drbd but cannot use heartbeat in version 2. Heartbeat and drbd were our high availability setup for the 4 years prior to this.

I'll start another thread regarding manual switch over.
 
Quote:
I added expected_votes="2" to cluster.conf and activated.

No, please remove that. You completely misunderstood me.

If you cold start a cluster and start one node after another, you clearly do not have quorum until enough nodes have started.
So fencing and rgmanager do not start on those nodes.

So after you get quorum, you need to start those services on all nodes manually:

# /etc/init.d/cman start
# /etc/init.d/rgmanager start

To get quorum you can also set the expected votes temporarily:

# pvecm expected X

Hope that is clearer now.
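
As a rough illustration of why the first nodes of a cold start have no quorum: assuming every node has one vote and the usual majority rule (quorum = expected_votes / 2 + 1, integer division), the sketch below walks through a 4-node cluster booting one node at a time. The function names are invented for this example.

```shell
#!/bin/sh
# Sketch only: assumes one vote per node and the majority rule
# quorum = expected_votes / 2 + 1 (integer division).
# quorum_votes and has_quorum are hypothetical helper names.
quorum_votes() {
    echo $(( $1 / 2 + 1 ))
}

# has_quorum STARTED_NODES EXPECTED_VOTES: exit status 0 when quorate.
has_quorum() {
    [ "$1" -ge "$(quorum_votes "$2")" ]
}

# Assumed example: 4 expected votes, nodes started one after another.
for started in 1 2 3 4; do
    if has_quorum "$started" 4; then
        echo "$started node(s) up: quorate"
    else
        echo "$started node(s) up: no quorum"
    fi
done
```

Under these assumptions the first two nodes come up without quorum, which is why cman and rgmanager have to be started by hand on them once the third node has joined.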
 
Dietmar
That is very clear, thank you. I'll remove the line now.

On the weekend I will try turning off all nodes, then starting the main node first and doing as you suggest to try to get rgmanager working properly.
 
Hi,
I have the same problem: rgmanager deadlocks when the nodes lose the connection between them.
I have only found a patch for Red Hat systems, but not for Debian.
When will there be a patch for Debian?
Best Regards
Giuseppe
 
