Urgent Help Needed

ejc317

Member
Oct 18, 2012
So we tried to remove a node from the cluster. This is node 1 out of 12.

After it was removed, we restarted the cluster. Now there is no quorum. Node 1 sees itself, and cman and rgmanager start there. Nodes 2-12 cannot start cman and there's no data (all the qemu-server folders are empty!!!)

I checked our SAN and the storage data is still there, meaning only the confs have been erased (not as bad).

When I try to start cman on nodes 2-12, I get an error that the corosync daemon cannot be started (node 1 starts fine).

I can manually edit cluster.conf on node 1 but that's it

any help is appreciated ...
 
All the files are still read only ...

When I try to put it into local mode (pmxcfs -l) it says it cannot get the lock. I tried unmounting and remounting, which also doesn't work.

HELP!!!!!!!
 
I faced the same issue last week and I know how utterly frustrating it is. All I did was safely remove a node from a cluster of 4 nodes using pvecm delnode. That worked, but then I noticed that all my nodes had lost quorum, and I couldn't figure it out for the life of me. What I did notice was that when I accessed each node individually, it showed itself as green with the remaining nodes showing as red. I should have thoroughly documented this case and reported it to Tom, but haven't had the time yet. Anyway, here are some of the things I tried while bringing each node back:


pvecm status (take note of the expected votes)
pvecm e 1 (set expected votes to 1)
pvecm status (check to see if that worked)
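For anyone wondering why lowering the expected votes helps: assuming the usual simple-majority rule and one vote per node, the cluster needs floor(expected / 2) + 1 votes to be quorate. A rough sketch of that arithmetic (quorum_needed is just a helper name made up for this example, not a Proxmox command):

```shell
# Votes needed for quorum under a simple-majority rule: floor(expected / 2) + 1.
# Assumes every node carries exactly one vote.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

quorum_needed 12   # 12 expected votes, as in the original post
quorum_needed 4    # 4 expected votes, as in my cluster
quorum_needed 1    # after "pvecm e 1"
```

So a node that can only see itself stays without quorum until the expected votes are dropped to 1.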



/etc/init.d/pve-cluster stop

/usr/bin/pmxcfs -l

mv /etc/pve/cluster.conf ~/

/etc/init.d/pve-cluster stop

/etc/init.d/cman stop

/etc/init.d/cman start

/etc/init.d/pve-cluster start
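If you want to replay the sequence above on several nodes, here is a minimal sketch that wraps the same commands in the same order. DRY_RUN, run and recover_node are names invented for this example; DRY_RUN defaults to 1 so the script only prints what it would do. I have not tested this as a script, so treat it as a starting point only.

```shell
#!/bin/sh
# Sketch only: replays the commands from this post in the same order.
# DRY_RUN is a made-up switch for this example; 1 (the default) prints instead of executing.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        # Report a failing step but keep the exit status for the caller.
        "$@" || { echo "failed: $*" >&2; return 1; }
    fi
}

recover_node() {
    run /etc/init.d/pve-cluster stop
    run /usr/bin/pmxcfs -l                       # start the cluster filesystem in local mode
    run mv /etc/pve/cluster.conf "$HOME/"        # move the config out of the way, keep a copy
    run /etc/init.d/pve-cluster stop
    run /etc/init.d/cman stop
    run /etc/init.d/cman start
    run /etc/init.d/pve-cluster start
}

recover_node
```

Run it once as is to see the list of steps, then with DRY_RUN=0 on the node you actually want to touch.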

Check your logs:

tail -f /var/log/syslog
tail -f /var/log/messages
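The logs get noisy, so it can help to filter for the cluster daemons mentioned in this thread. A small sketch (cluster_log_lines is a helper name I made up, and the pattern list is only a guess at what is relevant on your system):

```shell
# Print only the log lines that mention the cluster daemons named in this thread.
# The pattern list is an assumption; adjust it to whatever your logs actually show.
cluster_log_lines() {
    grep -E 'corosync|cman|pmxcfs|rgmanager' "$1"
}

# On the node itself, for example:
# cluster_log_lines /var/log/syslog | tail -n 50
```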

Check Cluster Status:
pvecm nodes
pvecm status

On the node where you just ran all this, check whether the member list has been populated:

cat /etc/pve/.members



Finally, restart everything for good measure:

/etc/init.d/pvestatd restart
/etc/init.d/pvedaemon restart
/etc/init.d/apache2 restart


Hopefully that helps as I'm just stepping through my bash_history
 
Please use a subject which describes your issue. Otherwise others cannot really benefit from your experiences, issues and possible solutions.
 
