Cluster degraded: what can I do?

tdi

Guest
Hello,

I was changing the network configuration in my 6-node cluster and it is now degraded. Some nodes are in the cluster together; others are on their own. Is it enough to just copy /var/lib/pve-cluster/config.db from one node to all the others?
 

tom

Proxmox Staff Member
Aug 29, 2006
Changing networks in a running cluster is not really a good idea.

Without knowing all the details I cannot tell you the best way to recover.
 

tdi

Guest
Maybe this will help: five nodes have the same version of config.db, and the sixth node's copy differs.
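One rough way to see which nodes share a config.db version is to compare checksums. The sketch below is only an illustration: the helper name group_by_checksum is made up here, the collection loop assumes root SSH access to the host names from this thread, and a checksum of a live database file is only a coarse signal.

```shell
# Reads lines of the form "<host> <checksum>" on stdin and prints, per
# checksum, how many hosts share it and which hosts those are.
# (group_by_checksum is a hypothetical helper, not a Proxmox tool.)
group_by_checksum() {
    awk '{ count[$2]++; hosts[$2] = hosts[$2] " " $1 }
         END { for (c in count) printf "%d %s%s\n", count[c], c, hosts[c] }' | sort -rn
}

# Hypothetical collection loop (assumes root SSH access to every node):
# for h in skynet1 skynet2 skynet3 skynet4 skynet5 skynet6; do
#     sum=$(ssh "root@$h" md5sum /var/lib/pve-cluster/config.db | cut -d' ' -f1)
#     printf '%s %s\n' "$h" "$sum"
# done | group_by_checksum
```

The largest group is listed first, so an odd node out appears on the last line.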

What I changed: before, eth5 had IP X; now I have added vmbr1 with eth5 as its port and IP X. I checked multicast connectivity and it is OK.

Now I have a situation where I have "two" clusters:
One with

root@skynet1:~# pvecm nodes
Node  Sts   Inc      Joined               Name
   1   M    354216   2013-02-21 15:47:24  skynet6
   2   X         0                        skynet5
   3   M    354216   2013-02-21 15:47:24  skynet4
   4   X         0                        skynet3
   5   M    354216   2013-02-21 15:47:24  skynet2
   6   M    353888   2013-02-21 15:41:10  skynet1

and a second one:



root@skynet3:~# pvecm nodes
Node  Sts   Inc      Joined               Name
   1   X         0                        skynet6
   2   M    354368   2013-02-21 15:55:07  skynet5
   3   X         0                        skynet4
   4   M    354368   2013-02-21 15:55:07  skynet3
   5   X         0                        skynet2
   6   X         0                        skynet1
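To compare the two views, it can help to list which nodes each side counts as members (Sts = M) and check that against a majority. This is a minimal sketch, assuming one vote per node and the usual majority rule; the function names are made up for illustration.

```shell
# Print the names of nodes whose Sts column is "M" (member).
# Reads `pvecm nodes` output on stdin; prompt and header lines are
# skipped because their second field is not "M".
members() {
    awk '$2 == "M" { print $NF }'
}

# Votes needed for quorum, assuming the majority rule
# floor(expected_votes / 2) + 1 and one vote per node.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# Exit status 0 if the listing on stdin has enough members for quorum.
# $1 is the expected number of votes (6 for this cluster).
is_quorate() {
    count=$(members | wc -l | tr -d ' ')
    [ "$count" -ge "$(quorum_needed "$1")" ]
}
```

Usage on a node would be `pvecm nodes | members` or `pvecm nodes | is_quorate 6 && echo quorate`. With the listings above, skynet1 sees four members and skynet3 sees two, so under these assumptions only skynet1's side reaches the four votes a six-node cluster needs.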


pvecm delnode does not work:

root@skynet3:~# pvecm delnode skynet6
I/O error : Resource busy
I/O error : Resource busy
ccs_tool: Error writing new config file /etc/pve/cluster.conf
 

hotwired007

Member
Sep 19, 2011
UK
Sounds like the skynet5 and skynet3 nodes need to be powered down and then restarted so that they can be brought back into the cluster. Alternatively, shut all nodes down and power them on individually so that they can only detect the one primary cluster.
 

tdi

Guest
Yes, I tried that many times. Now I am trying to delete a node from the cluster and add it again, but every time I get the error mentioned above.
 

dietmar

Proxmox Staff Member
Apr 28, 2005
Austria
www.proxmox.com
You can only modify data in /etc/pve if the cluster is healthy.
 

tdi

Guest
So what is the sanest solution now? My VM disks are on iSCSI. How can I recreate the whole cluster from scratch and re-add the VMs?
 

tom

Proxmox Staff Member
You need to find the cause and fix it. If you cannot figure it out, we have several support options (including SSH login) and can assist.

Fixing a misconfigured cluster can be a challenging task, and since no one really knows what is wrong on your boxes or network, all you can get here are answers to your questions, not a step-by-step how-to for fixing your problem.
 

tdi

Guest
Ok my bad, you are right, I should have asked a proper question.

How can I force a node out of the cluster when pvecm delnode does not work because the cluster is degraded? When I switch a node off, it is still visible in the list because of cluster.conf, which I cannot edit because the cluster is degraded. I am not trying to make you fix it; I am trying to get some general direction.
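One general direction sometimes used in this situation, not confirmed by anyone in this thread, is to lower the expected vote count with pvecm expected so that the node being worked on becomes quorate and /etc/pve becomes writable again. The sketch below is a dry run by default; the run and force_local_quorum helpers are made up for illustration, and doing this while other partitions are still up risks diverging configurations.

```shell
# Dry-run wrapper: prints the command instead of running it.
# Set RUN=1 to actually execute (hypothetical helper, use with care).
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '+ %s\n' "$*"
    fi
}

# Assumption: lowering expected votes to 1 makes this node quorate on
# its own. Only sensible when the other partition is shut down.
force_local_quorum() {
    run pvecm status        # look at membership and vote counts first
    run pvecm expected 1    # tell the cluster stack to expect a single vote
}
```

Calling force_local_quorum without RUN=1 only prints the two commands, which makes it easy to review them before running anything on a real node.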
 

tdi

Guest
Your cluster is degraded on all 6 nodes?

No, I reached a state where 4 nodes are in a cluster together and two are out of it (nodes 4 and 1, each forming its own separate cluster). I set expected votes = 4 on these 4 nodes, but deleting the ghost nodes is still impossible. I also tried resetting node 4 to "factory" defaults, and it still will not join the cluster. I have no connectivity problems; I checked every way I could. I am starting to suspect that something is wrong with connectivity through the bridge.
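If the bridge is the suspect, one thing that could be checked is whether multicast snooping is enabled on it, since a Linux bridge exposes that flag in sysfs. This is only a hypothesis to test, not a diagnosis; snooping_state is a made-up helper name, and vmbr1 is the bridge name mentioned earlier in the thread.

```shell
# Print the multicast_snooping flag of a Linux bridge (1 = on, 0 = off).
# $1 is the bridge name; $2 optionally overrides the sysfs root, which
# is mainly useful for trying the function out without a real bridge.
snooping_state() {
    bridge=$1
    root=${2:-/sys/class/net}
    cat "$root/$bridge/bridge/multicast_snooping"
}

# Example on a node (as root):
#   snooping_state vmbr1
# To switch snooping off temporarily for a test (not persistent):
#   echo 0 > /sys/class/net/vmbr1/bridge/multicast_snooping
```

If cluster membership recovers with snooping off, that would point at multicast handling on the bridge rather than at the cluster configuration itself.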
 
