cluster.conf wrong, prevents cman start .. HELP!

RobFantini

May 24, 2012
Hello.
We had a 3-node cluster, and I mistakenly thought I could keep all 3 nodes while using this in my cluster.conf:
Code:
<cman two_node="1" expected_votes="1" keyfile="/var/lib/pve-cluster/corosync.authkey"/>

Now at reboot cman will not start. The error:
Code:
   Starting cman... two_node set but there are more than 2 nodes
cman_tool: corosync daemon didn't start Check cluster logs for details
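(In hindsight the error makes the mistake clear: two_node is, as far as I understand, only valid when there are exactly 2 nodes. For a 3-node cluster the cman element should presumably just keep the keyfile, something like this untested sketch:)
Code:
<cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>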

So /etc/pve is mounted read-only on the 3 nodes.

That means I cannot fix cluster.conf.

Is there a way to get around this?
 
OK, I removed all nodes from the cluster, as we need some of the VZ and KVM guests to keep working. I followed http://forum.proxmox.com/threads/9595-One-question-about-clustering-in-Proxmox-2-x?p=54508#post54508 , which has worked for us in the past. However, I could not remove the nodes using pvecm delnode, as cman was not running.

So, starting with no cluster, I created a new cluster on the main node.

Then, when I run pvecm add, it returns 'authentication key already exists'.

Is there somewhere that the old node key can be deleted?
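(My guess, and only a guess, is that the key pvecm is complaining about is the corosync keyfile referenced in cluster.conf, /var/lib/pve-cluster/corosync.authkey, left over on the node being added. If that assumption is right, something like this might clear it before retrying:)
Code:
# assumption: the stale key from the old cluster is the corosync authkey
# referenced in cluster.conf; remove it on the node being added, then retry
rm /var/lib/pve-cluster/corosync.authkey
pvecm add 10.100.100.243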
 
For future reference, others could use the info on how to delete an existing key.

I'll try changing the hostname, IP address, and DRBD addresses...
 
Well, that did not work:
Code:
pvecm add 10.100.100.243
authentication key already exists

Then I tried a shot in the dark: recreating root's SSH key with ssh-keygen on the node to be added... that did not solve the issue either.

So I'll wait until someone can answer this:

how can I solve "authentication key already exists" ?
 
try 'pvecm add 10.100.100.243 -force'
 
try 'pvecm add 10.100.100.243 -force'

It did not work:
Code:
fbc241 s012 ~ # pvecm add 10.100.100.243 -force
I/O warning : failed to load external entity "/etc/pve/cluster.conf"
ccs_tool: Error: unable to parse requested configuration file

command 'ccs_tool lsnode -c /etc/pve/cluster.conf' failed: exit code 1
unable to add node: command failed (ssh 10.100.100.243 -o BatchMode=yes pvecm addnode fbc241 --force 1)

/etc/pve/cluster.conf:
Code:
fbc241 s012 ~ # cat /etc/pve/cluster.conf
<?xml version="1.0"?>
<cluster name="clusterfbc" config_version="1">

  <cman keyfile="/var/lib/pve-cluster/corosync.authkey">
  </cman>

  <clusternodes>
  <clusternode name="fbc241" votes="1" nodeid="1"/>
  </clusternodes>

</cluster>
 
Hello Tom, here is our version info:
Code:
fbc241 s012 ~ # pveversion -v
pve-manager: 2.1-1 (pve-manager/2.1/f9b0f63a)
running kernel: 2.6.32-12-pve
proxmox-ve-2.6.32: 2.1-68
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-12-pve: 2.6.32-68
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.3-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.8-3
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.7-2
pve-cluster: 1.0-26
qemu-server: 2.0-39
pve-firmware: 1.0-16
libpve-common-perl: 1.0-27
libpve-access-control: 1.0-21
libpve-storage-perl: 2.0-18
vncterm: 1.0-2
vzctl: 3.0.30-2pve5
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.0-9
ksm-control-daemon: 1.1-1
 
I think this would reproduce it:

1- make a 3-node cluster.

2- add this to cluster.conf (a full sketch of the resulting file follows step 3):
Code:
<cman two_node="1" expected_votes="1"> </cman>

if that does not break the cluster, then

3- restart all nodes
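(Roughly, the broken /etc/pve/cluster.conf at that point would look like the sketch below; the node names are just placeholders, and I have not re-verified the exact file:)
Code:
<?xml version="1.0"?>
<cluster name="clusterfbc" config_version="2">
  <!-- two_node="1" combined with three clusternode entries is what makes cman refuse to start -->
  <cman two_node="1" expected_votes="1" keyfile="/var/lib/pve-cluster/corosync.authkey"/>
  <clusternodes>
    <clusternode name="node1" votes="1" nodeid="1"/>
    <clusternode name="node2" votes="1" nodeid="2"/>
    <clusternode name="node3" votes="1" nodeid="3"/>
  </clusternodes>
</cluster>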
 
If someone runs into this, try mounting /etc/pve in local mode, then edit /etc/pve/cluster.conf:
Code:
# may need to unmount the read-only /etc/pve first
umount -f /etc/pve
# then start the cluster filesystem in local mode so /etc/pve becomes writable
pmxcfs --local
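(And a rough sketch of getting back to normal afterwards; the init script names assume a stock PVE 2.x install, so treat this as untested:)
Code:
# once cluster.conf is fixed, stop the local-mode pmxcfs
killall pmxcfs
# then bring the cluster filesystem and cman back up with the corrected config
/etc/init.d/pve-cluster start
/etc/init.d/cman start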