Cluster crashed badly - no recovery - reinstalling Proxmox won't help.

weester

New Member
May 21, 2012
Hi,
I had a 2-node Proxmox cluster. Node2 crashed, so I removed node2 from the cluster and installed a new Proxmox node called node3. When I tried to add node3 to node1's cluster, I got a "can't copy ssh id" error, and all cluster config files turned READ-ONLY on node1. Each time I run pvecm status on node1, I get this error:
"cman_tool: Cannot open connection to cman, is it running ?"

I also tried to remove the cluster on node1 with the steps below, but it did not help; I get the same error as above.
service pve-cluster stop
rm /etc/cluster/cluster.conf
rm -rf /var/lib/pve-cluster/*
rm -rf /etc/pve/nodes/*
service pve-cluster start
service cman start # this silently does nothing


Finally I reinstalled Proxmox on node1 and I still see the same error. Node3 seems to run fine.
How can I completely remove the cluster from node1 and node3?

Thank you, and sorry for my English.
Joe.
 
Hi,
I had a 2-node Proxmox cluster. Node2 crashed, so I removed node2 from the cluster and installed a new Proxmox node called node3. When I tried to add node3 to node1's cluster, I got a "can't copy ssh id" error, and all cluster config files turned READ-ONLY on node1.

This is because you do not have quorum. As a workaround, you can set the expected votes with "pvecm expected <N>".
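
To illustrate why a 2-node cluster with one node down has no quorum, here is a minimal sketch of the usual majority rule (quorum needs more than half of the expected votes). The function names quorum_needed and has_quorum are hypothetical helpers for illustration only, not part of pvecm or cman, and the sketch assumes one vote per node.

```shell
#!/bin/sh
# Hypothetical helpers (not part of pvecm/cman) illustrating majority quorum.
# Assumption: every node contributes exactly one vote.

# Votes required for quorum: a strict majority of the expected votes.
quorum_needed() {
    expected=$1
    echo $(( expected / 2 + 1 ))
}

# Exit status 0 if the votes currently present reach quorum, 1 otherwise.
has_quorum() {
    present=$1
    expected=$2
    [ "$present" -ge "$(quorum_needed "$expected")" ]
}

# Two-node cluster with one node down: 1 vote present, 2 expected.
if has_quorum 1 2; then echo "quorate"; else echo "not quorate"; fi

# Same single node after the expected votes are lowered to 1,
# which is what "pvecm expected 1" asks the cluster to do.
if has_quorum 1 1; then echo "quorate"; else echo "not quorate"; fi
```

With 2 expected votes, 2 votes are needed, so a single surviving node is not quorate; lowering the expected votes to 1 lets that node reach quorum on its own.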
 
Thank you very much for your help, but I tried that and it did not help much. It gives me this error when I run pvecm expected 1:

root@qdve02:/etc/pve/priv# service cman restart
root@qdve02:/etc/pve/priv# pvecm status
cman_tool: Cannot open connection to cman, is it running ?
root@qdve02:/etc/pve/priv#
root@qdve02:/etc/pve/priv# pvecm e 1
cman_tool: Cannot open connection to cman, is it running ?


thank you once again,
weester

 
I tried restarting both nodes, but it did not help. Node3 restarted fine, but node2 gives me the same error.

root@qdve02:/etc/init.d# ./pve-cluster status
Checking status of pve cluster filesystem: pve-cluster running.
root@qdve02:/etc/init.d# ./pve-cluster restart
Restarting pve cluster filesystem: pve-cluster.
root@qdve02:/etc/init.d# service cman status
root@qdve02:/etc/init.d# service cman restart
root@qdve02:/etc/init.d# pvecm status
cman_tool: Cannot open connection to cman, is it running ?
root@qdve02:/etc/init.d# service cman status
root@qdve02:/etc/init.d#



thank you very much for your help.
Joe
 
You installed those nodes using our CD, or is that a custom installation?

What is the output of

# pveversion -v
 
I installed it using your CD. I did not know there was such a thing as a custom installation. Here is the output of pveversion -v:

root@qdve02:~# pveversion -v
pve-manager: 2.1-1 (pve-manager/2.1/f9b0f63a)
running kernel: 2.6.32-11-pve
proxmox-ve-2.6.32: 2.0-66
pve-kernel-2.6.32-11-pve: 2.6.32-66
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.3-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.8-3
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.7-2
pve-cluster: 1.0-26
qemu-server: 2.0-39
pve-firmware: 1.0-15
libpve-common-perl: 1.0-27
libpve-access-control: 1.0-21
libpve-storage-perl: 2.0-18
vncterm: 1.0-2
vzctl: 3.0.30-2pve5
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.0-9
ksm-control-daemon: 1.1-1

thank you for your help.
JOe



 
It looks like there is no way to recover a crashed cluster/HA setup. Is there a way to do a daily backup of the quorum?