Proxmox 1.9 Cluster Error

dmentorz

Hi,

I have two hosts set up with Proxmox 1.9: proxmox1 is the master and proxmox2 is a node.
This setup has been working fine for the last 2-3 years. Yesterday the cluster stopped syncing.
After a restart it syncs for about an hour, and then I start getting Connection Refused errors on the master.
It has progressed to the point where both node and master show "nosync" for themselves
and "ERROR: 500 Can't connect to 127.0.0.1:50000 (connect: Connection refused)" for their counterpart.
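
The error above means that nothing accepted the connection on 127.0.0.1:50000. A minimal sketch for checking that from the shell is below. It assumes a bash build with /dev/tcp support, which is not guaranteed on older Debian releases, and the function names port_open and report_port are hypothetical helpers, not Proxmox tools.

```shell
# Return 0 if a TCP connection to host:port can be opened.
# Relies on bash's /dev/tcp redirection (assumption: bash was built with it).
port_open() {
    local host="$1" port="$2"
    # Run in a separate bash so the test descriptor is closed when it exits.
    bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Print a one-line status for the given host and port.
report_port() {
    local host="$1" port="$2"
    if port_open "$host" "$port"; then
        echo "${host}:${port} open"
    else
        echo "${host}:${port} refused or unreachable"
    fi
}
```

Running `report_port 127.0.0.1 50000` on both the master and the node would show on which side the listener is missing.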

I have to constantly restart ssh and apache2 to get any access to the web GUI or shell on the master,
yet on the node everything is running fine.

proxmox2:/dev# pveversion -verbose
pve-manager: 1.9-24 (pve-manager/1.9/6542)
running kernel: 2.6.35-1-pve
proxmox-ve-2.6.35: 1.8-11
pve-kernel-2.6.32-4-pve: 2.6.32-33
pve-kernel-2.6.35-1-pve: 2.6.35-11
qemu-server: 1.1-32
pve-firmware: 1.0-14
libpve-storage-perl: 1.0-19
vncterm: 0.9-2
vzctl: 3.0.29-2pve1
vzdump: 1.2-16
vzprocps: 2.0.11-2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.14.1-1
ksm-control-daemon: 1.0-6

The storage I use:

2 iSCSI LUNs and 3 NFS shares on 2 QNAP NAS/iSCSI devices.

I have searched everywhere for a thread that could help, but with no luck.
My problem is that I still have VMs on the master that I need.

Can I delete:

/etc/pve/cluster.cfg
/root/.ssh/known_hosts

on both hosts and recreate the cluster without losing my VMs? (This seems to be the most popular solution.)
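
Whatever the answer, it seems sensible to keep copies of those two files before deleting anything. A minimal sketch, assuming absolute source paths; backup_files is a hypothetical helper and the backup directory name is only an example:

```shell
# Copy each existing file into a backup directory, keeping its full path
# underneath it. Files that do not exist are reported and skipped.
backup_files() {
    local dest="$1"; shift
    local f
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        if [ -e "$f" ]; then
            mkdir -p "$dest$(dirname "$f")"
            cp -p "$f" "$dest$f" && echo "saved $f"
        else
            echo "skipped $f (not found)"
        fi
    done
}
```

For example, `backup_files /root/cluster-backup /etc/pve/cluster.cfg /root/.ssh/known_hosts` run on each host. This only protects the two files named above; it does not back up any VM data.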

Thanks
 
OK, so I have managed (with a lot of persistence) to migrate the VMs on the master over to the node.

Would it be better to try to remove the node from the cluster with pveca -d, or to remove the files and redo the cluster?

thanks,
 
I suggest you consider upgrading to the latest version (currently 2.1). The 1.x series is outdated.
 
Hi Tom,

Yes, ideally I would like to upgrade to 2.1; my issue at the moment is space. I do know that I can use the update script without having to back up and restore all my VMs.
Is there anything I can check on the current issue aside from upgrading?

Thanks,
 
Check if all your storages are online (e.g. all your NFS servers).
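
One rough way to do that check from the shell is to list each storage mount point with a time limit, since an unreachable NFS server tends to make such a listing hang. This is only a sketch: it assumes the coreutils `timeout` command is installed (older systems may lack it), the function names are hypothetical, and a process stuck on a hard-mounted NFS share may not react to the time limit.

```shell
# Return 0 if listing the directory finishes within the limit (default 5 s).
path_responsive() {
    local path="$1" limit="${2:-5}"
    timeout "$limit" ls "$path" >/dev/null 2>&1
}

# Print OK or FAIL for every path given as an argument.
check_paths() {
    local p
    for p in "$@"; do
        if path_responsive "$p"; then
            echo "OK    $p"
        else
            echo "FAIL  $p"
        fi
    done
}
```

Call it with the mount points of your own NFS storages, for example `check_paths /mnt/pve/nfs1 /mnt/pve/nfs2` (these paths are placeholders). A FAIL line points at a storage that is missing or not answering.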