Web UI login problem (cluster sync?)

Thread starter: costexx (Guest)

Hi.
I'm facing a strange problem with a 2-node cluster. I was working on a host (setting a password for VNC) when I suddenly lost the connection to the web UI. Since then I cannot authenticate on node 2 of the cluster through the web UI, although I can log in to node2 over SSH with the same credentials. I also noticed that from node1 I can ssh to node2 without a password, but when I try from node2 to node1 I get "WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!". So I think the sync is broken.
What should I do? The servers are in production.

pveversion -v
pve-manager: 2.1-1 (pve-manager/2.1/f9b0f63a)
running kernel: 2.6.32-11-pve
proxmox-ve-2.6.32: 2.0-66
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-7-pve: 2.6.32-60
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.3-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.8-3
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.7-2
pve-cluster: 1.0-26
qemu-server: 2.0-39
pve-firmware: 1.0-15
libpve-common-perl: 1.0-27
libpve-access-control: 1.0-21
libpve-storage-perl: 2.0-18
vncterm: 1.0-2
vzctl: 3.0.30-2pve5
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.0-9
ksm-control-daemon: 1.1-1


pvecm status
Version: 6.2.0
Config Version: 2
Cluster Name: PMX-CLUSTER
Cluster Id: 58580
Cluster Member: Yes
Cluster Generation: 224
Membership state: Cluster-Member
Nodes: 2
Expected votes: 2
Total votes: 2
Node votes: 1
Quorum: 2
Active subsystems: 1
Flags:
Ports Bound: 0
Node name: pmx2
Node ID: 1
Multicast addresses: 239.192.228.185
Node addresses: .......(IP)

pvecm nodes
Node Sts Inc Joined Name
1 M 220 2012-12-27 02:21:33 pmx2
2 M 224 2012-12-27 02:23:05 pmx1
 
OK, I have some updates.
I removed the offending key from known_hosts, and I could then log in between the nodes over SSH. From the web interface on node2 I still couldn't log in.
From node1 I noticed that when I clicked on node 2, it said it could not connect to localhost on port 85. Looking at node2, I saw that pvedaemon was stopped. After I restarted pvedaemon, everything started working. I can log in on node1 or node2 and it looks OK, but in the syslog on node2 I see this:
Jan 18 22:57:20 pmx2 pmxcfs[265135]: [status] crit: cpg_send_message failed: 9
I had previously restarted cman and pve-cluster.
How can I fix this?
Thanks.
 
You usually see that message only a few times, when pve-cluster is started before cman.

Yes, you were right. I restarted pve-cluster and the error messages stopped showing.

Thank you for your help.

P.S. Is there an order in which we should restart the services? cman, pve-cluster, pvedaemon...?
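
Going by the reply above (the cpg_send_message errors appear when pve-cluster runs before cman has started), cman would need to come up before pve-cluster. A minimal sketch of that ordering follows. It assumes the init script names cman, pve-cluster and pvedaemon from this thread, and it places pvedaemon last as an assumption, since that part of the order is not confirmed here. The function name restart_cluster_stack and the DRY_RUN variable are hypothetical. By default the script only prints the commands, because restarting cman on a production cluster may be disruptive.

```shell
#!/bin/sh
# Hypothetical helper: restart the cluster services in dependency order.
# Assumed order: cman -> pve-cluster -> pvedaemon (pvedaemon last is an assumption).
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to run them.

restart_cluster_stack() {
    for svc in cman pve-cluster pvedaemon; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Show what would be executed without touching any service.
            echo "service $svc restart"
        else
            # Stop at the first failure so later services are not restarted
            # on top of a broken dependency.
            service "$svc" restart || return 1
        fi
    done
}

restart_cluster_stack
```

Run it once as-is to review the printed commands, then with DRY_RUN=0 on one node at a time if the order looks right for your setup.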