Web UI login problem (cluster sync?)

costexx

Hi.
I'm facing a strange problem with a 2-node cluster. I was working on a host (setting a password for VNC) when I suddenly lost the connection to the web UI. Since then I cannot authenticate on node 2 of the cluster through the web UI. If I ssh to node2, I can log in with the same credentials. I also noticed that when I log in on node1 (ssh), I can ssh to node2 without a password, but when I try from node2 to node1 it says WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!. So I think the sync is broken.
What should I do? The servers are in production.

pveversion -v
pve-manager: 2.1-1 (pve-manager/2.1/f9b0f63a)
running kernel: 2.6.32-11-pve
proxmox-ve-2.6.32: 2.0-66
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-7-pve: 2.6.32-60
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.3-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.8-3
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.7-2
pve-cluster: 1.0-26
qemu-server: 2.0-39
pve-firmware: 1.0-15
libpve-common-perl: 1.0-27
libpve-access-control: 1.0-21
libpve-storage-perl: 2.0-18
vncterm: 1.0-2
vzctl: 3.0.30-2pve5
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.0-9
ksm-control-daemon: 1.1-1


pvecm status
Version: 6.2.0
Config Version: 2
Cluster Name: PMX-CLUSTER
Cluster Id: 58580
Cluster Member: Yes
Cluster Generation: 224
Membership state: Cluster-Member
Nodes: 2
Expected votes: 2
Total votes: 2
Node votes: 1
Quorum: 2
Active subsystems: 1
Flags:
Ports Bound: 0
Node name: pmx2
Node ID: 1
Multicast addresses: 239.192.228.185
Node addresses: .......(IP)

pvecm nodes
Node Sts Inc Joined Name
1 M 220 2012-12-27 02:21:33 pmx2
2 M 224 2012-12-27 02:23:05 pmx1
 
OK, I have some updates.
I removed the offending key from known_hosts, and I could then log in between the nodes. From the web interface on node2 I still couldn't log in.
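For anyone hitting the same warning: the offending key can be removed with ssh-keygen instead of editing the file by hand. A minimal sketch, assuming the peer is known as pmx1 (the name from the pvecm output above) and the stale entry is in the default ~/.ssh/known_hosts. The warning itself names the file and line that hold the offending key, so check that first. forget_host is a hypothetical helper name.

```shell
# forget_host HOST [FILE]: drop every key stored for HOST from a known_hosts file.
# FILE defaults to the current user's ~/.ssh/known_hosts (an assumption; use the
# file named in the SSH warning if it differs).
forget_host() {
    ssh-keygen -R "$1" -f "${2:-$HOME/.ssh/known_hosts}"
}

# Usage: forget_host pmx1
```

ssh-keygen keeps a copy of the previous file with an .old suffix, so the change can be reverted.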
From node1 I noticed that when I clicked on node 2 it said it could not connect to localhost on port 85. Looking at node2, I saw that pvedaemon was stopped. After I restarted pvedaemon, everything started working. I can log in on node1 or node2 and it looks OK, but in the syslog on node2 I see this:
Jan 18 22:57:20 pmx2 pmxcfs[265135]: [status] crit: cpg_send_message failed: 9
Before that, I had restarted cman and pve-cluster.
How can I fix this?
Thanks.
 
You usually see that only a few times, when pve-cluster runs before cman has started.
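In practice that means pve-cluster should be restarted only once cman is up. A rough sketch of that order, assuming the init scripts are driven through service as on a stock PVE 2.x install, and that pvecm status exits non-zero while cman is down. run and restart_pve_services are hypothetical helper names, and DRY_RUN=1 only prints the restart commands instead of executing them.

```shell
# run CMD...: execute CMD, or just print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Restart pve-cluster and then pvedaemon, but only if cman already answers.
restart_pve_services() {
    # Assumption: pvecm status fails when cman is not running.
    run pvecm status >/dev/null || {
        echo "cman does not seem to be running, start it first" >&2
        return 1
    }
    run service pve-cluster restart || return 1
    run service pvedaemon restart
}

# Usage: DRY_RUN=1 restart_pve_services   (preview), then restart_pve_services
```

On production nodes it is worth previewing with DRY_RUN=1 first and restarting one node at a time.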

Yes, you were right. I restarted pve-cluster and the error messages stopped showing.
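To confirm the messages really stopped, it should be enough to count them in the syslog twice, a few minutes apart. A small sketch, assuming the log file is /var/log/syslog; count_cpg_errors is a hypothetical helper name.

```shell
# count_cpg_errors [FILE]: print how many cpg_send_message failures FILE contains.
# FILE defaults to /var/log/syslog (an assumption about where pmxcfs logs end up).
count_cpg_errors() {
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    grep -c 'cpg_send_message failed' "${1:-/var/log/syslog}" || true
}

# Usage: count_cpg_errors; sleep 300; count_cpg_errors
```

If the two numbers are equal, no new errors were logged in between.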

Thank you for your help.

PS: Is there an order in which we should restart the services? cman, pve-cluster, pvedaemon...?