dashboard connection timeouts -> WARNING: proxy detected vanished client

rnicolaides

Dear community,

We have a 4-node cluster, after removing a failed 5th node last weekend.
I tried to re-add the failed node, but it didn't work, so I removed it completely.

Cluster communication runs over OpenVPN.
Two of the nodes also ran an NFS server, with shared storage defined in Proxmox, but it was not in use at the time.

After a lot of research, here is my problem:

The dashboard is no longer usable, I can't fix it, and I haven't found any information on what exactly the problem is.
Some threads in this forum mention similar problems.

Symptoms

  • VM names are only: no name specified
  • Loading content stops with: timeout connection
  • The only log information I found: WARNING: proxy detected vanished client

Things I tried

  • Emptied the browser cache
  • Restarted services on all nodes (see the sketch below)
    • pveproxy
    • pve-cluster
    • nfs-kernel-server
  • Edited storage.cfg and removed the NFS entries
  • Ensured no backup processes were running
  • echo "" > /var/log/pve/tasks/active
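
For reference, the restarts and the task-list reset per node look roughly like this (a sketch; the pveproxy init script only exists on the PVE 3.x nodes, the 2.3 nodes still serve the GUI via Apache):

Code:
# restart web proxy, cluster filesystem and NFS server (PVE 3.x service names)
for s in pveproxy pve-cluster nfs-kernel-server; do /etc/init.d/$s restart; done
# truncate the active task list that the dashboard polls
echo "" > /var/log/pve/tasks/active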

Questions

Is this still a confirmed bug? (see post94889)

Do you have any hints or tips on what I could try, or how to debug this?


Technical information

pvecm status (quite similar on all nodes)

Code:
Version: 6.2.0
Config Version: 12
Cluster Name: WR-CLUSTER
Cluster Id: 15060
Cluster Member: Yes
Cluster Generation: 12880
Membership state: Cluster-Member
Nodes: 4
Expected votes: 4
Total votes: 4
Node votes: 1
Quorum: 3
Active subsystems: 5
Flags:
Ports Bound: 0
Node name: bay2
Node ID: 4
Multicast addresses: 239.192.58.15
Node addresses: 10.8.8.2
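
Side note on the numbers above: with 4 expected votes at one vote per node, quorum is 4/2 + 1 = 3, so the remaining cluster stays quorate even with one node unreachable.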



pveversion

  • Deleted node: pve-manager/2.3/7946f1f1
  • node2: pve-manager/3.1-21/93bf03d4 (running kernel: 2.6.32-26-pve)
  • node7: pve-manager/2.3/7946f1f1
  • node4: pve-manager/3.2-4/e24a91c1 (running kernel: 2.6.32-30-pve)
  • node3: pve-manager/3.2-4/e24a91c1 (running kernel: 2.6.32-29-pve)




Edit:

I just found out that when I click on Datacenter and then choose a VM located on node2 or node4, I can work with node2 and node4 again. For node3 and node7 the problem remains.

Maybe re-copying pve-ssl.key could fix it, like in post76741?
Although there is no diff between the keys, and no such error in the logs.

But I am not sure and don't want to risk it without confirmation.
 
I fixed it with the following on every node.
And phew :) without downtime or VM restarts...
(No HA enabled.)

Code:
for n in pve-cluster cman pvedaemon pvestatd pve-manager; do /etc/init.d/$n restart; done
fusermount -zu /etc/pve
/etc/init.d/pve-cluster restart
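
For context: fusermount -zu does a lazy unmount of the pmxcfs FUSE filesystem at /etc/pve, detaching it even while it is still busy, so that the final pve-cluster restart can mount it cleanly again.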


Now node7 is out of the cluster (red), but all other nodes are green and respond as usual.
All VMs on node7 are still running, so I will reinstall it in the next days.

Any chance to re-add it without reinstalling?
 
Hello rnicolaides,

Now node7 is out of the cluster (red), but all other nodes are green and respond as usual.
All VMs on node7 are still running, so I will reinstall it in the next days.

Any chance to re-add it without reinstalling?

Node7 now runs isolated from the others?

But you never removed node7 from the cluster - or did you?

What do

Code:
clustat
pvecm status
pvecm nodes

show now on all nodes?
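
To collect everything in one go, a loop like this works (hostnames are assumed, adjust them to yours):

Code:
for h in node2 node3 node4 node7; do echo "== $h =="; ssh root@$h 'clustat; pvecm status; pvecm nodes'; done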

Maybe it shows something suspicious ....

E.g.: are all the IP addresses correct and working?

Kind regards

Mr.Holmes
 
Hello Mr.Holmes,

Thank you for replying.

node2

Code:
root@node2:# pvecm nodes
Node  Sts   Inc   Joined               Name
   2   X  12904                        bay7
   3   M  12924   2014-08-06 15:35:52  bay4
   4   M  12904   2014-08-06 15:25:41  bay2
   5   M  12904   2014-08-06 15:25:41  bay3
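
(In the Sts column, M marks an active member and X a node the cluster cannot reach; node ID 2, bay7, is the missing one.)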

Code:
root@node2:# clustat
Cluster Status for WR-CLUSTER @ Thu Aug  7 10:21:56 2014
Member Status: Quorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 node7                                                                2 Offline
 node4                                                                3 Online
 node2                                                                4 Online, Local
 node3                                                                5 Online


node7

Code:
 root@node7:# clustat
Cluster Status for WR-CLUSTER @ Thu Aug  7 10:15:52 2014
Member Status: Inquorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 node7                                                                2 Offline
 node4                                                                3 Offline
 node2                                                                4 Offline
 node3                                                                5 Offline

pvecm status and pvecm nodes give the following error:

Code:
cman_tool: Error getting extra info: Node is not yet a cluster member


  • All IPs are up and working.
  • Multicast is allowed.

Code:
root@node7:# asmping 224.0.2.1 10.8.8.2
pinging 10.8.8.2 from 10.8.8.7
  unicast from 10.8.8.2, seq=1 dist=0 time=7.183 ms
multicast from 10.8.8.2, seq=1 dist=0 time=47.878 ms
  unicast from 10.8.8.2, seq=2 dist=0 time=3.017 ms
multicast from 10.8.8.2, seq=2 dist=0 time=3.935 ms
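
Both the unicast and the multicast replies from 10.8.8.2 come back, so basic multicast between node7 and node2 works, at least for this test.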

This error is in node7 syslog:

Code:
 corosync[413440]:   [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.

Firewall seems to be ok.
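
For completeness: corosync's totem traffic uses UDP ports 5404/5405, so the rules and the live traffic can be inspected with something like this (tun0 as the OpenVPN interface is an assumption):

Code:
# list firewall rules with packet counters
iptables -L -n -v
# watch for totem traffic on the VPN interface
tcpdump -ni tun0 udp port 5404 or udp port 5405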


Today I realized that logging in to the dashboard is no longer possible, even with root@pam. It shows no error, just the login screen again.


Thanks again,
Ruben Nicolaides
 
Hello Ruben,

node7

Code:
 root@node7:# clustat
Cluster Status for WR-CLUSTER @ Thu Aug  7 10:15:52 2014
Member Status: Inquorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 node7                                                                2 Offline
 node4                                                                3 Offline
 node2                                                                4 Offline
 node3                                                                5 Offline

pvecm status and pvecm nodes give the following error:

Code:
cman_tool: Error getting extra info: Node is not yet a cluster member

The above sounds like a contradiction. I guess something in node7's cluster configuration is damaged. Since I am just a cluster user and not an expert on the internals, I cannot say whether there is a chance to repair it. Look at what https://pve.proxmox.com/wiki/Proxmox_VE_2.0_Cluster suggests in the section "Remove a cluster node":

If for whatever reason you would like to make that server join the same cluster again, you have to

  • reinstall pve on it from scratch
  • as a new node
  • and then regularly join it, as said in the previous section.
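
After the reinstall, the regular join is run on the fresh node against one of the surviving members, e.g. (10.8.8.2, bay2's address from your pvecm output, used as an example):

Code:
pvecm add 10.8.8.2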

The other problem

Today I realized that logging in to the dashboard is no longer possible, even with root@pam. It shows no error, just the login screen again.

I'd try to solve it as described here (even though it did not succeed in that particular case):

http://forum.proxmox.com/threads/19046-can-t-Login-with-GUI-but-works-with-Telnet?p=97842#1

All the best!

Mr.Holmes
 
