Unable to see the cluster's nodes in the GUI

hammer78

Member
Apr 11, 2018
Hi all,

Can somebody help me figure out why I cannot access the node sections (Summary, ...) of the other nodes in the cluster GUI?
I always get "communication failure (0)" when I try to look at them.

The cluster traffic runs over a separate network, but both nodes are on the same subnet of that network.

Here is the network configuration:

master:
auto vmbr10
iface vmbr10 inet static
    address 10.88.xxx.xxx
    netmask 255.255.255.240
    bridge_ports enp132s0f0.2001
    bridge_stp off
    bridge_fd 0
    pre-up ip link set enp132s0f0 mtu 9000
    post-up /sbin/ip route add 10.88.0.0/14 via 10.88.xxx.www

secondary:
auto vmbr10
iface vmbr10 inet static
    address 10.88.xxx.yyy
    netmask 255.255.255.240
    bridge_ports enp1s0f1.2002
    bridge_stp off
    bridge_fd 0
    pre-up ip link set enp1s0f1 mtu 9000
    post-up /sbin/ip route add 10.88.0.0/14 via 10.88.xxx.www
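
(For reference, the bridge, the 9000 MTU and the static route can be double-checked on each node with something like this; interface names as in the configs above:)

ip -d link show vmbr10      # bridge details
ip link show enp132s0f0     # check the MTU is really 9000
ip route                    # look for the 10.88.0.0/14 entry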


So the cluster was created with: pvecm create cluster -bindnet0_addr 10.88.xxx.xxx -ring0_addr 10.88.xxx.xxx
The second node was added with: pvecm add 10.88.xxx.xxx -ring0_addr 10.88.xxx.yyy

I tested ping -f from each node and the results look OK:

From the master node:
--- 10.88.xxx.yyy ping statistics ---

337181 packets transmitted, 337181 received, 0% packet loss, time 35954ms

rtt min/avg/max/mdev = 0.050/0.089/0.858/0.010 ms, ipg/ewma 0.106/0.090 ms

From the secondary node:
--- 10.88.xxx.xxx ping statistics ---

637333 packets transmitted, 637332 received, 0% packet loss, time 67855ms

rtt min/avg/max/mdev = 0.049/0.100/0.880/0.014 ms, ipg/ewma 0.106/0.101 ms

I also tested multicast:

omping -c 600 -i 1 -q 10.88.xxx.xxx 10.88.xxx.yyy

and got these results:

master:
10.88.xxx.yyy : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.094/0.447/0.542/0.090

10.88.xxx.yyy : multicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.133/0.457/0.553/0.079

secondary:
10.88.xxx.xxx : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.115/0.379/0.461/0.079

10.88.xxx.xxx : multicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.135/0.380/0.461/0.076

I have also tested:
omping -c 10000 -i 0.001 -F -q 10.88.xxx.xxx 10.88.xxx.yyy

master:
10.88.xxx.yyy : unicast, xmt/rcv/%loss = 9983/9983/0%, min/avg/max/std-dev = 0.064/0.172/0.327/0.057

10.88.xxx.yyy : multicast, xmt/rcv/%loss = 9983/9983/0%, min/avg/max/std-dev = 0.068/0.181/0.350/0.061

secondary:
10.88.xxx.xxx : unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.076/0.210/0.519/0.075

10.88.xxx.xxx : multicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.086/0.213/0.511/0.073

But everything looks fine.

As I saw suggested on the forum, I reinstalled all nodes and enabled unicast by adding "transport: udpu" to the totem{} section of corosync.conf after creating the cluster, and incremented the config version.

The nodes were restarted after that modification, but the problem is still there.
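
For reference, the totem section in /etc/pve/corosync.conf ended up looking roughly like this (addresses redacted, config_version bumped after the edit):

totem {
  cluster_name: cluster
  config_version: 3
  interface {
    bindnetaddr: 10.88.xxx.xxx
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  transport: udpu
  version: 2
}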

As also suggested on the forum, I regenerated the certificates with "pvecm updatecerts", but the problem persists.
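
(For completeness: pveproxy and pvedaemon probably need to be restarted after regenerating the certificates so the new ones are picked up, e.g.:)

systemctl restart pvedaemon pveproxy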

Does someone have an idea?

Thanks in advance
 
Here are the ncat tests:

From the master:
nc -zvu pve-bobafett 5400-5410

pve-bobafett [10.88.xxx.yyy] 5405 (?) open

pve-bobafett [10.88.xxx.yyy] 5404 (?) open

From the secondary:

nc -zvu pve-dooku 5400-5410

pve-dooku [10.88.xxx.xxx] 5405 (?) open

pve-dooku [10.88.xxx.xxx] 5404 (?) open
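
As far as I understand, the GUI proxies requests for other nodes through pveproxy on TCP port 8006, so I assume that port can also be checked between the nodes:

nc -zv pve-bobafett 8006    # from the master
nc -zv pve-dooku 8006       # from the secondary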
 
Syslog when running pvecm updatecerts -force:

May 14 15:20:00 DOOKU systemd[1]: Starting Proxmox VE replication runner...
May 14 15:20:01 DOOKU systemd[1]: Started Proxmox VE replication runner.
May 14 15:20:36 DOOKU pvedaemon[20235]: starting termproxy UPID:DOOKU:00004F0B:000911A8:5AF98D24:vncshell::root@pam:
May 14 15:20:36 DOOKU pvedaemon[1941]: <root@pam> starting task UPID:DOOKU:00004F0B:000911A8:5AF98D24:vncshell::root@pam:
May 14 15:20:37 DOOKU pvedaemon[1942]: <root@pam> successful auth for user 'root@pam'
May 14 15:20:37 DOOKU login[20241]: pam_unix(login:session): session opened for user root by root(uid=0)
May 14 15:20:37 DOOKU systemd-logind[997]: New session 6 of user root.
May 14 15:20:37 DOOKU systemd[1]: Started Session 6 of user root.
May 14 15:20:37 DOOKU login[20246]: ROOT LOGIN on '/dev/pts/1'
May 14 15:20:45 DOOKU systemd-logind[997]: Removed session 6.
May 14 15:20:45 DOOKU pvedaemon[1941]: <root@pam> end task UPID:DOOKU:00004F0B:000911A8:5AF98D24:vncshell::root@pam: OK
May 14 15:21:00 DOOKU systemd[1]: Starting Proxmox VE replication runner...
May 14 15:21:01 DOOKU systemd[1]: Started Proxmox VE replication runner.
 
But I found lines like this after trying to access the "Summary" GUI section of another node:

pveproxy[20564]: proxy detected vanished client connection
 
For the different VLANs, are you routing the traffic from those interfaces? What does 'pvecm status' say?
 
pvecm status

Quorum information
------------------
Date:             Tue May 15 12:25:25 2018
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000001
Ring ID:          1/60
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      2
Quorum:           2
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.88.xxx.xxx (local)
0x00000002          1 10.88.xxx.yyy

And no, I'm not routing them myself. My hosting provider handles the routing for the private network between the nodes.

I also tried with the same VLAN ID, but the problem is still the same...
 
My hosting provider handles the routing for the private network between the nodes.
So you don't have the network under your control. It could be anything, from blocked ports to latency issues. Whatever it is, it comes down to the provider's configuration.
 
