Impossible to see cluster's nodes in GUI

hammer78 · May 14, 2018

Hi all,

Can somebody help me to find why i cannot access to nodes parts (summary,...) in the cluster GUI.
I always have "communication failure (0)" when try to have a look on nodes.

The cluster is on another network but this network is on the same subnet.

here are network configuration:

master:
auto vmbr10
iface vmbr10 inet static
address 10.88.xxx.xxx
netmask 255.255.255.240
bridge_ports enp132s0f0.2001
bridge_stp off
bridge_fd 0
pre-up ip link set enp132s0f0 mtu 9000
post-up /sbin/ip route add 10.88.0.0/14 via 10.88.xxx.www

secoundary:
auto vmbr10
iface vmbr10 inet static
address 10.88.xxx.yyy
netmask 255.255.255.240
bridge_ports enp1s0f1.2002
bridge_stp off
bridge_fd 0
pre-up ip link set enp1s0f1 mtu 9000
post-up /sbin/ip route add 10.88.0.0/14 via 10.88.xxx.www

So cluster was created with pvecm create cluster -bindnet0_addr 10.88.xxx.xxx -ring0_addr 10.88.xxx.xxx
nodes were added with pvecm add 10.88.xxx.xxx -ring0_addr 10.88.xxx.yyy

have tested ping -f on each node and results looks ok:

from master node:
--- 10.88.xxx.yyy ping statistics ---

337181 packets transmitted, 337181 received, 0% packet loss, time 35954ms

rtt min/avg/max/mdev = 0.050/0.089/0.858/0.010 ms, ipg/ewma 0.106/0.090 ms

from secoundary node:
--- 10.88.xxx.xxx ping statistics ---

637333 packets transmitted, 637332 received, 0% packet loss, time 67855ms

rtt min/avg/max/mdev = 0.049/0.100/0.880/0.014 ms, ipg/ewma 0.106/0.101 ms

have tested multicast:

omping -c 600 -i 1 -q 10.88.xxx.xxx 10.88.xxx.yyy

and have this results

master:
10.88.xxx.yyy : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.094/0.447/0.542/0.090

10.88.xxx.yyy : multicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.133/0.457/0.553/0.079

secoundary:
10.88.xxx.xxx : unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.115/0.379/0.461/0.079

10.88.xxx.xxx : multicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.135/0.380/0.461/0.076

Have also tested
omping -c 10000 -i 0.001 -F -q 10.88.xxx.xxx 10.88.xxx.yyy

master:
10.88.xxx.yyy : unicast, xmt/rcv/%loss = 9983/9983/0%, min/avg/max/std-dev = 0.064/0.172/0.327/0.057

10.88.xxx.yyy : multicast, xmt/rcv/%loss = 9983/9983/0%, min/avg/max/std-dev = 0.068/0.181/0.350/0.061

secoundary:
10.88.xxx.xxx : unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.076/0.210/0.519/0.075

10.88.xxx.xxx : multicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.086/0.213/0.511/0.073

but all looks

As i saw into the forum, i reinstalled all nodes and activate unicats with adding this in corosync.conf after creating the cluster:

"transport: udpu" in totem{} and elevate the version.

Node was restarted after that modification. But the problem is still there.

Have also as i saw into forum regenerate certs with "pvecm updatecerts" but still have problems.

Does someone have an idea?

Thanks in advance

Alwin · May 14, 2018

hammer78 said:
bridge_ports enp132s0f0.2001

hammer78 said:
bridge_ports enp1s0f1.2002

Both nodes are using different VLANs. Can you access from each node the other by ssh via name & IP?

hammer78 · May 14, 2018

yes of course

all tests were made with that settings

ping and ssh is working with IP's and names

hammer78 · May 14, 2018

here ncat tests

from master:
nc -zvu pve-bobafett 5400-5410

pve-bobafett [10.88.xxx.yyy] 5405 (?) open

pve-bobafett [10.88.xxx.yyy] 5404 (?) open

from secoundary

nc -zvu pve-dooku 5400-5410

pve-dooku [10.88.xxx.xxx] 5405 (?) open

pve-dooku [10.88.xxx.xxx] 5404 (?) open

Alwin · May 14, 2018

hammer78 said:
pvecm updatecerts

Try the update with '-force'.

And what are the logs telling (syslog/journal)?

hammer78 · May 14, 2018

Syslog when running pvecm updatecerts -force:

May 14 15:20:00 DOOKU systemd[1]: Starting Proxmox VE replication runner...
May 14 15:20:01 DOOKU systemd[1]: Started Proxmox VE replication runner.
May 14 15:20:36 DOOKU pvedaemon[20235]: starting termproxy UPID

OOKU:00004F0B:000911A8:5AF98D24:vncshell::root@pam:
May 14 15:20:36 DOOKU pvedaemon[1941]: <root@pam> starting task UPID

OOKU:00004F0B:000911A8:5AF98D24:vncshell::root@pam:
May 14 15:20:37 DOOKU pvedaemon[1942]: <root@pam> successful auth for user 'root@pam'
May 14 15:20:37 DOOKU login[20241]: pam_unix(login:session): session opened for user root by root(uid=0)
May 14 15:20:37 DOOKU systemd-logind[997]: New session 6 of user root.
May 14 15:20:37 DOOKU systemd[1]: Started Session 6 of user root.
May 14 15:20:37 DOOKU login[20246]: ROOT LOGIN on '/dev/pts/1'
May 14 15:20:45 DOOKU systemd-logind[997]: Removed session 6.
May 14 15:20:45 DOOKU pvedaemon[1941]: <root@pam> end task UPID

OOKU:00004F0B:000911A8:5AF98D24:vncshell::root@pam: OK
May 14 15:21:00 DOOKU systemd[1]: Starting Proxmox VE replication runner...
May 14 15:21:01 DOOKU systemd[1]: Started Proxmox VE replication runner.

hammer78 · May 14, 2018

but found some of those lines after trying to access to "summary" GUI part of another node:

pveproxy[20564]: proxy detected vanished client connection

Alwin · May 15, 2018

For the different VLANs, are you routing the traffic from those interfaces? What does 'pvecm status' say?

hammer78 · May 15, 2018

pvecm status

Quorum information

------------------

Date: Tue May 15 12:25:25 2018

Quorum provider: corosync_votequorum

Nodes: 2

Node ID: 0x00000001

Ring ID: 1/60

Quorate: Yes

Votequorum information

----------------------

Expected votes: 2

Highest expected: 2

Total votes: 2

Quorum: 2

Flags: Quorate

Membership information

----------------------

Nodeid Votes Name

0x00000001 1 10.88.xxx.xxx (local)

0x00000002 1 10.88.xxx.yyy

And no, i'm not routing them. My hosting provider do it for private network between nodes.

I also try with the same VLAN ID, but problem still the same...

Alwin · May 15, 2018

hammer78 said:
My hosting provider do it for private network between nodes.

So you don't have the network under control. There can be anything, from blocked ports, to latency issues. Whatever it is, this is up to the providers configuration.

hammer78 · May 15, 2018

hi as you can see into exported files, all ports are open and avialable on both nodes

any idea

Search

Search

Impossible to see cluster's nodes in GUI

hammer78

Member

Alwin

Proxmox Retired Staff

hammer78

Member

hammer78

Member

Alwin

Proxmox Retired Staff

hammer78

Member

hammer78

Member

Alwin

Proxmox Retired Staff

hammer78

Member

Alwin

Proxmox Retired Staff

hammer78

Member

Attachments

We value your privacy