Cluster created: No access to node and no cluster join possible

drflo

New Member
Aug 6, 2020
Hi guys,

I created a new cluster on v6.2 with a separate cluster network, which sits in its own VLAN. Creating the cluster with two nodes succeeded, and I can see both nodes in the web interface with a green checkmark.

The problem now is that I can no longer add another node via the GUI. In addition, every time I try to view information about the second node in the UI, I get connection errors. The second node is also no longer reachable via its own web interface.

[Two screenshots attached: Bildschirmfoto 2020-08-06 um 10.20.19.png, Bildschirmfoto 2020-08-06 um 10.24.07.png]

Ping works between the two nodes, as does shell access from node1 to node2.

pvecm status
Code:
Cluster information
-------------------
Name:             XXX
Config Version:   2
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Thu Aug  6 10:21:53 2020
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000001
Ring ID:          1.43
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      2
Quorum:           2 
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.XX.XX.XX (local)
0x00000002          1 10.XX.XX.XX

Would appreciate any help. Thanks a lot.

All the best,
Florian
 
does SSH access work between both nodes (in both directions)? is the firewall active?
 
Yes, I have shell access to both nodes via the GUI (from node1). Within the shell I can ping in both directions. No firewall is configured on node2 (brand-new installation). Node1 has a firewall rule for PVE GUI access, but I have already disabled it for testing.
 
I really meant SSH access from node A to node B and vice-versa
 
Okay, sorry. I have now tried it, and yes, it works with

Code:
ssh root@pve003

and vice-versa.
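
For anyone repeating this check: a minimal sketch of a non-interactive SSH test, assuming root logins between the nodes as in the command above. The function names `check_ssh` and `report_ssh` are made up for illustration. `BatchMode=yes` makes ssh fail instead of falling back to a password prompt, which is closer to how the cluster tooling connects than an interactive login.

```shell
# check_ssh NODE: returns 0 if a non-interactive root login to NODE succeeds.
# BatchMode=yes disables password prompts, so only key-based auth counts.
check_ssh() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$1" true
}

# report_ssh NODE...: prints one "ok" or "FAILED" line per node.
report_ssh() {
    for node in "$@"; do
        if check_ssh "$node"; then
            echo "$node: ok"
        else
            echo "$node: FAILED"
        fi
    done
}
```

Run `report_ssh pve003` on pve001 and `report_ssh pve001` on pve003 to cover both directions.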
 
anything in the pveproxy logs (journalctl -u pveproxy)?
 
node1 looks good I think

Code:
-- Logs begin at Thu 2020-08-06 09:54:37 CEST, end at Thu 2020-08-06 16:04:46 CEST. --
Aug 06 09:54:43 pve001 systemd[1]: Starting PVE API Proxy Server...
Aug 06 09:54:45 pve001 pveproxy[1203]: starting server
Aug 06 09:54:45 pve001 pveproxy[1203]: starting 3 worker(s)
Aug 06 09:54:45 pve001 pveproxy[1203]: worker 1204 started
Aug 06 09:54:45 pve001 pveproxy[1203]: worker 1205 started
Aug 06 09:54:45 pve001 pveproxy[1203]: worker 1206 started
Aug 06 09:54:45 pve001 systemd[1]: Started PVE API Proxy Server.
Aug 06 10:11:54 pve001 pveproxy[1205]: '/etc/pve/nodes/pve003/pve-ssl.pem' does not exist!
Aug 06 10:23:16 pve001 pveproxy[1204]: '/etc/pve/nodes/pve003/pve-ssl.pem' does not exist!
Aug 06 10:23:19 pve001 pveproxy[1206]: '/etc/pve/nodes/pve003/pve-ssl.pem' does not exist!
Aug 06 10:23:21 pve001 pveproxy[1206]: '/etc/pve/nodes/pve003/pve-ssl.pem' does not exist!
Aug 06 10:23:27 pve001 pveproxy[1206]: '/etc/pve/nodes/pve003/pve-ssl.pem' does not exist!
Aug 06 10:23:27 pve001 pveproxy[1206]: '/etc/pve/nodes/pve003/pve-ssl.pem' does not exist!
Aug 06 10:23:46 pve001 pveproxy[1204]: proxy detected vanished client connection
Aug 06 10:23:49 pve001 pveproxy[1206]: proxy detected vanished client connection
Aug 06 10:23:51 pve001 pveproxy[1206]: proxy detected vanished client connection
Aug 06 10:23:57 pve001 pveproxy[1206]: proxy detected vanished client connection
Aug 06 10:23:57 pve001 pveproxy[1206]: proxy detected vanished client connection
Aug 06 10:24:52 pve001 pveproxy[1206]: '/etc/pve/nodes/pve003/pve-ssl.pem' does not exist!
Aug 06 10:25:22 pve001 pveproxy[1206]: proxy detected vanished client connection
Aug 06 10:27:10 pve001 pveproxy[1204]: '/etc/pve/nodes/pve003/pve-ssl.pem' does not exist!
Aug 06 10:30:05 pve001 pveproxy[1204]: '/etc/pve/nodes/pve003/pve-ssl.pem' does not exist!
Aug 06 10:31:44 pve001 pveproxy[1205]: '/etc/pve/nodes/pve003/pve-ssl.pem' does not exist!
Aug 06 10:33:21 pve001 pveproxy[1204]: '/etc/pve/nodes/pve003/pve-ssl.pem' does not exist!
Aug 06 10:34:59 pve001 pveproxy[1204]: '/etc/pve/nodes/pve003/pve-ssl.pem' does not exist!
Aug 06 10:36:43 pve001 pveproxy[1206]: '/etc/pve/nodes/pve003/pve-ssl.pem' does not exist!
Aug 06 10:38:30 pve001 pveproxy[1204]: '/etc/pve/nodes/pve003/pve-ssl.pem' does not exist!
Aug 06 10:40:32 pve001 pveproxy[1204]: '/etc/pve/nodes/pve003/pve-ssl.pem' does not exist!
Aug 06 10:42:12 pve001 pveproxy[1206]: '/etc/pve/nodes/pve003/pve-ssl.pem' does not exist!
Aug 06 10:43:50 pve001 pveproxy[1205]: '/etc/pve/nodes/pve003/pve-ssl.pem' does not exist!
Aug 06 10:45:31 pve001 pveproxy[1206]: '/etc/pve/nodes/pve003/pve-ssl.pem' does not exist!
Aug 06 10:48:29 pve001 pveproxy[1204]: '/etc/pve/nodes/pve003/pve-ssl.pem' does not exist!
Aug 06 10:48:59 pve001 pveproxy[1204]: proxy detected vanished client connection

But you are right ... node2 does not look healthy.

Code:
Aug 06 09:46:17 pve003 pveproxy[1768]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl
Aug 06 09:46:17 pve003 pveproxy[1769]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl
Aug 06 09:46:21 pve003 pveproxy[1767]: worker exit
Aug 06 09:46:22 pve003 pveproxy[1208]: worker 1767 finished
Aug 06 09:46:22 pve003 pveproxy[1208]: starting 1 worker(s)
Aug 06 09:46:22 pve003 pveproxy[1208]: worker 1770 started
Aug 06 09:46:22 pve003 pveproxy[1768]: worker exit
Aug 06 09:46:22 pve003 pveproxy[1770]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl
Aug 06 09:46:22 pve003 pveproxy[1769]: worker exit
Aug 06 09:46:22 pve003 pveproxy[1208]: worker 1768 finished
Aug 06 09:46:22 pve003 pveproxy[1208]: starting 1 worker(s)
Aug 06 09:46:22 pve003 pveproxy[1208]: worker 1771 started
Aug 06 09:46:22 pve003 pveproxy[1208]: worker 1769 finished
Aug 06 09:46:22 pve003 pveproxy[1208]: starting 1 worker(s)
Aug 06 09:46:22 pve003 pveproxy[1208]: worker 1772 started
Aug 06 09:46:22 pve003 pveproxy[1771]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl
Aug 06 09:46:22 pve003 pveproxy[1772]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl
Aug 06 09:46:27 pve003 pveproxy[1770]: worker exit
Aug 06 09:46:27 pve003 pveproxy[1208]: worker 1770 finished
Aug 06 09:46:27 pve003 pveproxy[1208]: starting 1 worker(s)
Aug 06 09:46:27 pve003 pveproxy[1208]: worker 1811 started
Aug 06 09:46:27 pve003 pveproxy[1771]: worker exit
Aug 06 09:46:27 pve003 pveproxy[1811]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl
Aug 06 09:46:27 pve003 pveproxy[1772]: worker exit
Aug 06 09:46:27 pve003 pveproxy[1208]: worker 1771 finished
Aug 06 09:46:27 pve003 pveproxy[1208]: starting 1 worker(s)
Aug 06 09:46:27 pve003 pveproxy[1208]: worker 1812 started
Aug 06 09:46:27 pve003 pveproxy[1208]: worker 1772 finished
Aug 06 09:46:27 pve003 pveproxy[1208]: starting 1 worker(s)
Aug 06 09:46:27 pve003 pveproxy[1208]: worker 1813 started
Aug 06 09:46:27 pve003 pveproxy[1812]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl
Aug 06 09:46:27 pve003 pveproxy[1813]: /etc/pve/local/pve-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl
Aug 06 09:46:32 pve003 pveproxy[1811]: worker exit
Aug 06 09:46:32 pve003 pveproxy[1208]: worker 1811 finished
Aug 06 09:46:32 pve003 pveproxy[1208]: starting 1 worker(s)
Aug 06 09:46:32 pve003 pveproxy[1208]: worker 1814 started

how can I fix this? I really appreciate your help.
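
Since the worker restart loop makes the journal hard to read, here is a rough sketch that collapses it into one line per distinct message with a count. The function name `summarize_pveproxy` is hypothetical, and the sed pattern assumes the `pveproxy[PID]: ` prefix seen in the logs above.

```shell
# summarize_pveproxy: reads journal lines on stdin, strips everything up to
# and including the "pveproxy[PID]: " prefix, replaces worker PIDs with "N",
# and prints each distinct message as "<count>x <message>", most frequent first.
summarize_pveproxy() {
    sed -E -e 's/^.*pveproxy\[[0-9]+\]: //' -e 's/worker [0-9]+/worker N/' \
        | sort | uniq -c | sort -rn \
        | awk '{ n = $1; $1 = ""; sub(/^ /, ""); print n "x " $0 }'
}
```

Usage on the affected node would be something like `journalctl -u pveproxy --no-pager | summarize_pveproxy`.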
 
Interesting. I copied the key files from /etc/pve/nodes/pve001 to /etc/pve/nodes/pve003/ and now it works.

Is this correct, or does it only work by accident?
 
'pvecm updatecerts -f' on pve003 will regenerate them properly.
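
To confirm the regeneration worked, one could check that the two files named in the pveproxy logs exist afterwards. This is only a sketch: the function name `check_node_certs` is made up, and the default directory `/etc/pve/local` is taken from the error messages above.

```shell
# check_node_certs [DIR]: verifies that DIR contains non-empty pve-ssl.pem
# and pve-ssl.key files. DIR defaults to /etc/pve/local (path from the logs).
# Returns 0 if both files are present, 1 otherwise.
check_node_certs() {
    dir=${1:-/etc/pve/local}
    status=0
    for f in pve-ssl.pem pve-ssl.key; do
        if [ -s "$dir/$f" ]; then
            echo "$f: present"
        else
            echo "$f: MISSING"
            status=1
        fi
    done
    return "$status"
}
```

On pve003 this could be run as `pvecm updatecerts -f && check_node_certs`. Restarting the proxy afterwards with `systemctl restart pveproxy` is my own assumption, not something stated in this thread.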