Adding 6.2 nodes to 6.0 cluster - Error 500 Can't connect to x.x.x.x:8006 (hostname verification failed)

Dan S
Nov 12, 2020
Hi.

I'm adding new nodes to a 6.0 Proxmox + Ceph cluster.
The new nodes are on version 6.2.
The idea is to add the new 6.2 nodes to the existing 6.0 cluster, then upgrade the 6.0 nodes to 6.2, and finally move to version 6.3.

But when I try to add a new node (v6.2) to the 6.0 cluster I get a verification error (I suppose it is related to the certificate):

Bash:
root@newnode3:~# pvecm add x.x.x.4
Please enter superuser (root) password for 'x.x.x.4': ********
Establishing API connection with host 'x.x.x.4'
The authenticity of host 'x.x.x.4' can't be established.
X509 SHA256 key fingerprint is XX:XX:XX:XX:XX:XX:XX:XX..........................
Are you sure you want to continue connecting (yes/no)? yes
500 Can't connect to x.x.x.4:8006 (hostname verification failed)

/etc/hosts has the correct name/IP entries, and it's the same on all the hosts (the running ones and the ones to be added).
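A quick way to double-check that from the joining node (node1/node2/node3 below are placeholders for your actual node names; getent consults /etc/hosts first under the default NSS lookup order):

```shell
# Placeholders: replace node1 node2 node3 with your cluster's node names.
for node in node1 node2 node3; do
    if getent hosts "$node" >/dev/null; then
        echo "$node: resolves to $(getent hosts "$node" | awk '{print $1}')"
    else
        echo "$node: does NOT resolve (check /etc/hosts)"
    fi
done
```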

Should I add it with the --use_ssh option to skip the verification?

pvecm add x.x.x.4 --use_ssh 1

What should I do to work around the problem and add the new nodes to the cluster?

Thanks in advance.

Daniel.
 
Hi, I have the same problem. Two nodes in the cluster on the latest enterprise update, and a third cannot be added via the CLI. Adding it via the GUI breaks the cluster...
Any ideas? Is this safe?

Code:
pvecm add x.x.x.4 --use_ssh 1
 
I have also tried running this on the working cluster, and even on the new node, but it didn't help:

Code:
pvecm updatecerts
 
UPDATE:

I solved the problem by adding all the servers in the cluster to /etc/hosts on the new node; after that I was able to add the new node to the cluster by hostname:

Code:
pvecm add hostname-no-ip-of-some-working-node
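For illustration, the /etc/hosts on the joining node would then list every existing cluster member plus itself (all names and addresses below are made-up placeholders):

```
192.168.0.10 pve1.mydomain.local pve1
192.168.0.11 pve2.mydomain.local pve2
192.168.0.12 pve3.mydomain.local pve3
```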
 

I already have /etc/hosts on all the servers reflecting the names and IPs.
So you added it using the name of a server already in the cluster, as listed in /etc/hosts, instead of the IP as usual, right?

So if the existing servers in /etc/hosts are:

1.1.1.10 server10
1.1.1.11 newserver11

you ran this from the new server (1.1.1.11):

Code:
pvecm add server10

instead of

Code:
pvecm add 1.1.1.10

And that solved the problem. Is that right?

Thanks a lot.

Daniel.
 
I tried that and it didn't work for me. Is there something else I can do?
It worked with no problems.

Have you added the hostnames to /etc/hosts on every node, and added the new node from the command line with "pvecm add ip.actual.node" run on the new node?

Have you checked that you can log in with SSH from the new node to the existing ones?
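One way to script that check from the new node (node names are placeholders; BatchMode makes ssh fail immediately instead of prompting for a password, which matches how pvecm uses SSH):

```shell
# Placeholders: replace with the names of your existing cluster nodes.
for node in node1 node2 node3; do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$node" true 2>/dev/null; then
        echo "$node: SSH OK"
    else
        echo "$node: SSH FAILED"
    fi
done
```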
 
This bug does not exist anymore if your system is up to date... So my questions are:

1. Does /etc/hosts have the correct values on all hosts?
2. Are all hosts up to date?
 
I don't have all the hosts in /etc/hosts on all hosts; I just added them to the node I want to join.
The cluster is on 6.2-12, while the new node is on 6.2-4.
Previously I tried the new node on 6.3-1, but that didn't work, so I downgraded to the latest 6.2 I could find.

Indeed, SSH may be the problem:


Code:
root@proxmox-virtual:~# ssh -vv virtual
OpenSSH_7.9p1 Debian-10+deb10u2, OpenSSL 1.1.1d  10 Sep 2019
debug1: Reading configuration data /root/.ssh/config
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: Applying options for *
debug2: resolving "virtual" port 22
debug2: ssh_connect_direct
debug1: Connecting to virtual [192.168.0.6] port 22.
debug1: Connection established.
debug1: identity file /root/.ssh/id_rsa type 0
debug1: identity file /root/.ssh/id_rsa-cert type -1
debug1: identity file /root/.ssh/id_dsa type -1
debug1: identity file /root/.ssh/id_dsa-cert type -1
debug1: identity file /root/.ssh/id_ecdsa type -1
debug1: identity file /root/.ssh/id_ecdsa-cert type -1
debug1: identity file /root/.ssh/id_ed25519 type -1
debug1: identity file /root/.ssh/id_ed25519-cert type -1
debug1: identity file /root/.ssh/id_xmss type -1
debug1: identity file /root/.ssh/id_xmss-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_7.9p1 Debian-10+deb10u2


That is from the new node that I want to join, connecting to the cluster's original node.
 
Guys, has anybody managed to address this? I have exactly the same issue, trying to add 2 more nodes with the same version (6.4-13) and getting:
root@pve6:~# pvecm add pve1
Please enter superuser (root) password for 'pve1': *******************
Establishing API connection with host 'pve1'
500 Can't connect to pve1:8006 (hostname verification failed)

I used "pvecm add pve1" previously and it worked fine (the current cluster contains 4 nodes).


/etc/hosts contains all the correct records on ALL nodes (old and new). I tested SSH from all nodes to all nodes - all good. The network is stable.
 
Guys, I've figured this out - it's because of the Let's Encrypt SSL certificate installed on the cluster node, so you need to use the hostname covered by the certificate (in my case it's pve1.mydomain.com):

Bash:
root@pve6:~# pvecm add pve1.mydomain.com
Please enter superuser (root) password for 'pve1.mydomain.com': *******************
Establishing API connection with host 'pve1.mydomain.com'
Login succeeded.
check cluster join API version
No cluster network links passed explicitly, fallback to local node IP '192.168.200.246'
Request addition of this node
Join request OK, finishing setup locally
stopping pve-cluster service
backup old database to '/var/lib/pve-cluster/backup/config-1632474152.sql.gz'
waiting for quorum...OK
(re)generate node files
generate new node certificate
merge authorized SSH keys and known hosts
generated new node certificate, restart pveproxy and pvedaemon services
successfully added node 'pve6' to cluster.

hope someone will find it useful ;)
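A way to check which names a node's certificate actually covers before joining (pve1.mydomain.com is the placeholder name from the post above; `-ext subjectAltName` needs OpenSSL 1.1.1 or newer):

```shell
# Print the Subject Alternative Names of the certificate served on :8006.
# Only names listed there will pass pvecm's hostname verification.
echo | openssl s_client -connect pve1.mydomain.com:8006 2>/dev/null \
  | openssl x509 -noout -ext subjectAltName \
  || echo "could not fetch certificate"
```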
 
pvecm add node1.domain.com -force

Using the IP of the node, I got errors. (Proxmox 6.4-13)

It worked for me with the hostname!
Thanks
 
Hi,
I had the same issue.
All nodes are in /etc/hosts with the correct IPs (in my case, a public IP with Let's Encrypt certificates and a private IP for the quorum in a dedicated VLAN).
Using the short name to join the cluster failed.
Using the FQDN resolved the issue, and my new node is now part of the cluster.
Thanks for this post.
 
This still exists on PVE 7.2 when adding a node via the CLI and an IP address.

Please enter superuser (root) password for '10.2.121.212': ********
Establishing API connection with host '10.2.121.212'
The authenticity of host '10.2.121.212' can't be established.
X509 SHA256 key fingerprint is 20:A0:2F:64:AC:03:6B:1B:1F:4D:5D:55:E4:9B:7B:B1:7F:A9:3B:27:F4:1A:C6:35:B2:1C:58:DE:A8:E6:A6:F9.
Are you sure you want to continue connecting (yes/no)?
TASK ERROR: 500 Can't connect to 10.2.121.212:8006 (hostname verification failed)

pvecm add hostname works without problems.
 
Hello,
I found a solution to this issue. I was having a similar problem, where I was given error 500. It's actually due to a combination of things, bulleted below:

  • Duplicate hostnames - joining two nodes with the same hostname will not work. You must change one of the hostnames by editing /etc/hostname and /etc/hosts (e.g. with 'nano /etc/hostname' and 'nano /etc/hosts'; in nano, Ctrl+O saves and Ctrl+X exits).
  • While joining a node to a cluster, the assisted setup fills in an IP address by default. This will be a gateway IP. You specifically want the IP address of the master node you're attempting to connect to, so use that instead. I overlooked this for a while, since it was being filled in for me.

After fixing the above, I was able to add my second node to the cluster I had created. If you notice your newly added node's interface is in a "hanging" state, refresh and log in again. As long as all nodes run the same version of Proxmox, you won't need to update SSH versions or any of that. Thank you for helping me get there!

Sources: https://pve.proxmox.com/wiki/Renaming_a_PVE_node
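A non-interactive sketch of the first bullet's /etc/hosts edit ("oldname"/"newname" are placeholders; the HOSTS variable is just a convenience so you can point it at a copy of /etc/hosts to preview the change first):

```shell
# Replace every whole-word occurrence of the old hostname with the new one.
# Also update /etc/hostname (e.g. via hostnamectl) and reboot afterwards.
HOSTS=${HOSTS:-/etc/hosts}
sed -i 's/\boldname\b/newname/g' "$HOSTS"
```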
 
I have the same issue. I'm using the procedure from section 5.4.3, "Adding Nodes with Separated Cluster Network". The hosts files are correct on all nodes, I can ping the hosts by hostname, and I can connect between them via SSH. What's wrong? I never had such problems with older versions of PVE.

Nodes are up to date with the newest version of PVE.

PS
After a few hours of trying to join one of my nodes to a freshly created cluster, I can say with full certainty that you need to work on more detailed messages. I changed everything I could, and I still can't join my node to the cluster. And the message

500 Can't connect to ip.ip.ip.ip:8006 (hostname verification failed)

tells me literally nothing, especially when I can ping my hosts by hostname and also connect between them via SSH, also by hostname.

I would like to make it clear that on previous (older) versions (around 5-6) I had no such problems with exactly the same base node configuration as now.

In my opinion, there is something wrong with the pvecm command itself.
 
OK.

For everyone who has the same problem and can't solve it: the solution is to use the FQDN instead of the hostname or IP when adding the node to the cluster. I don't know why it works like that, don't ask me, but it finally worked for me.
 
Guys, I've figured this out - it's because of the Let's Encrypt SSL certificate installed on the cluster node, so you need to use the hostname covered by the certificate (in my case it's pve1.mydomain.com).
This was the solution for me; I am also using Let's Encrypt SSL certificates.
 
Hello,

I added the 3 lines below to /etc/hosts on each server.

Like this:

nano /etc/hosts

192.168.200.201 pve1
192.168.200.202 pve2
192.168.200.203 pve3

Save.

I restarted the 3 Proxmox servers (PVE).

I joined the 2nd and 3rd PVE to the first one, where I created the cluster.

Then it worked.

Hope this helps someone!
 
