[SOLVED] Can't connect to destination address using public key TASK ERROR: migration aborted

Mikepop

I've seen other posts related to this issue, but I cannot see any clear solution.
root@int102:~# /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=int103' root@10.10.10.103 /bin/true
Host key verification failed.
root@int102:~# ssh 10.10.10.103
Linux int103 4.13.13-6-pve #1 SMP PVE 4.13.13-42 (Fri, 9 Mar 2018 11:55:18 +0100) x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Thu Mar 22 14:08:43 2018

I have quorum and everything works except migration. I'm confused about the relationship between /root/.ssh/known_hosts, /etc/pve/priv/known_hosts and /etc/ssh/ssh_known_hosts and how to sync them.

Regards
 
Hi,

Are you on the latest package versions?

If so could you please run:
Code:
pvecm updatecerts
on both nodes?

If that does not help, additionally try to connect manually between the two nodes:
Code:
ssh -o "HostKeyAlias=NODENAME" root@NODEIP

Replace NODENAME/IP with the respective target node.

I'm confused about the relationship between /root/.ssh/known_hosts, /etc/pve/priv/known_hosts and /etc/ssh/ssh_known_hosts and how to sync them.

We track the known cluster nodes over our shared cluster filesystem.

So we link a few seemingly local configuration files into the cluster file system:
Code:
/etc/ssh/ssh_known_hosts -> /etc/pve/priv/known_hosts
/root/.ssh/authorized_keys -> /etc/pve/priv/authorized_keys

The first of the above is for tracking whether we connect to a legitimate node, i.e. no MITM or other tampering; we normally ensure that our own correct key gets synced there on cluster join and on cluster filesystem start (i.e. every boot or cluster update). The latter is for knowing who is allowed to access us via public key authentication.
It looks like in your case the first step somehow failed, and thus the nodes do not trust ("know") each other on an SSH level; pvecm updatecerts should fix this.
(Sorry if over-explained, but maybe someone else finds this helpful someday too.)
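If you want to verify that those links are in place on a node, listing them should show where they point (just a quick check of the layout described above):
Code:
ls -l /etc/ssh/ssh_known_hosts /root/.ssh/authorized_keys
# both should be symlinks into /etc/pve/priv/ as listed above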

Edit:
Oh, and we now use HostKeyAlias (the node name, not its IP), as you see in my proposed command above, to avoid running into problems if a node's IP changes or if a new network is added.
 
Thanks for the detailed answer Thomas, but pvecm updatecerts did not solve the issue. On all nodes:

root@int102:~# pvecm updatecerts
root@int102:~# /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=int103' root@10.10.10.103 /bin/true
Host key verification failed.

root@int102:~# ssh -o "HostKeyAlias=103" root@10.10.10.103
The authenticity of host '103 (10.10.10.103)' can't be established.
ECDSA key fingerprint is SHA256:OuSwK1+NwPw1XrL9la0MswUuvEvQGGAPmOFP0k/B1Vs.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '103' (ECDSA) to the list of known hosts.
Linux int103 4.13.13-6-pve #1 SMP PVE 4.13.13-42 (Fri, 9 Mar 2018 11:55:18 +0100) x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Fri Mar 23 08:30:06 2018


root@int102:~# /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=int103' root@10.10.10.103 /bin/true
Host key verification failed.

Regards
 
What helped me in debugging and finding the file responsible for the rejection, including the line number:
As root on the source machine of the migration, run:
/usr/bin/ssh -v -v -v -e none -o "HostKeyAlias=$hostname" root@$hostip
ssh will show you the offending file/line. Once you remove that line, everything should resolve itself.
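If the offending entry is in one of the known_hosts files, it can also be removed by host name or IP instead of editing the file by hand (a sketch; int103 and 10.10.10.103 are just the examples from this thread):
Code:
# ssh-keygen -R removes all entries for the given host from the named file
ssh-keygen -f /root/.ssh/known_hosts -R int103
ssh-keygen -f /etc/pve/priv/known_hosts -R 10.10.10.103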
 
I am also having the same issue where I can no longer migrate.

2023-07-23 16:43:14 # /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve-pinky' root@172.2.2.30 /bin/true
2023-07-23 16:43:14 root@172.2.2.30: Permission denied (publickey,password).
2023-07-23 16:43:14 ERROR: migration aborted (duration 00:00:00): Can't connect to destination address using public key
TASK ERROR: migration aborted

When I run
ssh -o "HostKeyAlias=NODENAME" root@NODEIP
on both systems, I'm required to enter the root password (only required one way).

Obviously the issue is that one node isn't able to SSH into the other password-free.
Not sure how to fix this, though. I've run pvecm updatecerts on both systems.

Edit: I was able to solve the issue thanks to this post: https://forum.proxmox.com/threads/cannot-migrate-from-one-node-to-another.60431/

log into pve1
cd .ssh
mv id_rsa id_rsa.old
mv id_rsa.pub id_rsa.pub.old
mv config config.old

log into pve2
cd .ssh
mv id_rsa id_rsa.old
mv id_rsa.pub id_rsa.pub.old
mv config config.old
pvecm updatecerts

back into pve1
pvecm updatecerts
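Afterwards, the same check the migration task runs can be repeated by hand to confirm key-based login works again (NODENAME/NODEIP are placeholders for the target node):
Code:
/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=NODENAME' root@NODEIP /bin/true && echo OK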
 
I was able to solve the issue thanks to this post

Thanks for your post. It worked except for a GUI problem (connection error when managing other nodes), so I had to add a service restart.

In case someone else has the same issue, I ran these commands on all my 3 nodes:

Code:
cd /root/.ssh
mv id_rsa id_rsa.old
mv id_rsa.pub id_rsa.pub.old
mv config config.old
pvecm updatecerts -f
systemctl restart pvedaemon pveproxy pve-cluster
 
Hello.
I read this thread and a lot of other ones; I have the same issue but I couldn't fix it.
I have 2 nodes in a cluster, linked with a dedicated (private) network. Suddenly, migration broke. It worked before and I don't know what I did to break it.
I also tried rebooting the nodes; nothing is working.
Any idea?
Thanks ;)
 
Unless you provide new information about your use case, I can only recommend you shut down both systems, then start both systems.

Follow the fix exactly as I laid out in a previous post:

log into pve1

cd .ssh
mv id_rsa id_rsa.old
mv id_rsa.pub id_rsa.pub.old
mv config config.old

log into pve2

cd .ssh
mv id_rsa id_rsa.old
mv id_rsa.pub id_rsa.pub.old
mv config config.old
pvecm updatecerts

back into pve1
pvecm updatecerts
 
Hello.
I just did it again and still have the issue.
I have a cluster of 2 PVE nodes, linked with a dedicated network for the cluster and a public network for access.
Everything was OK and suddenly it broke. I don't know what I did wrong...
When I log in to either of the two servers, I can see both machines in the cluster and access them fine. It seems it's only a migration issue.

Here is the log :
2024-01-27 16:07:04 # /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve' root@10.10.10.12 /bin/true
2024-01-27 16:07:04 Connection closed by 10.10.10.12 port 22
2024-01-27 16:07:04 ERROR: migration aborted (duration 00:02:00): Can't connect to destination address using public key
TASK ERROR: migration aborted

Please, tell me if you need more information.

Edit:
I just changed the migration setting to use the public network and it worked! Maybe I have an issue with the network config for the private net?
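For anyone who wants to keep migrating over the private network, repeating the migration pre-check by hand with verbose output over that network may show where it fails (a sketch, using the IP and alias from the log above):
Code:
/usr/bin/ssh -vvv -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve' root@10.10.10.12 /bin/true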
 
Just ran into the same issue. I can confirm I had to run these commands on all my nodes to make it work seamlessly.

Step 1:

Bash:
# Run on all nodes:
cd /root/.ssh
mv id_rsa id_rsa.old
mv id_rsa.pub id_rsa.pub.old
mv config config.old

Step 2:

Bash:
# Run on all nodes
pvecm updatecerts

I didn't have to restart my pvedaemon, pveproxy, and pve-cluster services, but if you're still running into issues, you might as well run:

Bash:
systemctl restart pvedaemon pveproxy pve-cluster
 
Check your version on every node.
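For example, running pveversion on every node and comparing the output shows whether the PVE packages and kernel match (pveversion -v prints the full package list):
Code:
pveversion -v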

I’m sure it is preferred, but is it critical that all nodes are on the exact same version for migration to function?

Or is it enough for the major versions to be the same?

I’m almost thinking it is more important to pin the kernel so the host Linux Kernel is the same across nodes.

But honestly I don’t know the answer to either.
 
I ran the following but was still having issues:
Step 1:

Bash:
# Run on all nodes:
cd /root/.ssh
mv id_rsa id_rsa.old
mv id_rsa.pub id_rsa.pub.old
mv config config.old
Step 2:

Bash:
# Run on all nodes
pvecm updatecerts

Until I ran this also from each node, connecting back to each node:
Code:
ssh -o "HostKeyAlias=NODENAME" root@NODEIP

For me, it seems that the HostKeyAlias is not being set when the cluster was created and/or when running pvecm updatecerts.
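As a workaround, those manual connections can be scripted so each node records every other node's key under its HostKeyAlias (a sketch; the node names and IPs are placeholders for your own cluster):
Bash:
# run on every node; adjust the name/IP pairs to your cluster
for entry in "pve1 192.168.2.10" "pve2 192.168.2.20" "pve3 192.168.2.30"; do
    set -- $entry   # split into node name ($1) and IP ($2)
    ssh -o "HostKeyAlias=$1" -o StrictHostKeyChecking=accept-new "root@$2" /bin/true
done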

For info:
The node I created the cluster from:
pve-manager/8.1.10/4b06efb5db453f29
Linux 6.5.13-5-pve

The other two nodes:
pve-manager/8.2.2/9355359cd7afbae4
Linux 6.8.4-3-pve <-although I am going to pin down to 6.5.13-3 for iGPU passthrough

My nodes are set up for management on 192.168.2.0/24, and I have Corosync running on 192.168.3.0/24. So I'm not sure whether my hostnames being on 192.168.2.0/24 in the /etc/hosts files had anything to do with the SSL stuff not getting set up correctly when I created the cluster in the GUI.
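One way to check which address a node name resolves to on a given node (and therefore which network is used when connecting by name) is getent, which consults /etc/hosts and DNS through the system resolver (NODENAME is a placeholder):
Code:
getent hosts NODENAME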
 
Well, maybe not so lucky. Containers seem to migrate after doing the above, but when they start they are not happy and warn about host key verification failures.

Code:
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ED25519 key sent by the remote host is
SHA256:AYgXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.
Please contact your system administrator.
Add correct host key in /root/.ssh/known_hosts to get rid of this message.
Offending ECDSA key in /root/.ssh/known_hosts:4
  remove with:
  ssh-keygen -f "/root/.ssh/known_hosts" -R "192.168.2.20"
Host key for 192.168.2.20 has changed and you have requested strict checking.
Host key verification failed.

The only way I got a migrated container to start on other nodes was to run ssh-keygen -f "/root/.ssh/known_hosts" -R "192.168.2.20" on the original node, and then SSH into each node, from each node, on both the 192.168.2.0/24 and 192.168.3.0/24 networks.
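If other nodes still warn about a changed key, the same cleanup can be scripted for every stale address on each node (a sketch; 192.168.2.20 is the address from the warning above, and 192.168.3.20 is assumed to be the same node's address on the Corosync network):
Code:
# remove stale host keys for both networks from the local known_hosts
for ip in 192.168.2.20 192.168.3.20; do
    ssh-keygen -f /root/.ssh/known_hosts -R "$ip"
done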
 
For what it's worth, I had this EXACT problem. Nothing seemed to help: updating the certs, checking hosts files, nothing at all. I could resolve names and ping, but the moment I tried to SSH to test, it would just hang.

When I added the -v switch to ssh, I saw an error that helped me find the cause. When I would migrate a VM between certain nodes, it was basically complaining about certificates not being able to be used to log in during the migration.

I dug into known_hosts and a bunch of other stuff in the process... turns out there is an sshd bug. Apparently, jumbo frames can cause the SSH key exchange to fail... crazy, huh?

Removed the "mtu 9000" from all my interfaces and it worked! (This was probably overkill; I probably could have removed it from just the interface that Proxmox communicates on.)

Hope this helps someone and saves them the headache I was having/fighting!
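A quick way to test whether jumbo frames actually make it between two nodes is a do-not-fragment ping sized for a 9000-byte MTU (8972 bytes of ICMP payload plus 28 bytes of IP/ICMP headers); if this fails while a normal ping works, an MTU mismatch like the one described above is a likely cause (the IP is a placeholder):
Code:
ping -M do -s 8972 -c 3 10.10.10.103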
 
