[SOLVED] Can't connect to destination address using public key TASK ERROR: migration aborted

Mikepop

Well-Known Member
I've seen other posts related to this issue, but I cannot see any clear solution.
root@int102:~# /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=int103' root@10.10.10.103 /bin/true
Host key verification failed.
root@int102:~# ssh 10.10.10.103
Linux int103 4.13.13-6-pve #1 SMP PVE 4.13.13-42 (Fri, 9 Mar 2018 11:55:18 +0100) x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Thu Mar 22 14:08:43 2018

I have quorum and everything works except migration. I'm confused about the relationship between /root/.ssh/known_hosts, /etc/pve/priv/known_hosts and /etc/ssh/ssh_known_hosts, and how to keep them in sync.

Regards
 
Hi,

Are you on the latest package versions?

If so could you please run:
Code:
pvecm updatecerts
on both nodes?

If that does not help, additionally connect manually between the two nodes:
Code:
ssh -o "HostKeyAlias=NODENAME" root@NODEIP

Replace NODENAME/NODEIP with the respective target node's name and IP.
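For example, with the node name and IP from your output above, that would be:
Code:
ssh -o "HostKeyAlias=int103" root@10.10.10.103
If you get an authenticity prompt, answering yes records the key under that alias in root's /root/.ssh/known_hosts.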

I'm confused about the relationship between /root/.ssh/known_hosts, /etc/pve/priv/known_hosts and /etc/ssh/ssh_known_hosts, and how to keep them in sync.

We track the known cluster nodes over our shared cluster filesystem.

So we link a few seemingly local configuration files into the cluster file system:
Code:
/etc/ssh/ssh_known_hosts -> /etc/pve/priv/known_hosts
/root/.ssh/authorized_keys -> /etc/pve/priv/authorized_keys

The first one is for checking that we connect to a legitimate node, i.e. no MITM or other tampering; we normally ensure that our own correct key gets synced there on cluster join and on cluster filesystem start (i.e. every boot or cluster update). The latter is for knowing who is allowed to access us via public key authentication.
It looks like in your case the first step somehow failed, and thus the nodes do not trust ("know") each other on the SSH level; pvecm updatecerts should fix this.
(Sorry if over-explained, but maybe someone else finds this helpful someday too.)
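A quick way to check that those links are actually in place on a node is to look at them directly, e.g.:
Code:
ls -l /etc/ssh/ssh_known_hosts /root/.ssh/authorized_keys
Both should show up as symlinks pointing into /etc/pve/priv/, as listed above.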

Edit:
Oh, and we now use HostKeyAlias (the node name, not its IP), as you see in my proposed command above, to avoid running into problems if the node's IP changes or if a new network is added.
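Since ssh looks the HostKeyAlias up like a host name, the corresponding entry in the shared file should be keyed by the node name; a quick illustrative check, using int103 as the node name here:
Code:
grep '^int103' /etc/pve/priv/known_hosts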
 
Thanks for the detailed answer Thomas, but pvecm updatecerts did not solve the issue on any node:

root@int102:~# pvecm updatecerts
root@int102:~# /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=int103' root@10.10.10.103 /bin/true
Host key verification failed.

root@int102:~# ssh -o "HostKeyAlias=103" root@10.10.10.103
The authenticity of host '103 (10.10.10.103)' can't be established.
ECDSA key fingerprint is SHA256:OuSwK1+NwPw1XrL9la0MswUuvEvQGGAPmOFP0k/B1Vs.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '103' (ECDSA) to the list of known hosts.
Linux int103 4.13.13-6-pve #1 SMP PVE 4.13.13-42 (Fri, 9 Mar 2018 11:55:18 +0100) x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Fri Mar 23 08:30:06 2018


root@int102:~# /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=int103' root@10.10.10.103 /bin/true
Host key verification failed.

Regards
 
What helped me in debugging and finding the file responsible for the rejection, including the line number:
As root on the source machine for the migration, run:
/usr/bin/ssh -v -v -v -e none -o "HostKeyAlias=$hostname" root@$hostip
ssh will show you the offending file and line. Once you remove that line, everything should resolve by itself.
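If you'd rather not edit the file by hand, ssh-keygen can drop all entries for a host from a given file; a sketch, assuming the offending file turned out to be /etc/pve/priv/known_hosts and the alias is int103 (substitute whatever the verbose output actually pointed at):
Code:
ssh-keygen -f /etc/pve/priv/known_hosts -R "int103"
pvecm updatecerts    # let the cluster re-sync the correct key afterwards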
 
I am also having the same issue where I can no longer migrate.

2023-07-23 16:43:14 # /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve-pinky' root@172.2.2.30 /bin/true
2023-07-23 16:43:14 root@172.2.2.30: Permission denied (publickey,password).
2023-07-23 16:43:14 ERROR: migration aborted (duration 00:00:00): Can't connect to destination address using public key
TASK ERROR: migration aborted

When I run
ssh -o "HostKeyAlias=NODENAME" root@NODEIP
on both systems, I'm required to enter the root password (it's only required in one direction).

Obviously this is the issue: one node isn't able to SSH into the other without a password.
Not sure how to fix this, though. I've run "pvecm updatecerts" on both systems.
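To see which direction is broken, the same batch-mode test the migration task runs can be repeated manually from each node against the other (node name and IP taken from my log above; adjust for your setup); BatchMode makes it fail instead of falling back to a password prompt:
Code:
/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve-pinky' root@172.2.2.30 /bin/true && echo OK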

Edit: I was able to solve the issue thanks to this post: https://forum.proxmox.com/threads/cannot-migrate-from-one-node-to-another.60431/

log into pve1
cd .ssh
mv id_rsa id_rsa.old
mv id_rsa.pub id_rsa.pub.old
mv config config.old

log into pve2
cd .ssh
mv id_rsa id_rsa.old
mv id_rsa.pub id_rsa.pub.old
mv config config.old
pvecm updatecerts

back into pve1
pvecm updatecerts
 
I was able to solve the issue thanks to this post

Thanks for your post. It worked except for a GUI problem (connection error when managing other nodes), so I had to add a service restart.

In case someone else has the same issue, I ran these commands on all my 3 nodes:

Code:
cd /root/.ssh
mv id_rsa id_rsa.old
mv id_rsa.pub id_rsa.pub.old
mv config config.old
pvecm updatecerts -f
systemctl restart pvedaemon pveproxy pve-cluster
 
Hello.
I read this thread and a lot of other ones; I have the same issue but I couldn't fix it.
I have 2 nodes in a cluster, linked with a dedicated (private) network. Suddenly, migration broke. It worked before and I don't know what I did to break it.
I also tried to reboot the nodes; nothing is working.
Any idea?
Thanks ;)
 
Unless you provide new information about your use case, I can only recommend you shut down both systems, then start both systems.

Follow the fix exactly as I laid it out in a previous post:

log into pve1

cd .ssh
mv id_rsa id_rsa.old
mv id_rsa.pub id_rsa.pub.old
mv config config.old

log into pve2

cd .ssh
mv id_rsa id_rsa.old
mv id_rsa.pub id_rsa.pub.old
mv config config.old
pvecm updatecerts

back into pve1
pvecm updatecerts
 
Hello.
I just did it again and the issue is still there.
I have a cluster of 2 PVE nodes, linked with a dedicated network for the cluster and a public network for access.
Everything was OK and then it suddenly broke. I don't know what I did wrong...
When I log in to either of the two servers, I can see both machines in the cluster and access them fine. It seems to be only a migration issue.

Here is the log:
2024-01-27 16:07:04 # /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve' root@10.10.10.12 /bin/true
2024-01-27 16:07:04 Connection closed by 10.10.10.12 port 22
2024-01-27 16:07:04 ERROR: migration aborted (duration 00:02:00): Can't connect to destination address using public key
TASK ERROR: migration aborted

Please, tell me if you need more information.

Edit:
I just changed the migration setting to use the public network and it worked! Maybe I have an issue with the network config for the private net?
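For reference, the cluster-wide migration network is set in /etc/pve/datacenter.cfg (the same setting the Datacenter options in the GUI edit); something like this, with the private subnet guessed from the 10.10.10.12 address in my log, pins migrations to the private network:
Code:
migration: secure,network=10.10.10.0/24
Before switching back to it, the key-based SSH test from the migration log should succeed over that network:
Code:
/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve' root@10.10.10.12 /bin/true && echo OK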
 
