Migration issue after replacing a node

hotwired007

Hi guys,

I've replaced one of my servers in my cluster (node1) with a better physical machine. I reused the IP and cabling but gave it a new name (I have done this before with no issues), and there were no error messages when adding it to the cluster. But now, after trying to migrate a VM to it, it's complaining about an RSA key:

Code:
May 30 11:14:02 # /usr/bin/ssh -c blowfish -o 'BatchMode=yes' root@192.168.15.31 /bin/true
May 30 11:14:02 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
May 30 11:14:02 @    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
May 30 11:14:02 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
May 30 11:14:02 IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
May 30 11:14:02 Someone could be eavesdropping on you right now (man-in-the-middle attack)!
May 30 11:14:02 It is also possible that the RSA host key has just been changed.
May 30 11:14:02 The fingerprint for the RSA key sent by the remote host is
May 30 11:14:02 ae:ac:17:56:01:7f:5f:fc:db:34:b4:4e:f5:75:e6:b9.
May 30 11:14:02 Please contact your system administrator.
May 30 11:14:02 Add correct host key in /root/.ssh/known_hosts to get rid of this message.
May 30 11:14:02 Offending key in /root/.ssh/known_hosts:1
May 30 11:14:02 RSA host key for 192.168.15.31 has changed and you have requested strict checking.
May 30 11:14:02 Host key verification failed.
May 30 11:14:02 ERROR: migration aborted (duration 00:00:00): Can't connect to destination address using public key
TASK ERROR: migration aborted

Any suggestions on what to do?
 

Remove line 1 in /root/.ssh/known_hosts.
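If you don't want to edit the file by hand, ssh-keygen can drop the stale entry for you. Just a sketch, assuming the new node's IP is 192.168.15.31 as in the log above:

Code:
# remove the cached key for the new node from root's known_hosts
ssh-keygen -f /root/.ssh/known_hosts -R 192.168.15.31
# the next connection will offer the new host key for you to accept
ssh root@192.168.15.31 /bin/true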
 
Each of my nodes only has one line in /root/.ssh/known_hosts,

but there are a lot of keys in /etc/ssh/ssh_known_hosts.
 

Just to note, this is only a symlink to /etc/pve/priv/known_hosts.

How did you add the node again? If you use the -force option, the keys should be replaced accordingly.
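For reference, a quick way to check that the cluster-wide file really is just that symlink (output illustrative only):

Code:
ls -l /etc/ssh/ssh_known_hosts
# expected: a symlink pointing at /etc/pve/priv/known_hosts, which is shared between the nodes via /etc/pve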
 
I shut down and removed my Dell PowerEdge 860 node (node1) and then removed it from the cluster (pvecm delnode node1).

Then I added my Dell PowerEdge 1950 server: fresh install of Proxmox 2.1 with update/dist-upgrade (locked down to pve 2.1 in /etc/apt/sources.list), used the same IP address for excalibur (new name, same IP), and ran pvecm add 192.168.15.33 (node2's IP in the cluster).

I did this with my production cluster a few weeks ago and didn't have any of these issues, so I'm very confused!
 
As root, on each server where this operation is needed:
1) vi /root/.ssh/known_hosts
2) Delete the first line by pressing d twice (dd)
3) Press : so that the : prompt is shown at the bottom left
4) Type wq after the : and press Enter
5) vi closes and the file is saved without line 1
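If you would rather not do it interactively, the same edit can be scripted. Just a sketch, assuming the offending entry really is line 1 as the error message says:

Code:
# keep a backup, then drop line 1 from root's known_hosts
cp /root/.ssh/known_hosts /root/.ssh/known_hosts.bak
sed -i '1d' /root/.ssh/known_hosts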

 

Thanks, but I prefer nano for editing :D

I meant that I don't know how to use SSH to connect from node to node; I use PuTTY from my Windows PC to administer each Proxmox node ;)
 

In your case, you should use the -force option.

Code:
pvecm add 192.168.15.33 -force
 
Do I need to re-add it then?

I get this error:

Code:
root@excalibur:~# pvecm add 192.168.15.33 -force
unable to copy ssh ID

Should I reinstall it and re-add it then?
 
Try it. It should merge the known_hosts file. (Post the log output.)
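Once the node has been re-added, you can check whether the migration's SSH test passes by running the same command shown in the task log, e.g.:

Code:
# the same connectivity check the migration task runs; it should now exit cleanly with no host key warning
/usr/bin/ssh -c blowfish -o 'BatchMode=yes' root@192.168.15.31 /bin/true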
 
Log in to each node and clean the file /root/.ssh/known_hosts.

From a Proxmox host you can jump to another one using the ssh command:
ssh IP, e.g. ssh 123.321.231.2

After you have cleaned the file, enter the command exit to return.

After this, install the key on the remote host like this:
ssh-copy-id 123.321.231.2 (if you are asked for a password, it is the root password)
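With the IPs from this thread, the whole sequence would look roughly like this (only my reading of the advice above; adjust the direction of ssh-copy-id to wherever the key is actually missing):

Code:
# jump into each existing node in turn (shown for node2; repeat for .35 and .37), empty the file, return
ssh 192.168.15.33
> /root/.ssh/known_hosts
exit

# then push the key to each node, e.g. from the new node (excalibur):
ssh-copy-id 192.168.15.33
ssh-copy-id 192.168.15.35
ssh-copy-id 192.168.15.37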
 

Let me just confirm this:

Excalibur is the new node (192.168.15.31).
Node2 (192.168.15.33), Node3 (192.168.15.35) and Node4 (192.168.15.37) are the existing nodes in the cluster.

I need to connect to Node2, Node3 and Node4 and erase the contents of the /root/.ssh/known_hosts file,

then connect to Node2, Node3 and Node4 from excalibur, running the ssh-copy-id command?
 
Everything is running fine after I ran

mv /root/.ssh/known_hosts /root/.ssh/known_hosts_old

and then connected from each node.

Thanks for your help!

Now to replace Node2 and Node3, and remove Node4.
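For anyone hitting the same thing later, that was roughly (run as root on each node; the IP is just an example from this thread):

Code:
# move the stale file aside...
mv /root/.ssh/known_hosts /root/.ssh/known_hosts_old
# ...then ssh once to each of the other nodes so the current host keys are learned again
ssh root@192.168.15.31 /bin/true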
 
