Cannot migrate VM to another server

itvietnam

Hi,

We have been having a migration problem recently: Proxmox does not migrate VMs to another server when one server dies.

We tried a manual migration from hv103 to hv101 and got this error:

Code:
Task viewer: VM 150 - Migrate
task started by HA resource agent
2017-12-18 19:13:12 # /usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=hv101' root@10.10.30.151 /bin/true
2017-12-18 19:13:12 Host key verification failed.
2017-12-18 19:13:12 ERROR: migration aborted (duration 00:00:00): Can't connect to destination address using public key
TASK ERROR: migration aborted

From hv103 we can ssh to hv101 without error:

Code:
root@hv103:~#
root@hv103:~# ping hv101
PING vhost-02-hv101 (10.10.30.151) 56(84) bytes of data.
64 bytes from vhost-02-hv101 (10.10.30.151): icmp_seq=1 ttl=64 time=0.119 ms
^C
--- vhost-02-hv101 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.119/0.119/0.119/0.000 ms
root@hv103:~# ssh hv101
Linux hv101 4.10.17-2-pve #1 SMP PVE 4.10.17-19 (Fri, 4 Aug 2017 13:34:37 +0200) x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Mon Dec 18 19:20:05 2017 from 10.10.30.153
root@hv101:~#

And our pveversion:
Code:
root@hv107:~#  for i in `seq 7`;do ssh hv10$i "pveversion";done
pve-manager/5.0-30/5ab26bc (running kernel: 4.10.17-2-pve)
pve-manager/5.0-30/5ab26bc (running kernel: 4.10.17-2-pve)
pve-manager/5.1-35/722cc488 (running kernel: 4.13.4-1-pve)
pve-manager/5.1-35/722cc488 (running kernel: 4.13.4-1-pve)
pve-manager/5.1-35/722cc488 (running kernel: 4.13.4-1-pve)
pve-manager/5.1-35/722cc488 (running kernel: 4.13.4-1-pve)
pve-manager/5.1-35/722cc488 (running kernel: 4.13.4-1-pve)
root@hv107:~#

May I know how to fix this?

I have searched all the related topics and have not been able to get past this problem for several days.
 
What happens when you try
Code:
/usr/bin/ssh -o 'HostKeyAlias=hv101' root@10.10.30.151 /bin/true
?
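
For context on why a plain `ssh hv101` can work while the migration fails: the migration task adds `BatchMode=yes`, under which ssh never asks a question and simply fails, and `HostKeyAlias=hv101`, which makes ssh look up the host key under the name hv101 instead of the address on the command line. Dropping BatchMode, as in the command above, lets ssh show the warning or prompt that is otherwise hidden. A small sketch that prints both variants for a node name and IP; the function name `migration_ssh_cmd` is made up for illustration:

```shell
# Print the ssh check that the migration log shows for a node.
# mode=batch keeps BatchMode=yes (ssh fails instead of prompting);
# mode=interactive drops it so ssh can display the host key warning.
# migration_ssh_cmd is a hypothetical helper name, not a Proxmox tool.
migration_ssh_cmd() {
    alias_name=$1
    ip=$2
    mode=${3:-batch}
    if [ "$mode" = batch ]; then
        printf "/usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=%s' root@%s /bin/true\n" "$alias_name" "$ip"
    else
        printf "/usr/bin/ssh -o 'HostKeyAlias=%s' root@%s /bin/true\n" "$alias_name" "$ip"
    fi
}

# Both variants for the hv103 -> hv101 case from this thread.
migration_ssh_cmd hv101 10.10.30.151 batch
migration_ssh_cmd hv101 10.10.30.151 interactive
```

Run the printed interactive variant on the source node of the migration, since that is where the key lookup happens.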
 
Hi dcsapak,

It returns the following:

Code:
root@hv103:~# /usr/bin/ssh -o 'HostKeyAlias=hv101' root@10.10.30.151 /bin/true
Warning: the RSA host key for 'hv101' differs from the key for the IP address '[10.10.30.151]:4848'
Offending key for IP in /root/.ssh/known_hosts:2
Matching host key in /etc/ssh/ssh_known_hosts:3
Are you sure you want to continue connecting (yes/no)?

After moving .ssh/known_hosts to ~, I tried again and it works now.

May I know the root cause of this issue so I can avoid it in the future?

Thanks,
 
This can happen in a number of cases:

You regenerated the SSH host keys of hv101 (a reinstall, dpkg-reconfigure openssh-server, etc.).
You connected to a different host with the same hostname.
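
Moving the whole known_hosts file away works, but it also discards every other stored key. A narrower option is to drop only the stale entry with `ssh-keygen -R`, a standard OpenSSH option that keeps a `.old` backup of the file. A minimal sketch, assuming the entry name and path from the warning earlier in this thread; `forget_host_key` is a made-up helper name:

```shell
# Remove the entry for HOST from a known_hosts-style FILE, if present.
# forget_host_key is a hypothetical helper; it only wraps ssh-keygen.
forget_host_key() {
    host=$1
    file=$2
    [ -f "$file" ] || return 0
    # -F looks the host up (exit status 0 when found), -R removes it
    # and leaves the previous contents in "$file.old".
    if ssh-keygen -F "$host" -f "$file" >/dev/null 2>&1; then
        ssh-keygen -R "$host" -f "$file"
    fi
}

# Example for the warning shown above (run on hv103 as root):
#   forget_host_key '[10.10.30.151]:4848' /root/.ssh/known_hosts
```

Proxmox VE also provides `pvecm updatecerts`, which regenerates the node's cluster certificates and related files; whether that alone would have fixed this case is not confirmed in this thread.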
 
Hi,

I have the same problem, and tried the steps above.

Cannot migrate:
Code:
root@vdg-pve01-par6:~# qm migrate 104 vdg-pve02-par6 --online
2018-02-08 10:36:24 # /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=vdg-pve02-par6' root@10.1.246.2 /bin/true
2018-02-08 10:36:24 Host key verification failed.
2018-02-08 10:36:24 ERROR: migration aborted (duration 00:00:00): Can't connect to destination address using public key
migration aborted

Tried to ssh directly, which works:
Code:
root@vdg-pve01-par6:~# ssh root@10.1.246.2
Linux vdg-pve02-par6 4.13.13-5-pve #1 SMP PVE 4.13.13-38 (Fri, 26 Jan 2018 10:47:09 +0100) x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Thu Feb  8 10:27:18 2018 from 10.1.246.1


Tried the command suggested by dcsapak:
Code:
root@vdg-pve02-par6:~#  /usr/bin/ssh -o 'HostKeyAlias=vdg-pve02-par6' root@10.1.246.2 /bin/true
root@vdg-pve02-par6:~#

Where should I look now for a bad host key? (I reinstalled the 2nd node from scratch, and first tried to remove any references to it in /root/.ssh/ and /etc/ssh/.)

Here is my pveversion -v:
Code:
proxmox-ve: 5.1-38 (running kernel: 4.13.13-5-pve)
pve-manager: 5.1-43 (running version: 5.1-43/bdb08029)
pve-kernel-4.13.13-2-pve: 4.13.13-33
pve-kernel-4.13.13-5-pve: 4.13.13-38
libpve-http-server-perl: 2.0-8
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-19
qemu-server: 5.0-20
pve-firmware: 2.0-3
libpve-common-perl: 5.0-25
libpve-guest-common-perl: 2.0-14
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-17
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-3
pve-docs: 5.1-16
pve-qemu-kvm: 2.9.1-6
pve-container: 2.0-18
pve-firewall: 3.0-5
pve-ha-manager: 2.0-4
ksm-control-daemon: 1.2-2
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.1-2
lxcfs: 2.0.8-1
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
zfsutils-linux: 0.7.4-pve2~bpo9
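
One thing that stands out in the output above: the test command was run on vdg-pve02-par6 itself, while the failing migration was started on vdg-pve01-par6, so it checked the destination's own files rather than the source's. To see where a host key is stored, `ssh-keygen -F` can be pointed at each candidate file on the source node. A rough sketch, assuming the two file paths named in the warning earlier in this thread; `where_is_host_key` is a made-up helper name:

```shell
# Print every known_hosts-style file in which HOST has an entry.
# Returns 0 if at least one file matched, 1 otherwise.
# where_is_host_key is a hypothetical helper around ssh-keygen -F.
where_is_host_key() {
    host=$1
    shift
    hits=1
    for file in "$@"; do
        [ -r "$file" ] || continue
        if ssh-keygen -F "$host" -f "$file" >/dev/null 2>&1; then
            printf '%s\n' "$file"
            hits=0
        fi
    done
    return "$hits"
}

# Example, run on vdg-pve01-par6 (the migration source) as root:
#   where_is_host_key vdg-pve02-par6 /root/.ssh/known_hosts /etc/ssh/ssh_known_hosts
#   where_is_host_key 10.1.246.2     /root/.ssh/known_hosts /etc/ssh/ssh_known_hosts
```

Checking both the node name and the IP matters because the migration looks the key up under the name (HostKeyAlias) while a manual `ssh root@10.1.246.2` looks it up under the IP.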
 
Just adding each PVE server's IP address to /etc/hosts on both nodes was enough to solve the problem.
I'm not sure it is the best solution, though.
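
To confirm that every node carries the same name-to-IP mapping, a hosts file can be checked for the expected pairs. A minimal sketch; `hosts_has_entry` is a made-up helper name, and the pairing of 10.1.246.1 with vdg-pve01-par6 is inferred from the "Last login ... from 10.1.246.1" line above rather than stated outright:

```shell
# Return 0 if hosts-format FILE maps NAME to IP, 1 otherwise.
# hosts_has_entry is a hypothetical helper for illustration.
hosts_has_entry() {
    file=$1 name=$2 ip=$3
    awk -v n="$name" -v ip="$ip" '
        { sub(/#.*/, "") }                     # ignore comments
        $1 == ip {                             # line for this address
            for (i = 2; i <= NF; i++)          # any name or alias on it
                if ($i == n) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$file"
}

# Example, run on each node:
#   hosts_has_entry /etc/hosts vdg-pve01-par6 10.1.246.1 || echo "pve01 missing"
#   hosts_has_entry /etc/hosts vdg-pve02-par6 10.1.246.2 || echo "pve02 missing"
```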
 
