Migration problems

Shad

New Member
Jun 27, 2016
A little background: this 13-host cluster just fell into my lap when the previous owner left the company. I have some experience with Linux, but it's very superficial. A few weeks back, we had a hardware failure on two of our nodes. I have resolved the hardware issue, but I had to reinstall Proxmox from scratch on these hosts. I was successful in getting the hosts reinstalled and joined back into the cluster, following some spotty documentation that was left behind. Now to the problem at hand.

I can live migrate TO them, but not FROM them. I can live migrate back and forth between the two, but if I try to migrate from either of them to another host in the cluster, the migration fails with the following error:

task started by HA resource agent
Jun 27 11:48:29 starting migration of VM 100 to node 'virthost04' (10.x.x.114)
Jun 27 11:48:29 copying disk images
Jun 27 11:48:29 starting VM 100 on remote node 'virthost04'
Jun 27 11:48:31 starting ssh migration tunnel
Jun 27 11:48:31 starting online/live migration on localhost:60000
Jun 27 11:48:31 migrate_set_speed: 8589934592
Jun 27 11:48:31 migrate_set_downtime: 0.1
Jun 27 11:48:33 ERROR: online migrate failure - aborting
Jun 27 11:48:33 aborting phase 2 - cleanup resources
Jun 27 11:48:33 migrate_cancel
Jun 27 11:48:34 ERROR: migration finished with problems (duration 00:00:05)
TASK ERROR: migration problems
I have googled and read ad nauseam, but I can't find anything that might be causing this. First of all, can you point me to where I might look to resolve the problem? I need to do a bunch of migrations on the rest of the hosts so that I can do a BIOS upgrade to resolve the hardware issue on the rest of the hosts.

Many thanks in advance.
 
What CPU setting do you use for the VM? You should use CPU type 'kvm64' (default) for reliable migration between hosts with different CPU type...
 
They are all using kvm64 as the CPU type. 12 of the 13 hosts are identical hardware. Really, I can't get things to migrate consistently between any of the hosts. Frustrating, to say the least. In order to get things done, I'm having to shut down VMs to migrate them. Are there more detailed logs anywhere that would give me an idea as to what is going on?
 
Are all your hosts up to date?
You can check your versions with
Code:
pveversion -v
 
pveversion -v
We're on 4.1-1.

Reinstalled host:

root@virthost06:~# pveversion -v
proxmox-ve: 4.1-26 (running kernel: 4.2.6-1-pve)
pve-manager: 4.1-1 (running version: 4.1-1/2f9650d4)
pve-kernel-4.2.6-1-pve: 4.2.6-26
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 0.17.2-1
pve-cluster: 4.0-29
qemu-server: 4.0-41
pve-firmware: 1.1-7
libpve-common-perl: 4.0-41
libpve-access-control: 4.0-10
libpve-storage-perl: 4.0-38
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.4-17
pve-container: 1.0-32
pve-firewall: 2.0-14
pve-ha-manager: 1.0-14
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u2
lxc-pve: 1.1.5-5
lxcfs: 0.13-pve1
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5-pve6~jessie
openvswitch-switch: 2.3.0+git20140819-3+deb8u1

Original untouched host:

proxmox-ve: 4.1-26 (running kernel: 4.2.6-1-pve)
pve-manager: 4.1-1 (running version: 4.1-1/2f9650d4)
pve-kernel-4.2.6-1-pve: 4.2.6-26
pve-kernel-4.2.2-1-pve: 4.2.2-16
pve-kernel-4.2.3-2-pve: 4.2.3-22
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 0.17.2-1
pve-cluster: 4.0-29
qemu-server: 4.0-41
pve-firmware: 1.1-7
libpve-common-perl: 4.0-41
libpve-access-control: 4.0-10
libpve-storage-perl: 4.0-38
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.4-17
pve-container: 1.0-32
pve-firewall: 2.0-14
pve-ha-manager: 1.0-14
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.5-5
lxcfs: 0.13-pve1
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5-pve6~jessie
openvswitch-switch: 2.3.2-2
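
Comparing two long listings like these by eye is error-prone. A minimal sketch, assuming each host's `pveversion -v` output has been saved to a file (the helper name `diff_versions` and the file names are hypothetical, not part of Proxmox), prints only the lines that differ:

```shell
#!/bin/sh
# Hypothetical helper: show the lines that differ between two saved
# `pveversion -v` outputs. Lines prefixed "<" come from the first file,
# lines prefixed ">" from the second.
diff_versions() {
    # $1, $2: files holding the output of `pveversion -v` from each host
    a=$(mktemp) && b=$(mktemp) || return 1
    sort "$1" > "$a"
    sort "$2" > "$b"
    # diff exits with status 1 when the files differ; that is expected here
    diff "$a" "$b" | grep '^[<>]'
    rm -f "$a" "$b"
}

# Example usage (file names are placeholders):
#   pveversion -v > /tmp/virthost06.txt      # run on each host, then copy one file over
#   diff_versions /tmp/virthost06.txt /tmp/original.txt
```

For the two listings above, the differing lines are glusterfs-client (3.5.2-2+deb8u2 vs. 3.5.2-2+deb8u1), openvswitch-switch (2.3.0+git20140819-3+deb8u1 vs. 2.3.2-2), and the two older pve-kernel packages that are only present on the untouched host.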

 
Do you maybe have a firewall rule which prevents a connection from the two hosts?
 
Not unless Proxmox installs one straight out of the box. These hosts are in the same chassis on the same network. What log files can I look at to see what's going wrong? Is there a way to turn on more detailed logging?
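
The thread does not name a specific log file. One generic approach is to pull everything the system log recorded during the minute of the failed migration, on both the source and the target node. The sketch below assumes a syslog-style text file; the path `/var/log/syslog` is the usual Debian location and is an assumption here, and `log_window` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of a syslog-style file whose
# timestamp begins with the given prefix.
log_window() {
    # $1: log file, $2: timestamp prefix, e.g. "Jun 27 11:48"
    grep -- "^$2" "$1"
}

# Example usage (path is an assumption, run on both nodes):
#   log_window /var/log/syslog "Jun 27 11:48"
```

The timestamp prefix comes straight from the failed task output, which in the task log above is the minute 11:48 on Jun 27.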
 
Hi,
just an idea: is the MTU on the cluster-network on both nodes the same?

Udo
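
A rough way to check this: `ip link show` prints the configured MTU of each interface, and a ping with the "don't fragment" bit set shows whether packets of that size actually make it across. The sketch below only builds the ping command; the helper names and the target address are placeholders, and it assumes IPv4 and the Linux iputils `ping`.

```shell
#!/bin/sh
# Largest ICMP payload that fits into one unfragmented IPv4 packet:
# MTU minus 20 bytes (IPv4 header without options) minus 8 bytes (ICMP header).
payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Print the ping command that probes a path for the given MTU.
# -M do forbids fragmentation, so an undersized link makes the ping fail.
mtu_probe_cmd() {
    # $1: target address (placeholder), $2: MTU to test
    echo "ping -M do -c 3 -s $(payload_for_mtu "$2") $1"
}

# Example usage (address is a placeholder):
#   ip link show                      # compare the "mtu" value on both nodes
#   $(mtu_probe_cmd 192.0.2.10 1500)  # runs: ping -M do -c 3 -s 1472 192.0.2.10
```

If the probe succeeds in one direction but not the other, or only for smaller sizes, the MTU along the path is not consistent.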
 
Not specifically, no. But in general, you should update your hosts to the newest version, though I do not know whether that will solve your problem.
 
