VM live migration 4.4 > 5 works only with VMs in HA group

DirkH

Hi,

I just upgraded one of our clusters in place from 4.4 to 5.1. It's a 3-host cluster with Ceph as the storage backend.
So far everything went smoothly.

The problem I stumbled upon is that some VMs (all Linux KVM guests, all with vga set to cirrus) could not be live migrated.
This cluster has 18 VMs. 13 of these are in an HA group, 5 aren't. Live migration from a 4.4 host to a 5.1 host worked without problems for all 13 VMs in the HA group, but failed for all 5 VMs that aren't members.
The error message gave no reason why the live migration failed.

Code:
Apr 12 10:33:33 starting migration of VM 111 to node 'proxdmz01' (10.10.254.50)
Apr 12 10:33:33 copying disk images
Apr 12 10:33:33 starting VM 111 on remote node 'proxdmz01'
Apr 12 10:33:37 start remote tunnel
Apr 12 10:33:38 starting online/live migration on unix:/run/qemu-server/111.migrate
Apr 12 10:33:38 migrate_set_speed: 8589934592
Apr 12 10:33:38 migrate_set_downtime: 0.1
Apr 12 10:33:38 set migration_caps
Apr 12 10:33:38 set cachesize: 214748364
Apr 12 10:33:38 start migrate command to unix:/run/qemu-server/111.migrate
Apr 12 10:33:40 migration status error: failed
Apr 12 10:33:40 ERROR: online migrate failure - aborting
Apr 12 10:33:40 aborting phase 2 - cleanup resources
Apr 12 10:33:40 migrate_cancel
Apr 12 10:33:43 ERROR: migration finished with problems (duration 00:00:11)
TASK ERROR: migration problems

Shutting down the machines and migrating them offline worked.

We are planning to upgrade two additional 5-host clusters with far more machines, where no HA groups are defined at all. So it would be ugly if we couldn't migrate them live.

Does anyone have an idea what the problem could be?

I noticed a similar problem in this post, but as ours is more specific and that problem is marked as solved, I started a new thread.
https://forum.proxmox.com/threads/p...grade-vm-migration-problem.38117/#post-188106

Best Regards
Dirk
 
Can you post the pveversion output of an old and a new host, and also the content of the VM config?
 
Hi Dominik,

old was:
Code:
proxmox-ve: 4.4-108 (running kernel: 4.4.114-1-pve)
pve-manager: 4.4-22 (running version: 4.4-22/2728f613)
pve-kernel-4.4.98-2-pve: 4.4.98-101
pve-kernel-4.4.6-1-pve: 4.4.6-48
pve-kernel-4.4.13-1-pve: 4.4.13-56
pve-kernel-4.4.35-1-pve: 4.4.35-77
pve-kernel-4.2.6-1-pve: 4.2.6-36
pve-kernel-4.4.98-3-pve: 4.4.98-103
pve-kernel-4.4.8-1-pve: 4.4.8-52
pve-kernel-4.4.13-2-pve: 4.4.13-58
pve-kernel-4.4.35-2-pve: 4.4.35-79
pve-kernel-4.4.21-1-pve: 4.4.21-71
pve-kernel-4.4.59-1-pve: 4.4.59-87
pve-kernel-4.4.98-4-pve: 4.4.98-104
pve-kernel-4.4.44-1-pve: 4.4.44-84
pve-kernel-4.4.16-1-pve: 4.4.16-64
pve-kernel-4.4.98-5-pve: 4.4.98-105
pve-kernel-4.4.67-1-pve: 4.4.67-92
pve-kernel-4.4.98-6-pve: 4.4.98-107
pve-kernel-4.4.19-1-pve: 4.4.19-66
pve-kernel-4.4.10-1-pve: 4.4.10-54
pve-kernel-4.4.76-1-pve: 4.4.76-94
pve-kernel-4.4.114-1-pve: 4.4.114-108
pve-kernel-4.4.83-1-pve: 4.4.83-96
pve-kernel-4.4.49-1-pve: 4.4.49-86
pve-kernel-4.4.40-1-pve: 4.4.40-82
pve-kernel-4.4.62-1-pve: 4.4.62-88
lvm2: 2.02.116-pve3
corosync-pve: 2.4.2-2~pve4+1
libqb0: 1.0.1-1
pve-cluster: 4.0-54
qemu-server: 4.0-115
pve-firmware: 1.1-11
libpve-common-perl: 4.0-96
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-76
pve-libspice-server1: 0.12.8-2
vncterm: 1.3-2
pve-docs: 4.4-4
pve-qemu-kvm: 2.9.1-9~pve4
pve-container: 1.0-105
pve-firewall: 2.0-33
pve-ha-manager: 1.0-41
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.7-4
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-9
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.9-pve15~bpo80
ceph: 12.2.4-1~bpo80+1

new is:
Code:
proxmox-ve: 5.1-42 (running kernel: 4.13.16-1-pve)
pve-manager: 5.1-46 (running version: 5.1-46/ae8241d4)
pve-kernel-4.13: 5.1-43
pve-kernel-4.13.16-1-pve: 4.13.16-43
pve-kernel-4.4.114-1-pve: 4.4.114-108
pve-kernel-4.4.98-6-pve: 4.4.98-107
pve-kernel-4.4.98-5-pve: 4.4.98-105
pve-kernel-4.4.98-4-pve: 4.4.98-104
pve-kernel-4.4.98-3-pve: 4.4.98-103
pve-kernel-4.4.98-2-pve: 4.4.98-101
pve-kernel-4.4.95-1-pve: 4.4.95-99
pve-kernel-4.4.83-1-pve: 4.4.83-96
pve-kernel-4.4.76-1-pve: 4.4.76-94
pve-kernel-4.4.67-1-pve: 4.4.67-92
pve-kernel-4.4.62-1-pve: 4.4.62-88
pve-kernel-4.4.59-1-pve: 4.4.59-87
pve-kernel-4.4.49-1-pve: 4.4.49-86
pve-kernel-4.4.44-1-pve: 4.4.44-84
pve-kernel-4.4.40-1-pve: 4.4.40-82
pve-kernel-4.4.35-2-pve: 4.4.35-79
pve-kernel-4.4.35-1-pve: 4.4.35-77
pve-kernel-4.4.21-1-pve: 4.4.21-71
pve-kernel-4.4.19-1-pve: 4.4.19-66
pve-kernel-4.4.16-1-pve: 4.4.16-64
pve-kernel-4.4.13-2-pve: 4.4.13-58
pve-kernel-4.4.13-1-pve: 4.4.13-56
pve-kernel-4.4.10-1-pve: 4.4.10-54
pve-kernel-4.4.8-1-pve: 4.4.8-52
pve-kernel-4.4.6-1-pve: 4.4.6-48
pve-kernel-4.2.8-1-pve: 4.2.8-41
pve-kernel-4.2.6-1-pve: 4.2.6-36
pve-kernel-4.2.2-1-pve: 4.2.2-16
ceph: 12.2.4-pve1
corosync: 2.4.2-pve3
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.0-8
libpve-common-perl: 5.0-28
libpve-guest-common-perl: 2.0-14
libpve-http-server-perl: 2.0-8
libpve-storage-perl: 5.0-17
libqb0: 1.0.1-1
lvm2: 2.02.168-pve6
lxc-pve: 2.1.1-3
lxcfs: 2.0.8-2
novnc-pve: 0.6-4
proxmox-widget-toolkit: 1.0-11
pve-cluster: 5.0-20
pve-container: 2.0-19
pve-docs: 5.1-16
pve-firewall: 3.0-5
pve-firmware: 2.0-4
pve-ha-manager: 2.0-5
pve-i18n: 1.0-4
pve-libspice-server1: 0.12.8-3
pve-qemu-kvm: 2.9.1-9
pve-xtermjs: 1.0-2
qemu-server: 5.0-22
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.6-pve1~bpo9
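
For anyone comparing the two listings above by eye: a minimal Python sketch that reports which packages differ between two pveversion outputs. The function name `parse_versions` is made up for illustration, and the input is only a short excerpt of the listings above, so treat it as a starting point rather than a diagnosis.

```python
def parse_versions(text):
    """Map package name -> version from 'name: version (extra)' lines."""
    versions = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or not rest.strip():
            continue  # skip blank or malformed lines
        # Keep only the first token, dropping e.g. '(running kernel: ...)'
        versions[name.strip()] = rest.split()[0]
    return versions


# Excerpts copied from the two listings in this thread.
OLD_HOST = """\
proxmox-ve: 4.4-108 (running kernel: 4.4.114-1-pve)
pve-qemu-kvm: 2.9.1-9~pve4
qemu-server: 4.0-115
pve-ha-manager: 1.0-41
libqb0: 1.0.1-1
"""

NEW_HOST = """\
proxmox-ve: 5.1-42 (running kernel: 4.13.16-1-pve)
pve-qemu-kvm: 2.9.1-9
qemu-server: 5.0-22
pve-ha-manager: 2.0-5
libqb0: 1.0.1-1
"""

old = parse_versions(OLD_HOST)
new = parse_versions(NEW_HOST)

# Packages present on both hosts whose version strings differ.
changed = {
    pkg: (old[pkg], new[pkg])
    for pkg in sorted(old.keys() & new.keys())
    if old[pkg] != new[pkg]
}

for pkg, (before, after) in changed.items():
    print(f"{pkg}: {before} -> {after}")
```

On the excerpt, libqb0 is identical on both hosts, while pve-qemu-kvm differs only in the package suffix (2.9.1-9~pve4 vs. 2.9.1-9).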

thank you
Dirk
 
Oops, sorry.

A VM that cannot be live migrated:
Code:
bootdisk: scsi0
cores: 1
ide2: none,media=cdrom
memory: 2048
name: exch
net0: virtio=C2:31:F1:1E:71:7C,bridge=vmbr1
numa: 0
ostype: l26
scsi0: ceph1:vm-111-disk-1,size=32G
scsihw: virtio-scsi-pci
smbios1: uuid=02b737a7-0203-4cc1-89b1-4889abc513ff
sockets: 1
vga: cirrus

vs. an HA group member:
Code:
bootdisk: virtio0
cores: 1
ide2: none,media=cdrom
memory: 2048
name: colada
net0: virtio=32:39:62:38:39:62,bridge=vmbr1
numa: 0
ostype: l26
smbios1: uuid=7050782e-e036-4a68-b2ab-e9d0744631f3
sockets: 1
vga: cirrus
virtio0: ceph1:vm-101-disk-1,size=5G
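
To make the comparison of the two configs above easier, here is a rough Python sketch that lists the keys that differ. The helper names `parse_vm_config` and `diff_configs` are made up for illustration, and ignoring `name`, `smbios1` and `net0` is my own assumption, since those differ per VM anyway (the bridge is the same in both). It only shows what differs; it doesn't prove that any of these settings causes the failed migration.

```python
def parse_vm_config(text):
    """Parse 'key: value' lines of a VM config into a dict."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only; values may contain colons.
        key, _, value = line.partition(":")
        config[key.strip()] = value.strip()
    return config


def diff_configs(a, b, ignore=("name", "smbios1", "net0")):
    """Return {key: (value_in_a, value_in_b)} for keys that differ."""
    diffs = {}
    for key in sorted(set(a) | set(b)):
        if key in ignore:
            continue
        if a.get(key) != b.get(key):
            diffs[key] = (a.get(key), b.get(key))
    return diffs


# The two configs as posted above.
FAILING_VM = """\
bootdisk: scsi0
cores: 1
ide2: none,media=cdrom
memory: 2048
name: exch
net0: virtio=C2:31:F1:1E:71:7C,bridge=vmbr1
numa: 0
ostype: l26
scsi0: ceph1:vm-111-disk-1,size=32G
scsihw: virtio-scsi-pci
smbios1: uuid=02b737a7-0203-4cc1-89b1-4889abc513ff
sockets: 1
vga: cirrus
"""

WORKING_VM = """\
bootdisk: virtio0
cores: 1
ide2: none,media=cdrom
memory: 2048
name: colada
net0: virtio=32:39:62:38:39:62,bridge=vmbr1
numa: 0
ostype: l26
smbios1: uuid=7050782e-e036-4a68-b2ab-e9d0744631f3
sockets: 1
vga: cirrus
virtio0: ceph1:vm-101-disk-1,size=5G
"""

diffs = diff_configs(parse_vm_config(FAILING_VM), parse_vm_config(WORKING_VM))
for key, (failing, working) in diffs.items():
    print(f"{key}: failing={failing!r} working={working!r}")
```

For these two VMs, the remaining differences are all disk related: the failing VM uses scsi0 with scsihw virtio-scsi-pci, while the HA group member uses virtio0.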