Proxmox 4.4 Live Migration Problem

MrCrankHank

Member
Dec 5, 2014
Hi everybody,

we are currently running a 3-node Proxmox 4.4 cluster with Ceph and some shared storage from a Linux cluster. We recently migrated from an old two-node Proxmox 3 setup. While upgrading the cluster we switched the networking from classic Linux bridging to Open vSwitch, which is my guess as to where the problems come from.

In some cases, when I live-migrate a VM, the task fails:

Jun 16 16:02:45 use dedicated network address for sending migration traffic (172.20.242.1)
Jun 16 16:02:46 starting migration of VM 104 to node 'kvm-c01-node01' (172.20.242.1)
Jun 16 16:02:46 copying disk images
Jun 16 16:02:46 starting VM 104 on remote node 'kvm-c01-node01'
Jun 16 16:02:48 starting online/live migration on tcp:172.20.242.1:60000
Jun 16 16:02:48 migrate_set_speed: 8589934592
Jun 16 16:02:48 migrate_set_downtime: 0.1
Jun 16 16:02:48 set migration_caps
Jun 16 16:02:48 set cachesize: 858993459
Jun 16 16:02:48 start migrate command to tcp:172.20.242.1:60000
Jun 16 16:02:50 migration status: active (transferred 1164679278, remaining 7429926912), total 8607571968)
Jun 16 16:02:50 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 0 overflow 0
Jun 16 16:02:52 migration status: active (transferred 2163188335, remaining 6430842880), total 8607571968)
Jun 16 16:02:52 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 0 overflow 0
Jun 16 16:02:54 migration status: active (transferred 2858173500, remaining 5734805504), total 8607571968)
Jun 16 16:02:54 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 0 overflow 0
Jun 16 16:02:56 migration status: active (transferred 3712890610, remaining 4877414400), total 8607571968)
Jun 16 16:02:56 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 0 overflow 0
Jun 16 16:02:58 migration status: active (transferred 4242548907, remaining 4346986496), total 8607571968)
Jun 16 16:02:58 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 0 overflow 0
Jun 16 16:03:00 migration status: active (transferred 4824522737, remaining 3763507200), total 8607571968)
Jun 16 16:03:00 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 0 overflow 0
Jun 16 16:03:02 migration status: active (transferred 5276740515, remaining 3309481984), total 8607571968)
Jun 16 16:03:02 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 0 overflow 0
Jun 16 16:03:04 migration status: active (transferred 5725033511, remaining 2858123264), total 8607571968)
Jun 16 16:03:04 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 0 overflow 0
Jun 16 16:03:06 migration status: active (transferred 6199944439, remaining 2378608640), total 8607571968)
Jun 16 16:03:06 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 0 overflow 0
Jun 16 16:03:08 migration status: active (transferred 6637811664, remaining 1930936320), total 8607571968)
Jun 16 16:03:08 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 0 overflow 0
Jun 16 16:03:10 migration status: active (transferred 7191819020, remaining 1370603520), total 8607571968)
Jun 16 16:03:10 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 0 overflow 0
Jun 16 16:03:12 migration status: active (transferred 7642999873, remaining 915116032), total 8607571968)
Jun 16 16:03:12 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 0 overflow 0
Jun 16 16:03:14 migration status: active (transferred 8053261811, remaining 482758656), total 8607571968)
Jun 16 16:03:14 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 0 overflow 0
Jun 16 16:03:15 migration status: active (transferred 8115832973, remaining 414023680), total 8607571968)
Jun 16 16:03:15 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 0 overflow 0
Jun 16 16:03:15 migration status: active (transferred 8174519430, remaining 350216192), total 8607571968)
Jun 16 16:03:15 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 0 overflow 0
Jun 16 16:03:15 migration status: active (transferred 8232968296, remaining 280850432), total 8607571968)
Jun 16 16:03:15 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 0 overflow 0
Jun 16 16:03:16 migration status: active (transferred 8308933851, remaining 204828672), total 8607571968)
Jun 16 16:03:16 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 0 overflow 0
Jun 16 16:03:16 migration status: active (transferred 8370092086, remaining 143613952), total 8607571968)
Jun 16 16:03:16 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 0 overflow 0
Jun 16 16:03:16 migration status: active (transferred 8430139532, remaining 82874368), total 8607571968)
Jun 16 16:03:16 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 0 overflow 0
Jun 16 16:03:16 migration status: active (transferred 8497598965, remaining 479371264), total 8607571968)
Jun 16 16:03:16 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 951 overflow 0
Jun 16 16:03:17 migration status: active (transferred 8642721548, remaining 334077952), total 8607571968)
Jun 16 16:03:17 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 36312 overflow 0
Jun 16 16:03:17 migration status: active (transferred 8866168881, remaining 108789760), total 8607571968)
Jun 16 16:03:17 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 90757 overflow 0
Jun 16 16:03:17 migration speed: 282.48 MB/s - downtime 93 ms
Jun 16 16:03:17 migration status: completed
Jun 16 16:03:18 ERROR: VM 104 not running
Jun 16 16:03:18 ERROR: command '/usr/bin/ssh -o 'BatchMode=yes' root@172.20.242.1 qm resume 104 --skiplock --nocheck' failed: exit code 2
Jun 16 16:03:21 ERROR: migration finished with problems (duration 00:00:36)
TASK ERROR: migration problems

Jun 16 16:02:46 kvm-c01-node01 qm[3039]: <root@pam> starting task UPID:kvm-c01-node01:00000BF8:05346077:5943E506:qmstart:104:root@pam:
Jun 16 16:02:46 kvm-c01-node01 qm[3064]: start VM 104: UPID:kvm-c01-node01:00000BF8:05346077:5943E506:qmstart:104:root@pam:
Jun 16 16:02:47 kvm-c01-node01 systemd[1]: Starting 104.scope.
Jun 16 16:02:47 kvm-c01-node01 systemd[1]: Started 104.scope.
Jun 16 16:02:48 kvm-c01-node01 kernel: [873233.958601] device tap104i0 entered promiscuous mode
Jun 16 16:02:48 kvm-c01-node01 ovs-vsctl: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl del-port tap104i0
Jun 16 16:02:48 kvm-c01-node01 ovs-vsctl: ovs|00002|db_ctl_base|ERR|no port named tap104i0
Jun 16 16:02:48 kvm-c01-node01 ovs-vsctl: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl del-port fwln104i0
Jun 16 16:02:48 kvm-c01-node01 ovs-vsctl: ovs|00002|db_ctl_base|ERR|no port named fwln104i0
Jun 16 16:02:48 kvm-c01-node01 ovs-vsctl: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl add-port vmbr0 tap104i0 tag=10
Jun 16 16:02:48 kvm-c01-node01 qm[3039]: <root@pam> end task UPID:kvm-c01-node01:00000BF8:05346077:5943E506:qmstart:104:root@pam: OK

I also saved logs from another VM with identical problems and can provide them on request. The error is not reliably reproducible; sometimes live migration just works. I already checked whether it is storage related, but it happens whether the storage is on Ceph or shared from the backend cluster.

proxmox-ve: 4.4-88 (running kernel: 4.4.62-1-pve)
pve-manager: 4.4-13 (running version: 4.4-13/7ea56165)
pve-kernel-4.4.62-1-pve: 4.4.62-88
lvm2: 2.02.116-pve3
corosync-pve: 2.4.2-2~pve4+1
libqb0: 1.0.1-1
pve-cluster: 4.0-50
qemu-server: 4.0-110
pve-firmware: 1.1-11
libpve-common-perl: 4.0-95
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-76
pve-libspice-server1: 0.12.8-2
vncterm: 1.3-2
pve-docs: 4.4-4
pve-qemu-kvm: 2.7.1-4
pve-container: 1.0-100
pve-firewall: 2.0-33
pve-ha-manager: 1.0-40
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.7-4
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-9
smartmontools: 6.5+svn4324-1~pve80
openvswitch-switch: 2.6.0-2
ceph: 10.2.7-1~bpo80+1

Unfortunately I couldn't find anything about this around the web, so if anybody has an idea, please let me know :)

Thanks for reading and have a good day!
 
Hi,
are the versions on both nodes equal?
What does the VM config for 104 look like?

Udo
 
Hi,

here are the server versions; as far as I know they are identical across all nodes.

proxmox-ve: 4.4-88 (running kernel: 4.4.62-1-pve)
pve-manager: 4.4-13 (running version: 4.4-13/7ea56165)
pve-kernel-4.4.62-1-pve: 4.4.62-88
lvm2: 2.02.116-pve3
corosync-pve: 2.4.2-2~pve4+1
libqb0: 1.0.1-1
pve-cluster: 4.0-50
qemu-server: 4.0-110
pve-firmware: 1.1-11
libpve-common-perl: 4.0-95
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-76
pve-libspice-server1: 0.12.8-2
vncterm: 1.3-2
pve-docs: 4.4-4
pve-qemu-kvm: 2.7.1-4
pve-container: 1.0-100
pve-firewall: 2.0-33
pve-ha-manager: 1.0-40
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.7-4
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-9
smartmontools: 6.5+svn4324-1~pve80
openvswitch-switch: 2.6.0-2
ceph: 10.2.7-1~bpo80+1

proxmox-ve: 4.4-88 (running kernel: 4.4.62-1-pve)
pve-manager: 4.4-13 (running version: 4.4-13/7ea56165)
pve-kernel-4.4.62-1-pve: 4.4.62-88
lvm2: 2.02.116-pve3
corosync-pve: 2.4.2-2~pve4+1
libqb0: 1.0.1-1
pve-cluster: 4.0-50
qemu-server: 4.0-110
pve-firmware: 1.1-11
libpve-common-perl: 4.0-95
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-76
pve-libspice-server1: 0.12.8-2
vncterm: 1.3-2
pve-docs: 4.4-4
pve-qemu-kvm: 2.7.1-4
pve-container: 1.0-100
pve-firewall: 2.0-33
pve-ha-manager: 1.0-40
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.7-4
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-9
smartmontools: 6.5+svn4324-1~pve80
openvswitch-switch: 2.6.0-2
ceph: 10.2.7-1~bpo80+1

proxmox-ve: 4.4-88 (running kernel: 4.4.62-1-pve)
pve-manager: 4.4-13 (running version: 4.4-13/7ea56165)
pve-kernel-4.4.62-1-pve: 4.4.62-88
lvm2: 2.02.116-pve3
corosync-pve: 2.4.2-2~pve4+1
libqb0: 1.0.1-1
pve-cluster: 4.0-50
qemu-server: 4.0-110
pve-firmware: 1.1-11
libpve-common-perl: 4.0-95
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-76
pve-libspice-server1: 0.12.8-2
vncterm: 1.3-2
pve-docs: 4.4-4
pve-qemu-kvm: 2.7.1-4
pve-container: 1.0-100
pve-firewall: 2.0-33
pve-ha-manager: 1.0-40
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.7-4
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-9
smartmontools: 6.5+svn4324-1~pve80
openvswitch-switch: 2.6.0-2
ceph: 10.2.7-1~bpo80+1

bootdisk: virtio0
cores: 6
cpu: host
ide2: none,media=cdrom
memory: 8192
name: <hostname>
net0: virtio=2E:91:F8:75:19:C8,bridge=vmbr0,tag=10
numa: 0
ostype: win7
smbios1: uuid=bac0407a-9887-4643-a401-b64248ccde71
sockets: 1
virtio0: sas_10k:vm-104-disk-1,size=80G

Thanks for your help.
 
Hi,
you use cpu=host. Is the CPU identical on both nodes?
Otherwise try kvm64 and/or enable NUMA.

Udo
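(Editor's note: for reference, testing the kvm64 suggestion only touches two lines of the VM config posted above, /etc/pve/qemu-server/104.conf — a sketch showing just the lines that would change, with the rest of the config left as is:)

```
cpu: kvm64
numa: 1
```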
 
Hi!

Yep. Every host is an HP DL380p Gen8 with two Intel Xeon E5-2640 CPUs. Node01 and node02 used to be a two-node cluster with Proxmox 3, and I used live migration there for years without a problem. I'm kind of lost here. I will run some tests with the kvm64 CPU type.

Thanks.
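(Editor's note: beyond the model number, what matters for cpu=host is the exact set of CPU feature flags the guest sees. A minimal sketch of comparing flag sets between two nodes — the flag lists here are invented sample data; in practice capture each node's list with `grep -m1 '^flags' /proc/cpuinfo`:)

```bash
# Sample flag lists (hypothetical); replace with the real output of
#   grep -m1 '^flags' /proc/cpuinfo
# captured on each node.
node01_flags="fpu vme sse sse2 ht aes"
node02_flags="fpu vme sse sse2 aes"

# comm -3 prints flags present on only one of the two nodes; any output
# here means a cpu=host guest may carry features the target host lacks.
comm -3 <(echo "$node01_flags" | tr ' ' '\n' | sort) \
        <(echo "$node02_flags" | tr ' ' '\n' | sort)
```

With the sample data above, the only difference reported is `ht`, which is exactly the kind of mismatch (hyper-threading on one node only) that can bite after a migration.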
 
Hi,
perhaps different CPU-related BIOS settings (is the BIOS version the same?)

Udo
 
Hi!

Thanks for your input, it was indeed BIOS-settings related. Node01 had hyper-threading enabled while the other two nodes didn't. I enabled HT on node02, and live migration between node01 and node02 now works as expected. I'm going to enable it on node03 this evening.

Thanks again and have a nice day.
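(Editor's note: after changing the BIOS setting, a quick way to confirm all nodes expose the same topology is to compare the `Thread(s) per core`, `Core(s) per socket`, and `Socket(s)` lines of `lscpu` on each host. A sketch with values inlined for a dual E5-2640 box with HT enabled — in practice read them from `lscpu` itself:)

```bash
# Sample topology values for an HP DL380p Gen8 with two E5-2640s and HT on;
# in practice take these from: lscpu | grep -E 'Thread|Core|Socket'
threads_per_core=2   # "Thread(s) per core"
cores_per_socket=6   # "Core(s) per socket"
sockets=2            # "Socket(s)"

# Every node should print the same number here once the BIOS settings match.
echo "logical CPUs: $((threads_per_core * cores_per_socket * sockets))"
```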
 
