Live Migration Fails - ProxMox VE 2.2

woblit · Nov 14, 2012

Hi All,

Hopefully you can help. Just installed 2 new Proxmox VE 2.2 servers. Using the E5 processors - each with 64GB memory. I have set them up talking to a Ceph cluster which I have been running with for sometime. I started with one box, which was using Ceph perfectly. VMs would create and run without a problem. I then added the second server into the cluster etc, and am able to create VMs on this server and migrate a running VM from the original server to the second one (same setup) but cannot migrate it back from the second server to the original one. If I stop the VM on the second server and choose an offline migration this works without a problem. The VM will run on the second server without a problem, so I know that my shared storage (Ceph) is working perfectly from both systems.

These are the errors I get from the migration process:

Nov 14 15:07:12 starting migration of VM 101 to node 'ihv1' (192.168.0.1)
Nov 14 15:07:12 copying disk images
Nov 14 15:07:12 starting VM 101 on remote node 'ihv1'
Nov 14 15:07:13 ERROR: online migrate failure - command '/usr/bin/ssh -c blowfish -o 'BatchMode=yes' root@192.168.0.1 qm start 101 --stateuri tcp --skiplock --migratedfrom ihv2' failed: exit code 255
Nov 14 15:07:13 aborting phase 2 - cleanup resources
Nov 14 15:07:13 ERROR: migration finished with problems (duration 00:00:03)
TASK ERROR: migration problems

The other error I get when the original server tries to start the machine is:

TASK ERROR: start failed: command '/usr/bin/kvm -id 101 -chardev 'socket,id=qmp,path=/var/run/qemu-server/101.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -vnc unix:/var/run/qemu-server/101.vnc,x509,password -pidfile /var/run/qemu-server/101.pid -daemonize -name test.tester -smp 'sockets=1,cores=4' -cpu host -nodefaults -boot 'menu=on' -vga cirrus -k en-gb -m 768 -cpuunits 1000 -usbdevice tablet -drive 'if=none,id=drive-ide2,media=cdrom,aio=native' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' -device 'ahci,id=ahci0,multifunction=on,bus=pci.0,addr=0x7' -drive 'file=rbd:rbd/vm-101-disk-1:id=admin:auth_supported=cephx\;none:keyring=/etc/pve/priv/ceph/CloudFlex.keyring:mon_host=192.168.10.10\:6789,if=none,id=drive-sata0,cache=writethrough,aio=native' -device 'ide-drive,bus=ahci0.0,drive=drive-sata0,id=sata0,bootindex=100' -netdev 'type=user,id=net0,hostname=test.tester' -device 'rtl8139,mac=6E:0C:BE:2B:42:6F,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' -incoming tcp:localhost:60000 -S' failed: exit code 1

Hopefully someone can offer advice. Had a working 2.1 cluster before which seemed to be perfect.

Warren.

dietmar · Nov 14, 2012

Do not use cpu=host

woblit · Nov 14, 2012

dietmar said:
Do not use cpu=host

Hi Dietmar,

I tried that - and unfortunately gives the same error. Can do an online migration from Node 1 to Node 2. But then when trying to migrate back to Node 1 it comes up with the error as shown.

Looking forward to hearing back from you.

Warren.

Angelo Pantano · Nov 16, 2012

same problem here, when I try to migrate I get:

Nov 16 12:44:34 starting migration of VM 114 to node 'fmckvm100' (10.12.4.100)
Nov 16 12:44:34 copying disk images
Nov 16 12:44:34 starting VM 114 on remote node 'fmckvm100'
Nov 16 12:44:36 starting migration tunnel
Nov 16 12:44:37 starting online/live migration on port 60000
Nov 16 12:44:39 ERROR: online migrate failure - aborting
Nov 16 12:44:39 aborting phase 2 - cleanup resources
Nov 16 12:44:40 ERROR: migration finished with problems (duration 00:00:06)
TASK ERROR: migration problems

root@fmckvm100:~# pveversion --verbose
pve-manager: 2.2-30 (pve-manager/2.2/d3818aa7)
running kernel: 2.6.32-16-pve
proxmox-ve-2.6.32: 2.2-82
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-16-pve: 2.6.32-82
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-2
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-32
qemu-server: 2.0-69
pve-firmware: 1.0-21
libpve-common-perl: 1.0-39
libpve-access-control: 1.0-25
libpve-storage-perl: 2.0-36
vncterm: 1.0-3
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.2-7
ksm-control-daemon: 1.1-1

root@proxmox104:~# pveversion --verbose
pve-manager: 2.2-30 (pve-manager/2.2/d3818aa7)
running kernel: 2.6.32-16-pve
proxmox-ve-2.6.32: 2.2-82
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-16-pve: 2.6.32-82
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-2
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-32
qemu-server: 2.0-69
pve-firmware: 1.0-21
libpve-common-perl: 1.0-39
libpve-access-control: 1.0-25
libpve-storage-perl: 2.0-36
vncterm: 1.0-3
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.2-7
ksm-control-daemon: 1.1-1

what could possibly be the reason?

spirit · Nov 16, 2012

Hi,
do you have tried with virtio disk instead sata ?

woblit · Nov 16, 2012

Angelo Pantano said:
same problem here, when I try to migrate I get:

Nov 16 12:44:34 starting migration of VM 114 to node 'fmckvm100' (10.12.4.100)
Nov 16 12:44:34 copying disk images
Nov 16 12:44:34 starting VM 114 on remote node 'fmckvm100'
Nov 16 12:44:36 starting migration tunnel
Nov 16 12:44:37 starting online/live migration on port 60000
Nov 16 12:44:39 ERROR: online migrate failure - aborting
Nov 16 12:44:39 aborting phase 2 - cleanup resources
Nov 16 12:44:40 ERROR: migration finished with problems (duration 00:00:06)
TASK ERROR: migration problems

root@fmckvm100:~# pveversion --verbose
pve-manager: 2.2-30 (pve-manager/2.2/d3818aa7)
running kernel: 2.6.32-16-pve
proxmox-ve-2.6.32: 2.2-82
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-16-pve: 2.6.32-82
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-2
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-32
qemu-server: 2.0-69
pve-firmware: 1.0-21
libpve-common-perl: 1.0-39
libpve-access-control: 1.0-25
libpve-storage-perl: 2.0-36
vncterm: 1.0-3
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.2-7
ksm-control-daemon: 1.1-1

root@proxmox104:~# pveversion --verbose
pve-manager: 2.2-30 (pve-manager/2.2/d3818aa7)
running kernel: 2.6.32-16-pve
proxmox-ve-2.6.32: 2.2-82
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-16-pve: 2.6.32-82
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-2
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-32
qemu-server: 2.0-69
pve-firmware: 1.0-21
libpve-common-perl: 1.0-39
libpve-access-control: 1.0-25
libpve-storage-perl: 2.0-36
vncterm: 1.0-3
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.2-7
ksm-control-daemon: 1.1-1

what could possibly be the reason?

Hi.

I dont know whether this will help you - but after A LOT of trial and error I corrected my issue. Basically, the problem I was experiencing was that I had a CEPH.CONF file in my /root of my hypervisor. This seemed to conflict with the migration process somehow. In any event, as soon as I removed this config file (which was not on Node 1 but was on Node 2) my problems went away and I was able to do online migrations.

Only problem I am having now is that backups don't work for Ceph based storage. But then again - its all still new and experimental.

Hope your problems get fixed.

Warren.

Angelo Pantano · Nov 16, 2012

I dont have any ceph.conf

Can I just switch to virtio disks just by editing the files under /etc/pve/qemu-server/ and rebooting the guest?

udo · Nov 16, 2012

Angelo Pantano said:
I dont have any ceph.conf

Can I just switch to virtio disks just by editing the files under /etc/pve/qemu-server/ and rebooting the guest?

Hi,
depends on the guest (how the partitions are mounted - if uuid and/or lvm is used it's works).

Reboot isn't enough - you must stop/start the VM.

Udo

Search

Search

Live Migration Fails - ProxMox VE 2.2

woblit

New Member

dietmar

Proxmox Staff Member

woblit

New Member

Angelo Pantano

Guest

spirit

Distinguished Member

woblit

New Member

Angelo Pantano

Guest

udo

Distinguished Member

We value your privacy