Online Migration Failure

philister
Hello all,

I have a three-node cluster with iSCSI-backed LVM storage for my VMs. Online migration works fine for all but one VM. The error given is:

Dec 03 13:23:20 starting migration of VM 111 to node 'pmx0' (10.0.91.200)
Dec 03 13:23:20 copying disk images
Dec 03 13:23:20 starting VM 111 on remote node 'pmx0'
Dec 03 13:23:21 ERROR: online migrate failure - command '/usr/bin/ssh -o 'BatchMode=yes' root@10.0.91.200 qm start 111 --stateuri tcp --skiplock --migratedfrom pmx3' failed: exit code 255
Dec 03 13:23:21 aborting phase 2 - cleanup resources
Dec 03 13:23:22 ERROR: migration finished with problems (duration 00:00:02)
TASK ERROR: migration problems

Does anybody have a clue what that could mean?

Thank you very much.
 
Do you run the same version on all nodes? Check with 'pveversion -v'.
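
A quick way to compare them, assuming you can ssh between the nodes (the node names below are placeholders, adjust them to your own hostnames):

# collect the version list from each node, then diff any two of them
for node in pmx0 pmx1 pmx3; do ssh root@$node pveversion -v > /tmp/pveversion.$node; done
diff /tmp/pveversion.pmx0 /tmp/pveversion.pmx3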
 
Post your:


  • /etc/pve/qemu-server/111.conf
  • /etc/pve/storage.cfg
  • pveversion -v
 
/etc/pve/qemu-server/111.conf:

bootdisk: ide0
cores: 2
ide0: san1-vdisk001:vm-111-disk-1
ide2: none,media=cdrom
memory: 6144
name: win3
net0: rtl8139=CA:E9:58:05:12:36,bridge=vmbr0
ostype: win7
sockets: 1

/etc/pve/storage.cfg:

dir: local
path /var/lib/vz
content images,iso,vztmpl,rootdir
maxfiles 0

nfs: NFS01
path /mnt/pve/NFS01
server 10.0.90.254
export /vol/esx_ds0
options vers=3
content images,iso,backup
maxfiles 2

lvm: san1-vdisk001
vgname san1-vdisk001
shared
content images

pveversion -v:

pve-manager: 2.2-30 (pve-manager/2.2/d3818aa7)
running kernel: 2.6.32-16-pve
proxmox-ve-2.6.32: 2.2-82
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-16-pve: 2.6.32-82
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-2
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-32
qemu-server: 2.0-69
pve-firmware: 1.0-21
libpve-common-perl: 1.0-39
libpve-access-control: 1.0-25
libpve-storage-perl: 2.0-36
vncterm: 1.0-3
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.2-7
ksm-control-daemon: 1.1-1




Thank you.
 
Hello,

I shut down the VM that wouldn't migrate online in order to migrate it offline. I successfully moved it from node 3 to node 0. Now I can't start it anymore. I also cannot migrate it back to node 3, and I cannot make a backup in order to restore it into a clean configuration.

In all three cases (failing start, failing migration, failing backup) the error message given is:


can't activate LV '/dev/san1-vdisk001/vm-111-disk-1': device-mapper: create ioctl on san1--vdisk001-vm--111--disk--1 failed: Device or resource busy


Any help is appreciated; this is our company's main mail server and I can't bring it back up. Thank you very much.
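
If it helps with diagnosis: I assume the name in the error message is the device-mapper node left over from the failed migration, so on each node I would expect something like the following to show whether a stale mapping is still present (names taken from the error message and storage.cfg above):

# does the old device-mapper node still exist, and is it open?
dmsetup info san1--vdisk001-vm--111--disk--1
# is the LV still marked active on this node? (the fifth character of lv_attr is 'a' when active)
lvs -o lv_name,lv_attr san1-vdisk001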
 
After rebooting two nodes of our three-node cluster one after the other (the one I migrated from and the one I migrated to), I can now start the VM again. I haven't dared to try migrating it again yet ...

I can't believe such a deadlock on an LV can't be sorted out without rebooting. I googled a lot, but couldn't find anything useful.
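
For the record, what I would have expected to work instead of a full reboot (untested here, just a sketch based on the error message above) is deactivating the LV and dropping the stale device-mapper node on the affected node:

# try to deactivate the LV cleanly first
lvchange -an /dev/san1-vdisk001/vm-111-disk-1
# if that still reports "busy", check what is holding the device open
fuser -v /dev/mapper/san1--vdisk001-vm--111--disk--1
# as a last resort, remove the stale mapping by hand
dmsetup remove san1--vdisk001-vm--111--disk--1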
 
