kvm migration fails, exit code 250

bread-baker

I get this error when trying to migrate a KVM:

Code:
Executing HA migrate for VM 101 to node fbc241
Trying to migrate pvevm:101 to fbc241...Temporary failure; try again
TASK ERROR: command 'clusvcadm -M pvevm:101 -m fbc241' failed: exit code 250

However, 4 other KVMs migrate back and forth without an issue.

I will check the logs and try migration in stopped mode...

Any suggestions?
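In the meantime, here is what I plan to check first (a minimal sketch, assuming the stock PVE 2.x cman/rgmanager tooling and standard Debian log locations):

Code:
clustat                      # cluster members and rgmanager service states
fence_tool ls                # fence domain membership
tail -n 50 /var/log/syslog   # recent cluster / pve messages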
 
Here is the conf file:

Code:
# on drbd for high availability
bootdisk: virtio0
cores: 4
cpu: host
ide2: cdrom,media=cdrom
memory: 2048
name: mail-system
net0: virtio=86:CF:B2:A1:41:7C,bridge=vmbr0
onboot: 1
ostype: l26
sockets: 1
virtio0: drbd-fbc241:vm-101-disk-1
 
and the lvs and qm list output for the 2 drbd systems:

fbc240, where KVM 101 is running:
Code:
fbc240 s009 /etc/pve/qemu-server # lvs
  LV             VG          Attr     LSize   Pool Origin Data%  Move Log Copy%  Convert
  vm-1023-disk-1 drbd-fbc240 -wi-ao--  32.01g                                           
  vm-200-disk-1  drbd-fbc240 -wi-ao--  15.02g                                           
  vm-2091-disk-1 drbd-fbc240 -wi-ao--  17.00g                                           
  vm-100-disk-1  drbd-fbc241 -wi-----  32.00g                                           
  vm-101-disk-1  drbd-fbc241 -wi-ao--  32.00g                                           
  vm-102-disk-1  drbd-fbc241 -wi-----   2.01g                                           
  vm-103-disk-1  drbd-fbc241 -wi-----   4.00g                                           
  vm-103-disk-2  drbd-fbc241 -wi----- 120.00g                                           
  vm-115-disk-1  drbd-fbc241 -wi-----   9.01g                                           
  bkup           fbc240-vg   -wi-ao-- 362.60g                                           
  data           pve         -wi-ao-- 465.50g                                           
  root           pve         -wi-ao--  10.00g                                           
  swap           pve         -wi-ao--   8.00g                                           
fbc240 s009 /etc/pve/qemu-server # qm list
      VMID NAME                 STATUS     MEM(MB)    BOOTDISK(GB) PID       
       101 mail-system          running    2048              32.00 20424     
       200 winxp                running    2048              15.02 20343     
      1003 ltsp-term-KVM        stopped    256                0.00 0         
      1023 fbc123-x2go-wheezy   running    1536              32.01 20334     
      2091 asterisk-fbc91       running    1024              17.00 20339

and the other system:
Code:
fbc241 s012 ~ # lvs
  LV             VG          Attr     LSize   Pool Origin Data%  Move Log Copy%  Convert
  vm-1023-disk-1 drbd-fbc240 -wi-----  32.01g                                           
  vm-200-disk-1  drbd-fbc240 -wi-----  15.02g                                           
  vm-2091-disk-1 drbd-fbc240 -wi-----  17.00g                                           
  vm-100-disk-1  drbd-fbc241 -wi-----  32.00g                                           
  vm-101-disk-1  drbd-fbc241 -wi-----  32.00g                                           
  vm-102-disk-1  drbd-fbc241 -wi-ao--   2.01g                                           
  vm-103-disk-1  drbd-fbc241 -wi-ao--   4.00g                                           
  vm-103-disk-2  drbd-fbc241 -wi-ao-- 120.00g                                           
  vm-115-disk-1  drbd-fbc241 -wi-ao--   9.01g                                           
  bkup           fbc241-vg   -wi-ao-- 500.00g                                           
  data           pve         -wi-ao--   1.88t                                           
  root           pve         -wi-ao--  96.00g                                           
  swap           pve         -wi-ao--  23.00g                                           
fbc241 s012 ~ # qm list
qemu-img: Could not open '/dev/drbd-fbc241/vm-100-disk-1': No such file or directory
      VMID NAME                 STATUS     MEM(MB)    BOOTDISK(GB) PID       
       100 wheezy-kvm           stopped    1024               0.00 0         
       102 sarge-edi            running    256                2.01 6017      
       103 etch-kvm-20          running    512                4.00 5768      
       115 squeeze-kvm          running    1024               9.01 6298      
      1001 ltsp-term-KVM        stopped    512                0.00 0         
      8016 fbc16-x2go           running    4096               8.00 4906      
      8030 fbc30-edi-x2go       running    1024               8.00 5183
 
I just noticed this:

qemu-img: Could not open '/dev/drbd-fbc241/vm-100-disk-1': No such file or directory

That is from a different KVM; trying to start that one results in:

Code:
fbc241 s012 ~ # qm start 100
Executing HA start for VM 100
Member fbc241 trying to enable pvevm:100...Aborted; service failed
command 'clusvcadm -e pvevm:100 -m fbc241' failed: exit code 254
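Since HA start/stop goes through rgmanager, the real reason is usually easier to find in the cluster logs than in the task output. A couple of hedged checks (assuming the default log locations on the PVE 2.x cman/rgmanager stack):

Code:
clustat                                    # shows whether rgmanager marked the service "failed"
tail -n 50 /var/log/cluster/rgmanager.log  # rgmanager's own log, if present
grep pvevm /var/log/syslog | tail -n 20    # resource agent messages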

Note that on the 2 drbd systems (fbc240 and fbc241) I had to add a NIC, so at the same time I was testing HA and migration for the cluster.
 
More info:
Code:
fbc241 s012 ~ # fence_tool ls
fence domain
member count  3
victim count  0
victim now    0
master nodeid 1
wait state    none
members       1 2 4 

fbc241 s012 ~ # clustat
Cluster Status for fbcluster @ Sat May  5 11:09:25 2012
Member Status: Quorate

 Member Name                                       ID   Status
 ------ ----                                       ---- ------
 fbc240                                                1 Online, rgmanager
 fbc246                                                2 Online, rgmanager
 fbc241                                                4 Online, Local, rgmanager

 Service Name                             Owner (Last)                             State         
 ------- ----                             ----- ------                             -----         
 pvevm:100                                (fbc241)                                 failed        
 pvevm:101                                (fbc240)                                 failed        
 pvevm:102                                fbc241                                   started       
 pvevm:1023                               fbc240                                   started       
 pvevm:103                                fbc241                                   started       
 pvevm:115                                fbc241                                   started       
 pvevm:200                                fbc240                                   started       
 pvevm:2091                               fbc240                                   started
 
Code:
fbc241 s012 ~ # clusvcadm -d pvecm:100
Local machine disabling pvecm:100...Service does not exist

Code:
fbc240 s009 /etc/pve/qemu-server # clusvcadm -d pvecm:100
Local machine disabling pvecm:100...Service does not exist
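Side note: clustat lists the resources as pvevm:<vmid>, while the two commands above use pvecm:<vmid>, which is presumably why clusvcadm answers "Service does not exist". The usual disable / re-enable cycle for a failed rgmanager service would look something like this (a sketch, using the pvevm: names from clustat):

Code:
clusvcadm -d pvevm:100              # disable the failed service
clusvcadm -e pvevm:100 -m fbc241    # re-enable it, preferring node fbc241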

Next I'll try the suggestion from http://forum.proxmox.com/threads/9351-HA-errors-and-odd-behaviour-when-adding-servers-to-HA-Cluster?highlight=exit+code+254
"edited /etc/vpe/cluster/cluster.conf to remove debug l..."
 
OK, that debug edit did not work; I did not find the line.

So I tried:
1- remove 100 from the cluster on the PVE web page
2- start it from the CLI, which got:
Code:
qm start 100
VM is locked (backup)

so
Code:
qm unlock 100
qm start 100

that started it.
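For future reference, a quick way to confirm a stale lock before unlocking (assuming qm config prints the lock: line while one is set):

Code:
qm config 100 | grep ^lock    # shows e.g. "lock: backup" while the lock is set
ps aux | grep [v]zdump        # make sure no backup job is actually still running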

I think the issue was caused by rebooting the systems during the weekly backup. My mistake; I had forgotten that the backups were scheduled for the time I came in to add the NIC, which left the VM with a stale backup lock.

So I'll add 100 back to the cluster...

Next, try to get 101 to migrate...
 
OK, I got 101 to migrate.

First I tried:
qm unlock 101

but the HA migrate still failed the same way:
Code:
Executing HA migrate for VM 101 to node fbc241
Trying to migrate pvevm:101 to fbc241...Temporary failure; try again
TASK ERROR: command 'clusvcadm -M pvevm:101 -m fbc241' failed: exit code 250


Then I tried removing 101 from HA in the cluster and migrating again, and got this:
Code:
May 05 11:33:44 starting migration of VM 101 to node 'fbc241' (10.100.100.241)
May 05 11:33:44 copying disk images
May 05 11:33:44 ERROR: Failed to sync data - cant migrate local cdrom drive
May 05 11:33:44 aborting phase 1 - cleanup resources
May 05 11:33:44 ERROR: migration aborted (duration 00:00:00): Failed to sync data - cant migrate local cdrom drive
TASK ERROR: migration aborted

So I thought there must be an ISO image attached instead of just a cdrom drive.

That was not the case; there was only an IDE cdrom.

So I deleted the cdrom hardware, and then:
Code:
May 05 11:34:42 starting migration of VM 101 to node 'fbc241' (10.100.100.241)
May 05 11:34:42 copying disk images
May 05 11:34:42 starting VM 101 on remote node 'fbc241'
May 05 11:34:43 starting migration tunnel
May 05 11:34:44 starting online/live migration on port 60000
May 05 11:34:46 migration status: active (transferred 53769KB, remaining 909240KB), total 2113920KB)
May 05 11:34:48 migration status: active (transferred 167394KB, remaining 807820KB), total 2113920KB)
May 05 11:35:03 migration status: completed
May 05 11:35:03 migration speed: 107.79 MB/s
May 05 11:35:05 migration finished successfuly (duration 00:00:23)
TASK OK
 
From last night's backup, this is the original 101.conf:
Code:
fbc241 s012 /bkup/rsnapshot-for-systems/daily.0/fbc241/etc/pve/nodes/fbc241/qemu-server # cat 101.conf
#will move mail here
#10.100.1.5   srv5.fantinibakery.com srv5  #  mail wheezy  kvm 2012-05-02
# on drbd for high availability
bootdisk: virtio0
cores: 4
cpu: host
ide2: cdrom,media=cdrom
memory: 2048
name: mail-system
net0: virtio=86:CF:B2:A1:41:7C,bridge=vmbr0
onboot: 1
ostype: l26
sockets: 1
virtio0: drbd-fbc241:vm-101-disk-1

and the current one:
Code:
fbc241 s012 /etc/pve/qemu-server # cat 101.conf
#will move mail here
#10.100.1.5   srv5.fantinibakery.com srv5  #  mail wheezy  kvm 2012-05-02
# on drbd for high availability
bootdisk: virtio0
cores: 4
cpu: host
memory: 2048
name: mail-system
net0: virtio=86:CF:B2:A1:41:7C,bridge=vmbr0
onboot: 1
ostype: l26
sockets: 1
virtio0: drbd-fbc241:vm-101-disk-1
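For what it's worth, I could presumably have made the same change from the CLI with qm set instead of deleting the drive (assuming the PVE 2.x syntax), which would leave an empty virtual CD drive that no longer references the host's physical cdrom:

Code:
qm set 101 -ide2 none,media=cdrom    # keep the virtual CD drive, drop the host cdrom passthrough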
 
"
ide2: cdrom,media=cdrom"
The above means a physical cdrom device with an attached cdrom. If you had instead used "ide2: none,media=cdrom", I am quite convinced the migration would have succeeded.
 
Thanks, this command works for me:

pvecm e 1

Many thanks. Great work with Proxmox!!!
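For reference: pvecm e appears to be an abbreviation of pvecm expected; setting the expected vote count to 1 lets a single node regain quorum so that /etc/pve becomes writable again, so use it with care on a multi-node cluster. The long form, as a minimal sketch:

Code:
pvecm expected 1    # set expected votes to 1 (what "pvecm e 1" abbreviates)
pvecm status        # check quorum and vote counts afterwards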
 
