ERROR: online migrate failure - aborting

Denter · May 28, 2014

Hello.

We have a two-node cluster with external NFS storage, and I get this error whenever I try to migrate any VM online, on both nodes.
Offline migration works normally. Moving disks online to and from the NFS storage works too.

I don't want to reboot the host servers: that would require stopping some production VMs for offline migration (yes, we have a Basic subscription).

Any ideas or recommendations?

Code:

May 28 22:39:04 starting migration of VM 300 to node 'appcloud' (192.168.110.31)
May 28 22:39:04 copying disk images
May 28 22:39:04 starting VM 300 on remote node 'appcloud'
May 28 22:39:05 starting ssh migration tunnel
May 28 22:39:05 starting online/live migration on localhost:60000
May 28 22:39:05 migrate_set_speed: 8589934592
May 28 22:39:05 migrate_set_downtime: 0.1
May 28 22:39:07 ERROR: online migrate failure - aborting
May 28 22:39:07 aborting phase 2 - cleanup resources
May 28 22:39:07 migrate_cancel
May 28 22:39:08 ERROR: migration finished with problems (duration 00:00:04)
TASK ERROR: migration problems

Code:

root@appcloud:/var/log# qm config 300
bootdisk: ide0
cores: 2
ide0: NAS-NFS:300/vm-300-disk-1.qcow2,format=qcow2,size=8G
ide2: none,media=cdrom
memory: 1024
name: ix-manager
net0: e1000=FA:E3:6F:1D:09:5F,bridge=vmbr0
onboot: 1
ostype: l26
sockets: 1

Code:

root@appcloud:/var/log# pveversion -v
proxmox-ve-2.6.32: 3.2-126 (running kernel: 2.6.32-29-pve)
pve-manager: 3.2-4 (running version: 3.2-4/e24a91c1)
pve-kernel-2.6.32-29-pve: 2.6.32-126
pve-kernel-2.6.32-26-pve: 2.6.32-114
pve-kernel-2.6.32-18-pve: 2.6.32-88
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.5-1
pve-cluster: 3.0-12
qemu-server: 3.1-16
pve-firmware: 1.1-3
libpve-common-perl: 3.0-18
libpve-access-control: 3.0-11
libpve-storage-perl: 3.0-19
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-6
vzctl: 4.0-1pve5
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.7-8
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.2-1

Code:

root@app4:/mnt/pve# pveversion -v
proxmox-ve-2.6.32: 3.2-126 (running kernel: 2.6.32-28-pve)
pve-manager: 3.2-4 (running version: 3.2-4/e24a91c1)
pve-kernel-2.6.32-27-pve: 2.6.32-121
pve-kernel-2.6.32-28-pve: 2.6.32-124
pve-kernel-2.6.32-29-pve: 2.6.32-126
pve-kernel-2.6.32-26-pve: 2.6.32-114
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.5-1
pve-cluster: 3.0-12
qemu-server: 3.1-16
pve-firmware: 1.1-3
libpve-common-perl: 3.0-18
libpve-access-control: 3.0-11
libpve-storage-perl: 3.0-19
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-6
vzctl: 4.0-1pve5
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.7-8
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.2-1

Code:

root@app4:/etc/pve# cat storage.cfg 
dir: local
    path /var/lib/vz
    content images,iso,vztmpl,rootdir
    maxfiles 0

nfs: NAS-NFS
    path /mnt/pve/NAS-NFS
    server 192.168.110.39
    export /mnt/RAID50/proxmox
    options vers=3
    content images,iso,vztmpl,rootdir,backup
    nodes app4,appcloud
    maxfiles 3

Code:

root@app4:/etc/pve# pvecm status
Version: 6.2.0
Config Version: 2
Cluster Name: PM-Cluster-2
Cluster Id: 48988
Cluster Member: Yes
Cluster Generation: 8
Membership state: Cluster-Member
Nodes: 2
Expected votes: 2
Total votes: 2
Node votes: 1
Quorum: 2  
Active subsystems: 5
Flags: 
Ports Bound: 0  
Node name: app4
Node ID: 1
Multicast addresses: 239.192.191.28 
Node addresses: 192.168.110.30

Also, most pve commands in the console print this warning. I'm not sure whether it affects anything.

Code:

perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
    LANGUAGE = (unset),
    LC_ALL = (unset),
    LC_PAPER = "uk_UA.UTF-8",
    LC_ADDRESS = "uk_UA.UTF-8",
    LC_MONETARY = "uk_UA.UTF-8",
    LC_NUMERIC = "uk_UA.UTF-8",
    LC_TELEPHONE = "uk_UA.UTF-8",
    LC_IDENTIFICATION = "uk_UA.UTF-8",
    LC_MEASUREMENT = "uk_UA.UTF-8",
    LC_TIME = "uk_UA.UTF-8",
    LC_NAME = "uk_UA.UTF-8",
    LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
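
Warnings like this usually mean that one of the listed variables names a locale that has not been generated on the host. A minimal sketch for checking that, assuming you paste in the values from the warning and the output of `locale -a`; the function names and the `installed` list below are hypothetical, not output from these servers:

```python
def normalize(name):
    """Normalize a locale name so 'uk_UA.UTF-8' and 'uk_UA.utf8' compare equal."""
    lang, dot, codeset = name.partition(".")
    return lang + dot + codeset.lower().replace("-", "")


def missing_locales(settings, installed):
    """Return the requested locale names that are not in the installed list."""
    have = {normalize(n) for n in installed}
    wanted = {v for v in settings.values() if v}
    return sorted(n for n in wanted if normalize(n) not in have)


# A few values taken from the warning above.
settings = {
    "LC_PAPER": "uk_UA.UTF-8",
    "LC_TIME": "uk_UA.UTF-8",
    "LANG": "en_US.UTF-8",
}
# Hypothetical `locale -a` output; replace with the real list from the host.
installed = ["C", "POSIX", "en_US.utf8"]

print(missing_locales(settings, installed))
```

Any name it prints is requested by the environment but not available on the host.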

udo · May 29, 2014

Denter said:

Code:

root@appcloud:/var/log# pveversion -v
proxmox-ve-2.6.32: 3.2-126 (running kernel: 2.6.32-29-pve)
...

Code:

root@app4:/mnt/pve# pveversion -v
proxmox-ve-2.6.32: 3.2-126 (running kernel: 2.6.32-28-pve)
...

Hi,
OK, that is not possible without a reboot, but does live migration work when both nodes run the same kernel?

Do the hosts have different CPUs?

Denter said:
Also, most pve commands in the console print this warning ("perl: warning: Setting locale failed."). I'm not sure whether it affects anything.

Does it help if you run

Code:

dpkg-reconfigure locales

Udo

Denter · May 29, 2014

Hello, Udo.

udo said:
OK, not possible without a reboot, but does live migration work with the same running kernel?

If migrating between different kernels were not possible, there would be no way to upgrade a cluster without stopping VMs. That would make no sense.
We also have another cluster (for development) whose nodes run different kernel versions too (2.6.32-27-pve and 2.6.32-28-pve). Online migration works there; I have just checked it specifically.
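
For anyone comparing nodes the same way, here is a minimal sketch that pulls the running kernel out of the first line of `pveversion -v`. The two input strings are copied from this thread; the helper name `running_kernel` is only illustrative.

```python
import re


def running_kernel(pveversion_output):
    """Return the running kernel named in `pveversion -v` output, or None."""
    match = re.search(r"running kernel:\s*([^\s)]+)", pveversion_output)
    return match.group(1) if match else None


# First line of `pveversion -v` from each node, as posted in this thread.
appcloud = "proxmox-ve-2.6.32: 3.2-126 (running kernel: 2.6.32-29-pve)"
app4 = "proxmox-ve-2.6.32: 3.2-126 (running kernel: 2.6.32-28-pve)"

kernels = {"appcloud": running_kernel(appcloud), "app4": running_kernel(app4)}
same = len(set(kernels.values())) == 1
print(kernels, "same kernel" if same else "kernels differ")
```

This only reports whether the kernels differ; it says nothing about whether that difference is what breaks the migration.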

udo said:
Do you have different CPUs on the hosts?

One is an IBM x3650 M4 with two E5-2630 CPUs.
The other is an IBM x3550 M4 with two E5-2630 v2 CPUs.
Both have 64 GB RAM.
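
Since the E5-2630 and E5-2630 v2 are different CPU generations, it may be worth comparing the feature flags each host reports. A rough sketch, assuming you save the contents of `/proc/cpuinfo` from each host and pass them in as text; the function names are hypothetical and the sample flag lists are made up and shortened, not real output from these servers:

```python
def cpu_flags(cpuinfo_text):
    """Return the flag set from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def flag_diff(cpuinfo_a, cpuinfo_b):
    """Return (flags only on host A, flags only on host B), each sorted."""
    a, b = cpu_flags(cpuinfo_a), cpu_flags(cpuinfo_b)
    return sorted(a - b), sorted(b - a)


# Illustrative, shortened input; use the real /proc/cpuinfo text from each host.
host_a = "processor\t: 0\nflags\t\t: fpu sse2 avx\n"
host_b = "processor\t: 0\nflags\t\t: fpu sse2 avx f16c rdrand\n"

only_a, only_b = flag_diff(host_a, host_b)
print("only on A:", only_a)
print("only on B:", only_b)
```

A flag that exists on only one side is a candidate for trouble only if the guest actually sees it, which depends on the CPU type configured for the VM.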

However, online migration worked before. I noticed the problem after my NFS server went down and was restarted while a few VMs were running from it. All affected VMs were restarted, their disks were moved back to "local", and the NFS storage was even disconnected and remounted. None of that made a difference.

Is there any way to get more details about the errors? Some way to increase the logging level?

udo said:
Does it help if you run
dpkg-reconfigure locales

Yes, that removed the locale error. Thank you, Udo.
