online migration issue with updated system (pve-manager/2.0/ff6cd700)

udo

Hi,
just updated the 3-node test cluster and wanted to use online migration to move a VM back to the right node, but I see the following error:
Code:
root@pve2-test2:~# qm migrate 401 pve2-test1 -online
storage 'a_sata_r0' is not available on node 'test'
but the destination is not called 'test' - the correct name is pve2-test1:
Code:
root@pve2-test1:~# hostname
pve2-test1
root@pve2-test1:~# cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
172.20.1.64 pve2-test1 pvelocalhost
172.20.2.64 pve2-test1.xxxx.com
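As a cross-check, the node names the cluster actually knows (which is what qm migrate validates against) can be listed; a quick sketch:
Code:
# node names as registered in the cluster - the migration target must match one of these
pvecm nodes
# the cluster filesystem also has one directory per node
ls /etc/pve/nodes/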
The storage is also available: it is a DRBD storage which exists on pve2-test1 and pve2-test2 (but not on pve2-test3).
Code:
root@pve2-test2:~# clustat 
Cluster Status for pvetest-cluster @ Fri Apr  6 22:56:28 2012
Member Status: Quorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 pve2-test3                                                          1 Online
 pve2-test2                                                          2 Online, Local
 pve2-test1                                                          3 Online
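For completeness, the per-node restriction of a storage is defined in /etc/pve/storage.cfg (the 'nodes' line); a sketch of how to inspect it, with the storage name taken from the error above:
Code:
# show the definition of the storage from the error message, including any 'nodes' restriction
grep -A5 'a_sata_r0' /etc/pve/storage.cfg
# storage status as seen from the local node
pvesm status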
Version on all nodes:
Code:
pveversion -v
pve-manager: 2.0-57 (pve-manager/2.0/ff6cd700)
running kernel: 2.6.32-11-pve
proxmox-ve-2.6.32: 2.0-65
pve-kernel-2.6.32-10-pve: 2.6.32-63
pve-kernel-2.6.32-11-pve: 2.6.32-65
lvm2: 2.02.88-2pve2
clvm: 2.02.88-2pve2
corosync-pve: 1.4.1-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.8-3
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.7-2
pve-cluster: 1.0-26
qemu-server: 2.0-36
pve-firmware: 1.0-15
libpve-common-perl: 1.0-25
libpve-access-control: 1.0-17
libpve-storage-perl: 2.0-17
vncterm: 1.0-2
vzctl: 3.0.30-2pve2
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.0-9
ksm-control-daemon: 1.1-1
Offline migration also fails with the same error.

Any hint?

Udo
 
CT migration does not work.
Code:
root@vzhost1:/# vzmigrate -r no -v --online 192.168.1.2 100
Starting migration of CT 100 to 192.168.1.2
OpenVZ is running...
   Loading /etc/vz/vz.conf and /etc/vz/conf/100.conf files
   Check IPs on destination node: 192.168.1.100
Preparing remote node
   Copying config file
scp: /etc/vz/conf/100.conf: File exists
Error: Failed to copy config file
But when I look at /etc/vz/conf on vzhost2 (192.168.1.2) I can't see this file.
Code:
root@vzhost1:/# ls -la /etc/vz/conf/100.conf | wc -l
0
Trying to touch this file:
Code:
root@vzhost1:/# touch /etc/vz/conf/100.conf
touch: cannot touch `/etc/vz/conf/100.conf': File exists
Offline migration does not work either.

P.S. The problem is in the 'vzmigrate' script: the command 'scp $vpsconf root@$host:$vpsconf' cannot copy the configuration file for the migrated CT.
 
just tested some CT migrations, no issues here (using local storage).
 
Two hosts: vzhost1 (192.168.1.1) and vzhost2 (192.168.1.2). I created a cluster (for live migration) and a test CT with ID 100 (IP 192.168.1.100) on vzhost1, then tried to copy the 100.conf file from one host to the other with scp.
Like this: scp /etc/vz/conf/100.conf root@192.168.1.2:/etc/vz/conf/100.conf
Result: FAIL, the file already exists.
 
You copy a file manually and yes, then it exists - but why do you do this? I do not really understand what you are testing here, please explain.

I just migrate CT online via gui, no issues here.
 
OK. Trying an online migration via the GUI:
Code:
Apr 07 19:23:50 starting migration of CT 10202 to node 'vzhost1' (192.168.10.99)
Apr 07 19:23:50 container is running - using online migration
Apr 07 19:23:50 container data is on shared storage 'CT'
Apr 07 19:23:50 start live migration - suspending container
Apr 07 19:23:50 dump container state
Apr 07 19:23:51 dump 2nd level quota
Apr 07 19:23:52 initialize container on remote node 'vzhost1'
Apr 07 19:23:52 initializing remote quota
Apr 07 19:23:52 # /usr/bin/ssh -c blowfish -o 'BatchMode=yes' root@192.168.10.99 vzctl quotainit 10202
Apr 07 19:23:52 vzquota : (error) quota check : stat /stor/ct/private/10202: No such file or directory
Apr 07 19:23:52 ERROR: online migrate failure - Failed to initialize quota: vzquota init failed [1]
Apr 07 19:23:52 start final cleanup
Apr 07 19:23:52 ERROR: migration finished with problems (duration 00:00:02)
TASK ERROR: migration problems
After that the CT had disappeared from the source host and appeared on the target in the stopped state. Trying to start it...
Code:
Starting container ...
Container private area /stor/ct/private/10202 does not exist
TASK ERROR: command 'vzctl start 10202' failed: exit code 43

Checking the private directory:
- on the source host, /stor/ct/10202 exists
- on the dest host, /stor/ct/10202 does not exist

Config file /etc/vz/conf/10202.conf:
- on the source host, it does not exist
- on the dest host, it exists

Moving 10202.conf back to the source host (the actual move is sketched after the log below) and trying to start:
Code:
Starting container ...
Initializing quota ...
Container is mounted
Adding IP address(es): 192.168.10.202
Setting CPU units: 1000
Setting CPUs: 1
Container start in progress...
TASK OK
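The move itself can be done through the cluster filesystem (see the symlink explanation in a reply further down); a sketch, where SOURCENODE is a placeholder since the log above only names the target node:
Code:
# move the CT config from the migration target back to the source node
# (SOURCENODE is a placeholder - the log only names the target, vzhost1)
mv /etc/pve/nodes/vzhost1/openvz/10202.conf /etc/pve/nodes/SOURCENODE/openvz/10202.conf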

P.S. Hostnames and IPs changed after a fresh installation.
 
I just migrated a test CT with the default private directory (local "/var/lib/vz") and saw no issues, but if the private directory is not the default, online migration fails.
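For context, the private directory is set per CT via VE_PRIVATE in its OpenVZ config (the default lives under /var/lib/vz/private); a sketch of the non-default case, with the path taken from the logs above:
Code:
# excerpt from /etc/vz/conf/10202.conf (sketch) - a non-default private area
VE_PRIVATE="/stor/ct/private/10202"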
 
Like this: scp /etc/vz/conf/100.conf root@192.168.1.2:/etc/vz/conf/100.conf
Result: FAIL, the file already exists.

We keep the configuration on a distributed file system, so that error is correct (it means the VM ID is already used in the cluster). For example, you can simply assign/move a VM config from one node to another with:

# mv /etc/pve/nodes/NODE1/openvz/100.conf /etc/pve/nodes/NODE2/openvz/100.conf

Please note that we use some symlinks to make life easier:

/etc/vz/conf ==> /etc/pve/openvz
/etc/pve/openvz ==> /etc/pve/nodes/LOCALNODENAME/openvz
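A quick way to see this chain on a node (a sketch; the second target depends on the local node name):
Code:
# follow the symlink chain
readlink /etc/vz/conf        # -> /etc/pve/openvz
readlink /etc/pve/openvz     # -> /etc/pve/nodes/LOCALNODENAME/openvz
# a config present under any node's directory occupies that VM ID cluster-wide
ls /etc/pve/nodes/*/openvz/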
 
Seems you marked your storage as 'shared'?

>Apr 07 19:23:50 container data is on shared storage 'CT'

So what kind of shared storage is that? I assume NFS?
 
CT is a local resource (LVs from volume group STOR). The relevant mounts:
Code:
/dev/mapper/stor-iso on /stor/iso type ext3 (rw)
/dev/mapper/stor-tmpl on /stor/tmpl type ext3 (rw)
/dev/mapper/stor-ct on /stor/ct type ext3 (rw)
 
Seems you marked your storage as 'shared'?

>Apr 07 19:23:50 container data is on shared storage 'CT'

So what kind of shared storage is that? I assume NFS?

Thanks! Unchecking 'Shared' for storage 'CT' solved the problem!
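For reference, a sketch of the same fix from the command line (the storage name 'CT' is from this thread):
Code:
# clear the 'shared' flag on the 'CT' storage - the CLI equivalent of the GUI checkbox
pvesm set CT -shared 0
# verify the storage definition afterwards
grep -A4 'CT' /etc/pve/storage.cfg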