Seems there is something wrong with the certificates - try (on both nodes):
# pvecm updatecerts --force
I guess you need to restart pvedaemon and apache2 (or simply reboot).
Thanks, that worked!
Yes. What kind of container is that exactly (how can I reproduce that bug?)
So, on host1 I create a container with its storage set to an NFS share, using the Ubuntu 11.04 template from the OpenVZ wiki. This is the output:
Code:
Creating container private area (ubuntu-11.04-x86_64.tar.gz)
Performing postcreate actions
/bin/cp: preserving permissions for `/var/lib/vz/root/116/etc/crontab.3000': Operation not supported
Saved parameters for CT 116
Container private area was created
TASK OK
The init.log files that are causing the error below are:
---s--S--T+ 1 root root 0 Jan 13 09:01 /var/lib/vz/root/116/var/log/init.log
---s--S--T+ 1 root root 0 Jan 13 09:01 /mnt/pve/HA_storage/private/116/var/log/init.log
Now I try to live-migrate it from host1 to host2, and the output is:
Code:
Jan 13 09:05:17 starting migration of CT 116 to node 'ks27489' (10.8.0.2)
Jan 13 09:05:17 container is running - using online migration
Jan 13 09:05:17 container data is on shared storage 'HA_storage'
Jan 13 09:05:17 start live migration - suspending container
Jan 13 09:05:17 dump container state
Jan 13 09:05:17 dump 2nd level quota
Jan 13 09:05:18 initialize container on remote node 'ks27489'
Jan 13 09:05:18 initializing remote quota
Jan 13 09:05:18 # /usr/bin/ssh -c blowfish -o 'BatchMode=yes' root@10.8.0.2 vzctl quotainit 116
Jan 13 09:05:18 vzquota : (error) Quota check : open 'init.log': Permission denied
Jan 13 09:05:18 ERROR: online migrate failure - Failed to initialize quota: vzquota init failed [1]
Jan 13 09:05:18 start final cleanup
Jan 13 09:05:18 ERROR: migration finished with problems (duration 00:00:01)
TASK ERROR: migration problems
The only difference I can see is that the init.log permissions of CT 116 are followed by a "+"; the init.log files of the other containers are not.
The result is that CT 116 is moved to the other node, it is down, and trying to start it gives me:
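A minimal sketch for checking that "+" more closely. In GNU `ls -l` output, a trailing "+" after the mode bits generally means the file carries an ACL or another alternate access method; whether that is what breaks `vzquota` here is an assumption, not something the logs confirm. The helper names `mode_has_plus` and `inspect_file` are made up for illustration, and `getfacl` is only called if the acl tools are installed.

```shell
#!/bin/sh
# Hypothetical helpers, not part of vzctl or Proxmox.

# Print "yes" if an `ls -l` mode string ends with the "+" marker, else "no".
mode_has_plus() {
    case "$1" in
        *+) echo yes ;;
        *)  echo no ;;
    esac
}

# Show the mode string of one file and, if getfacl is available, its ACL.
inspect_file() {
    f=$1
    if [ ! -e "$f" ]; then
        echo "missing: $f"
        return 1
    fi
    mode=$(ls -ld -- "$f" | awk '{print $1}')
    echo "$f mode=$mode plus=$(mode_has_plus "$mode")"
    if command -v getfacl >/dev/null 2>&1; then
        # May fail with "Operation not supported" on some NFS mounts.
        getfacl -- "$f" 2>&1
    fi
    return 0
}
```

Running `inspect_file /mnt/pve/HA_storage/private/116/var/log/init.log` on both nodes and comparing the result with the init.log of a container that migrates cleanly would show whether the ACL is the only difference.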
Code:
Starting container ...
Initializing quota ...
vzquota : (error) Quota check : open 'init.log': Permission denied
vzquota init failed [1]
TASK ERROR: command 'vzctl start 116' failed: exit code 61
PS: This happens only with live migration; offline migration works fine.
Another test: I tried to live-migrate a container from host1 to host2 that is stored on local storage rather than the shared one:
Code:
Jan 13 09:09:50 starting migration of CT 117 to node 'ks27489' (10.8.0.2)
Jan 13 09:09:50 container is running - using online migration
Jan 13 09:09:50 starting rsync phase 1
Jan 13 09:09:50 # /usr/bin/rsync -aH --delete --numeric-ids --sparse /var/lib/vz/private/117 root@10.8.0.2:/var/lib/vz/private
Jan 13 09:09:58 start live migration - suspending container
Jan 13 09:09:58 dump container state
Jan 13 09:09:58 copy dump file to target node
Jan 13 09:09:59 starting rsync (2nd pass)
Jan 13 09:09:59 # /usr/bin/rsync -aH --delete --numeric-ids /var/lib/vz/private/117 root@10.8.0.2:/var/lib/vz/private
Jan 13 09:09:59 dump 2nd level quota
Jan 13 09:09:59 copy 2nd level quota to target node
Jan 13 09:09:59 initialize container on remote node 'ks27489'
Jan 13 09:10:00 initializing remote quota
Jan 13 09:10:00 turn on remote quota
Jan 13 09:10:00 load 2nd level quota
Jan 13 09:10:00 starting container on remote node 'ks27489'
Jan 13 09:10:00 restore container state
That seemed to work, but the migration task is still running :/
Then I decided to click "stop" on the live migration. Output:
Code:
Jan 13 09:09:50 starting migration of CT 117 to node 'ks27489' (10.8.0.2)
Jan 13 09:09:50 container is running - using online migration
Jan 13 09:09:50 starting rsync phase 1
Jan 13 09:09:50 # /usr/bin/rsync -aH --delete --numeric-ids --sparse /var/lib/vz/private/117 root@10.8.0.2:/var/lib/vz/private
Jan 13 09:09:58 start live migration - suspending container
Jan 13 09:09:58 dump container state
Jan 13 09:09:58 copy dump file to target node
Jan 13 09:09:59 starting rsync (2nd pass)
Jan 13 09:09:59 # /usr/bin/rsync -aH --delete --numeric-ids /var/lib/vz/private/117 root@10.8.0.2:/var/lib/vz/private
Jan 13 09:09:59 dump 2nd level quota
Jan 13 09:09:59 copy 2nd level quota to target node
Jan 13 09:09:59 initialize container on remote node 'ks27489'
Jan 13 09:10:00 initializing remote quota
Jan 13 09:10:00 turn on remote quota
Jan 13 09:10:00 load 2nd level quota
Jan 13 09:10:00 starting container on remote node 'ks27489'
Jan 13 09:10:00 restore container state
Jan 13 09:32:26 # /usr/bin/ssh -c blowfish -o 'BatchMode=yes' root@10.8.0.2 vzctl restore 117 --undump --dumpfile /var/lib/vz/dump/dump.117 --skip_arpdetect
Jan 13 09:10:00 Restoring container ...
Jan 13 09:10:00 Starting container ...
Jan 13 09:10:00 Container is mounted
Jan 13 09:10:00 undump...
Jan 13 09:10:00 Setting CPU units: 1000
Jan 13 09:10:00 Setting CPUs: 1
Jan 13 09:10:00 Configure veth devices: veth117.0
Jan 13 09:10:00 Adding interface veth117.0 to bridge vmbr1 on CT0 for CT117
Jan 13 09:32:26 vzquota : (warning) Quota is running for id 117 already
Jan 13 09:32:26 ERROR: online migrate failure - Failed to restore container: interrupted by signal
Jan 13 09:32:26 removing container files on local node
Jan 13 09:32:27 start final cleanup
Jan 13 09:32:27 ERROR: migration finished with problems (duration 00:22:38)
TASK ERROR: migration problems
I don't understand why the timestamps are mixed up. The container is running on the other node, but if I try to open a console I get a black screen and, after 2 seconds, "Network error: remote side closed connection".
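To make the mixed-up timestamps easier to spot, here is a small sketch that prints every log line whose time is earlier than one already seen. It assumes the task log was saved to a file, that the time is the third whitespace-separated field (as in the lines above), and that all entries fall on the same day. The function name `find_out_of_order` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read a task log on stdin and print the lines whose
# HH:MM:SS timestamp (field 3) is earlier than the latest timestamp seen so far.
find_out_of_order() {
    awk '
        $3 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
            if ($3 < max) print      # older than an earlier line: out of order
            else max = $3            # remember the latest time seen
        }
    '
}
```

With the log above saved as `migrate-117.log` (a placeholder file name), `find_out_of_order < migrate-117.log` would list the 09:10:00 lines that follow the 09:32:26 `vzctl restore` command line.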
Trying to stop it gives:
Code:
TASK ERROR: command 'vzctl stop 117' failed: exit code 9
Regards