openvz migration issue

bread-baker
I migrated VZ container 10003 from node fbc186 to fbc1. Here is the log:

Code:
Dec 11 10:28:09 starting migration of CT 10003 to node 'fbc1' (10.100.100.1)
Dec 11 10:28:09 container is running - using online migration
Dec 11 10:28:10 starting rsync phase 1
Dec 11 10:28:10 # /usr/bin/rsync -aH --delete --numeric-ids --sparse /var/lib/vz/private/10003 root@10.100.100.1:/var/lib/vz/private
Dec 11 10:29:59 start live migration - suspending container
Dec 11 10:29:59 dump container state
Dec 11 10:29:59 copy dump file to target node
Dec 11 10:30:00 starting rsync (2nd pass)
Dec 11 10:30:00 # /usr/bin/rsync -aH --delete --numeric-ids /var/lib/vz/private/10003 root@10.100.100.1:/var/lib/vz/private
Dec 11 10:30:07 dump 2nd level quota
Dec 11 10:30:07 copy 2nd level quota to target node
Dec 11 10:30:07 initialize container on remote node 'fbc1'
Dec 11 10:30:08 initializing remote quota
Dec 11 10:30:09 turn on remote quota
Dec 11 10:30:10 load 2nd level quota
Dec 11 10:30:10 starting container on remote node 'fbc1'
Dec 11 10:30:10 restore container state
Dec 11 10:30:11 removing container files on local node
Dec 11 10:30:12 start final cleanup
Dec 11 10:30:12 migration finished successfuly (duration 00:02:03)
TASK OK
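Fwiw, a minimal way to verify the result on the target node right after such a migration (just a sketch, assuming the PVE 2.0 layout where OpenVZ configs live under /etc/pve/openvz):

Code:
# verify the migrated CT on the target node (fbc1)
vzlist -a 10003                     # should show the CT and its state
ls -l /etc/pve/openvz/10003.conf    # PVE 2.0 keeps the OpenVZ config here
du -sh /var/lib/vz/private/10003    # private area should match the source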

After that I started the VZ on fbc1, checked syslog within the container, and all was working well.

The syslog of node fbc1 shows it started:
Code:
Dec 11 10:30:10 fbc1 kernel: CT: 10003: started

Later I added a NIC to node fbc1 and rebooted. After that I have this issue:

/etc/pve/conf/10003.conf does not exist.

but the private area for 10003 does:
Code:
root@fbc1 /var/log # du -sh /var/lib/vz/private/10003
1.5G    /var/lib/vz/private/10003
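Since /etc/pve is the clustered config filesystem with per-node directories, one thing worth checking (a sketch; node names as in this thread) is whether 10003.conf ended up under a different node's directory:

Code:
# search the clustered config tree for the missing CT config
find /etc/pve/nodes -name 10003.conf
# if it turns up under the old node, it can be moved into place, e.g.:
# mv /etc/pve/nodes/fbc186/openvz/10003.conf /etc/pve/nodes/fbc1/openvz/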
 
Here is the listing of /etc/pve:
Code:
root@fbc1 /etc/pve # ls *
authkey.pub  cluster.conf  cluster.conf.old  pve-root-ca.pem  pve-www.key  storage.cfg  vzdump.cron

local:
openvz  priv  pve-ssl.key  pve-ssl.pem  qemu-server

nodes:
fbc1  fbc186  fbc192

openvz:
14101.conf  14101.mount  148.conf  148.mount  155.conf  155.mount

priv:
authkey.key  authorized_keys  known_hosts  lock  pve-root-ca.key  pve-root-ca.srl

qemu-server:
10159.conf

and

Code:
root@fbc1 /etc # pveversion -v
pve-manager: 2.0-14 (pve-manager/2.0/6a150142)
running kernel: 2.6.32-6-pve
proxmox-ve-2.6.32: 2.0-54
pve-kernel-2.6.32-6-pve: 2.6.32-54
lvm2: 2.02.86-1pve2
clvm: 2.02.86-1pve2
corosync-pve: 1.4.1-1
openais-pve: 1.1.4-1
libqb: 0.6.0-1
redhat-cluster-pve: 3.1.7-1
pve-cluster: 1.0-12
qemu-server: 2.0-11
pve-firmware: 1.0-13
libpve-common-perl: 1.0-10
libpve-access-control: 1.0-3
libpve-storage-perl: 2.0-9
vncterm: 1.0-2
vzctl: 3.0.29-3pve7
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.0-1
ksm-control-daemon: 1.1-1
 
Just tried another migration, and it worked.

I have a node to reinstall [adding a RAID card ...], so we'll use migration to move the VZs.
 
I just migrated 5 OpenVZ containers to another node. It is great that this can be done!

I had to stop the VZs first, else I would get this:

Try with CT running:
Code:
Dec 16 19:22:20 starting migration of CT 128 to node 'fbc197' (10.100.100.197)
Dec 16 19:22:20 container is running - using online migration
Dec 16 19:22:20 starting rsync phase 1
Dec 16 19:22:20 # /usr/bin/rsync -aH --delete --numeric-ids --sparse /var/lib/vz/private/128 root@10.100.100.197:/var/lib/vz/private
Dec 16 19:29:19 start live migration - suspending container
Dec 16 19:29:19 dump container state
Dec 16 19:29:19 # vzctl --skiplock chkpnt 128 --dump --dumpfile /var/lib/vz/dump/dump.128
Dec 16 19:29:19 Setting up checkpoint...
Dec 16 19:29:19 join context..
Dec 16 19:29:19 dump...
Dec 16 19:29:19 Can not dump container: Invalid argument
Dec 16 19:29:19 Error: iptables-save exited with 255
Dec 16 19:29:19 ERROR: Failed to dump container state: Checkpointing failed
Dec 16 19:29:19 aborting phase 1 - cleanup resources
Dec 16 19:29:19 removing copied files on target node
Dec 16 19:29:25 start final cleanup
Dec 16 19:29:25 ERROR: migration aborted (duration 00:07:05): Failed to dump container state: Checkpointing failed
TASK ERROR: migration aborted


Try with CT stopped:
Code:
Dec 16 19:33:55 starting migration of CT 128 to node 'fbc197' (10.100.100.197)
Dec 16 19:33:55 starting rsync phase 1
Dec 16 19:33:55 # /usr/bin/rsync -aH --delete --numeric-ids --sparse /var/lib/vz/private/128 root@10.100.100.197:/var/lib/vz/private
Dec 16 19:36:13 dump 2nd level quota
Dec 16 19:36:13 copy 2nd level quota to target node
Dec 16 19:36:14 initialize container on remote node 'fbc197'
Dec 16 19:36:15 initializing remote quota
Dec 16 19:36:16 turn on remote quota
Dec 16 19:36:17 load 2nd level quota
Dec 16 19:36:17 turn off remote quota
Dec 16 19:36:17 removing container files on local node
Dec 16 19:36:19 start final cleanup
Dec 16 19:36:19 migration finished successfuly (duration 00:02:24)
TASK OK
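For reference, the failing step can be reproduced by hand on the source node without running a whole migration (a sketch using the same command the failed task log shows; chkpnt stage options as in vzctl 3.x):

Code:
# suspend the CT, then run the same dump step the migration task runs
vzctl chkpnt 128 --suspend
vzctl chkpnt 128 --dump --dumpfile /var/lib/vz/dump/dump.128
echo $?                      # non-zero reproduces "Checkpointing failed"
vzctl chkpnt 128 --resume    # resume the CT after the test
rm -f /var/lib/vz/dump/dump.128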
 
Dec 16 19:29:19 Can not dump container: Invalid argument
Dec 16 19:29:19 Error: iptables-save exited with 255
Dec 16 19:29:19 ERROR: Failed to dump container state: Checkpointing failed

Do you use iptables inside the container?
 
No, iptables is not installed:
Code:
aps iptables

p   arno-iptables-firewall                                     - single- and multi-homed firewall script with DSL/ADSL support        
p   iptables                                                   - administration tools for packet filtering and NAT                    
p   iptables-dev                                               - iptables development files                                           
p   iptables-persistent                                        - Simple package to set up iptables on boot                            
p   libiptables-chainmgr-perl                                  - Perl extension for manipulating iptables policies                    
p   libiptables-ipv4-ipqueue-perl                              - Perl extension for libipq                                            
p   libiptables-parse-perl                                     - Perl extension for parsing iptables firewall rulesets

aps is an alias for: alias aps='aptitude search'
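Even without the package installed in the CT, chkpnt runs iptables-save for the container anyway (that is what exited with 255 above), so one thing to check from the host (just a sketch) is whether any netfilter tables are loaded for the CT:

Code:
# list netfilter tables loaded in the CT's network context, if any
vzctl exec 128 cat /proc/net/ip_tables_names
# and check which iptables modules the host itself has loaded
lsmod | grep -E 'ip_tables|ipt_|xt_'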
 
Let me know if you need more info on this.

These are the processes on the system now:
Code:
root@approx /etc/network # ps -afx
Warning: bad ps syntax, perhaps a bogus '-'? See http://procps.sf.net/faq.html
  PID TTY      STAT   TIME COMMAND
    1 ?        Ss     0:00 init [2]      
    2 ?        S      0:00 [kthreadd/128]
    3 ?        S      0:00  \_ [khelper/128]
   41 ?        S      0:00 [init-logger]
  217 ?        Ss     0:00 /sbin/portmap
  272 ?        SNs    0:21 /usr/sbin/preload -s /var/lib/preload/preload.state
  306 ?        Sl     0:00 /usr/sbin/rsyslogd -c4
  379 ?        Ss     0:00 /usr/sbin/atd
  432 ?        Ss     0:00 /usr/bin/dbus-daemon --system
  439 ?        Ssl    0:00 /usr/sbin/nscd
  464 ?        Ss     0:00 /usr/sbin/cron
  473 ?        Ss     0:00 /usr/sbin/inetd
  554 ?        Ss     0:00 /usr/lib/postfix/master
  558 ?        S      0:00  \_ qmgr -l -t fifo -u
28894 ?        S      0:00  \_ pickup -l -t fifo -u -c
  563 ?        Ss     0:00 /usr/sbin/sshd -u0
 9439 ?        Ss     0:00  \_ sshd: root@pts/0     
 9442 pts/0    Ss     0:00      \_ -bash
10071 pts/0    R+     0:00          \_ ps -afx
20140 ?        Sl     0:00 /usr/sbin/monit -c /etc/monit/monitrc -s /var/lib/monit/monit.state

The purpose of this system is to be an approx (apt proxy cache) server.
 
inetd.conf:
Code:
ident           stream  tcp     wait    identd  /usr/sbin/identd        identd
9999            stream  tcp     nowait  approx  /usr/sbin/approx /usr/sbin/approx
9572                    stream  tcp     nowait  nobody /usr/sbin/tcpd /usr/sbin/nbdswapd
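(For context on the approx line above: client machines point apt at port 9999 on this box. A sketch of a client sources.list; the host name and the 'debian'/'security' mappings in /etc/approx/approx.conf are placeholders here:)

Code:
# /etc/apt/sources.list on a client machine
deb http://approx-host:9999/debian squeeze main contrib
deb http://approx-host:9999/security squeeze/updates main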