vzctl chkpnt kernel error

damien

New Member
Sep 22, 2009
When I try to migrate a CT, I get the following error:

May 14 15:31:37 starting migration of CT 102 to node 'wfsr011' (192.168.192.2)
May 14 15:31:37 container is running - using online migration
May 14 15:31:38 starting rsync phase 1
May 14 15:31:38 # /usr/bin/rsync -aH --delete --numeric-ids --sparse /var/lib/vz/private/102 root@192.168.192.2:/var/lib/vz/private
May 14 15:48:04 start live migration - suspending container
May 14 15:48:04 dump container state
May 14 15:48:04 # vzctl --skiplock chkpnt 102 --dump --dumpfile /var/lib/vz/dump/dump.102
May 14 15:48:04 Setting up checkpoint...
May 14 15:48:04 join context..
May 14 15:48:04 dump...
May 14 15:48:04 Can not dump container: Invalid argument
May 14 15:48:04 Error: page without mapping at b669e000@12084248
May 14 15:48:04 Error: dump_one_vma: funkey page
May 14 15:48:04 ERROR: Failed to dump container state: Checkpointing failed
May 14 15:48:04 aborting phase 1 - cleanup resources
May 14 15:48:04 removing copied files on target node
May 14 15:48:24 start final cleanup
May 14 15:48:24 ERROR: migration aborted (duration 00:16:48): Failed to dump container state: Checkpointing failed
TASK ERROR: migration aborted

I can see the same error in the kernel.log:
CPT ERR: ffff880598c5d000,102 :page without mapping at b669e000@12084248
CPT ERR: ffff880598c5d000,102 :dump_one_vma: funkey page


Do you have a solution for that? I found this bug that might be related: http://bugzilla.openvz.org/show_bug.cgi?id=203
But I can't disable vsyscall as explained.

My environment is:
root@wfsr010:~# uname -a
Linux wfsr010 2.6.32-11-pve #1 SMP Wed Apr 11 07:17:05 CEST 2012 x86_64 GNU/Linux

root@wfsr010:~# vzctl
vzctl version 3.0.30.2-11.git.aefc8ef


Thanks a lot in advance.
Regards
 
Please can you test with the latest kernel from the 'pvetest' repository?


Hi, I tested with 2.6.32-12-pve, same problem:

root@wfsr010:~# vzctl chkpnt 102
Setting up checkpoint...
suspend...
dump...
Can not dump container: Invalid argument
Error: page without mapping at b66b6000@23695640
Error: dump_one_vma: funkey page
Checkpointing failed

kern.log:
May 18 18:26:01 wfsr010 kernel: CPT ERR: ffff880600db8000,102 :page without mapping at b66b6000@23695640
May 18 18:26:01 wfsr010 kernel: CPT ERR: ffff880600db8000,102 :dump_one_vma: funkey page
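One way to narrow this down is to find which process in the container maps the address named in the CPT error. The sketch below assumes the hexadecimal value before the `@` (for example `b66b6000`) is a user-space virtual address, which the log does not confirm. The helper name `vma_containing` is hypothetical. It reads a `/proc/<pid>/maps`-style listing on stdin and prints the entries whose range covers the address.

```shell
#!/bin/sh
# Hypothetical helper: print the /proc/<pid>/maps entries (read from stdin)
# whose address range contains the hexadecimal address given as $1.
vma_containing() {
    addr=$((0x${1}))
    while read -r range rest; do
        # Skip anything that does not look like "start-end ...".
        case $range in
            *-*) ;;
            *) continue ;;
        esac
        start=$((0x${range%-*}))
        end=$((0x${range#*-}))
        if [ "$addr" -ge "$start" ] && [ "$addr" -lt "$end" ]; then
            printf '%s %s\n' "$range" "$rest"
        fi
    done
}

# Possible use inside the running container (after "vzctl enter 102"):
#   for m in /proc/[0-9]*/maps; do
#       vma_containing b66b6000 < "$m" 2>/dev/null | sed "s|^|$m: |"
#   done
```

Several processes can map the same address, so a match is only a hint about which service to stop for a test, not proof of the cause.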
 
Please also post the output of 'pveversion -v'.
 


root@wfsr010:~# pveversion -v
pve-manager: 2.1-1 (pve-manager/2.1/f9b0f63a)
running kernel: 2.6.32-12-pve
proxmox-ve-2.6.32: 2.1-68
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-12-pve: 2.6.32-68
pve-kernel-2.6.32-6-pve: 2.6.32-55
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.3-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.8-3
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.7-2
pve-cluster: 1.0-26
qemu-server: 2.0-39
pve-firmware: 1.0-16
libpve-common-perl: 1.0-27
libpve-access-control: 1.0-21
libpve-storage-perl: 2.0-18
vncterm: 1.0-2
vzctl: 3.0.30-2pve5
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.0-9
ksm-control-daemon: 1.1-1
 
How can I reproduce that bug? What OS template/software do you run?


Hello,

I used this template for the CT:
http://download.openvz.org/template/precreated/ubuntu-10.04-x86.tar.gz

I updated it (apt-get dist-upgrade) and modified it slightly (mostly Apache config, MySQL, etc.). It is a LAMP server.


A few notes:
_ When the CT is stopped, I can migrate it without problems.
_ I can reproduce this problem on multiple Proxmox hosts.
_ I ran some tests with an old Gentoo CT a few weeks ago and did not encounter this problem.

Here is its config:
root@wfsr010:~# more /etc/vz/conf/102.conf
ONBOOT="yes"


PHYSPAGES="0:917504"
SWAPPAGES="0:512M"
KMEMSIZE="1708130304:1879048192"
DCACHESIZE="853540864:939524096"
LOCKEDPAGES="458752"
PRIVVMPAGES="unlimited"
SHMPAGES="unlimited"
NUMPROC="unlimited"
VMGUARPAGES="0:unlimited"
OOMGUARPAGES="0:unlimited"
NUMTCPSOCK="unlimited"
NUMFLOCK="unlimited"
NUMPTY="unlimited"
NUMSIGINFO="unlimited"
TCPSNDBUF="unlimited"
TCPRCVBUF="unlimited"
OTHERSOCKBUF="unlimited"
DGRAMRCVBUF="unlimited"
NUMOTHERSOCK="unlimited"
NUMFILE="unlimited"
NUMIPTENT="unlimited"


# Disk quota parameters (in form of softlimit:hardlimit)
DISKSPACE="50G:55G"
DISKINODES="10000000:11000000"
QUOTATIME="0"
QUOTAUGIDLIMIT="0"


# CPU fair scheduler parameter
CPUUNITS="1000"
CPUS="2"
HOSTNAME="wfsv082.XXX.com"
SEARCHDOMAIN="XXX.com"
NAMESERVER="91.121.55.XX 213.186.33.99"
IP_ADDRESS="87.98.186.XX"
VE_ROOT="/var/lib/vz/root/$VEID"
VE_PRIVATE="/var/lib/vz/private/102"
OSTEMPLATE="ubuntu-10.04-x86.tar.gz"
 
I tested with another CT built from a 64-bit template, and it has the same problem.
I don't know if that helps.

thanks in advance
 
I tested:
# setarch `uname -m` -R vzctl chkpnt 102
and:
# sysctl -w kernel.randomize_va_space=0

as seen here:
www.acsu.buffalo.edu/~charngda/x86assembly.html

but it changes nothing; I still get the same error:

root@wfsr010:/boot# sysctl -w kernel.randomize_va_space=0
kernel.randomize_va_space = 0
root@wfsr010:/boot# setarch `uname -m` -R vzctl chkpnt 102
Setting up checkpoint...
suspend...
dump...
Can not dump container: Invalid argument
Error: page without mapping at b6696000@155457400
Error: dump_one_vma: funkey page
Checkpointing failed

 
Did you find anything about this problem? What should I try? I'm stuck on Proxmox 1.9 for now and cannot upgrade until this issue is resolved. Thanks in advance.
 



I found the culprit: snort. If you stop snort inside your container before the chkpnt, it works.
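The workaround can be sketched as a small wrapper that stops snort, checkpoints the container, restores it, and starts snort again. This is only a sketch under assumptions: the init script path `/etc/init.d/snort` is a guess for the Ubuntu 10.04 template, the names `run` and `checkpoint_cycle` are hypothetical, and the dump file path is taken from the migration log above. By default it runs in dry-run mode and only prints the commands.

```shell
#!/bin/sh
# Sketch: checkpoint a container with snort stopped, then bring it back.
# DRY_RUN=1 (the default) only prints the commands instead of running them.
SNORT_INIT=${SNORT_INIT:-/etc/init.d/snort}   # assumed path inside the CT

run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

checkpoint_cycle() {
    ctid=$1
    dump="/var/lib/vz/dump/dump.$ctid"
    # Stop snort inside the container before dumping its state.
    run vzctl exec "$ctid" "$SNORT_INIT" stop || return 1
    if ! run vzctl chkpnt "$ctid" --dumpfile "$dump"; then
        # Checkpoint failed: the CT is still running, so restart snort.
        run vzctl exec "$ctid" "$SNORT_INIT" start
        return 1
    fi
    run vzctl restore "$ctid" --dumpfile "$dump" || return 1
    run vzctl exec "$ctid" "$SNORT_INIT" start
}
```

Running `checkpoint_cycle 102` prints the four commands for review; setting `DRY_RUN=0` executes them. For a real migration, the same idea applies: stop snort in the CT, start the online migration from Proxmox, then start snort again on the target node.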
 
