CT checkpoint failed: Interrupted system call

mr.Scamp

New Member
May 26, 2013
Rivne, Ukraine
Hi!

It seems that some CTs cannot be suspended during a backup or migration.
Code:
root@app1:~# vzctl chkpnt 111
Setting up checkpoint...
        suspend...
Can not suspend container: Interrupted system call
Error: timed out (10 seconds).
Error: Unfrozen tasks (no more than 10): see dmesg output.
Checkpointing failed
Here is the dmesg output:
Code:
CPT ERR: ffff88033b65c000,111 :timed out (10 seconds).
CPT ERR: ffff88033b65c000,111 :Unfrozen tasks (no more than 10): see dmesg output.
saslauthd     D ffff88061609e2c0     0  9553   9551  111 0x00800004
 ffff880616797dd8 0000000000000082 0000000000000000 ffffffff8104dc65
 0000000116797e48 0000000000000000 0000000000000000 ffffffffa0441720
 0000000000000286 000000010006a01c ffff880616797fd8 ffff880616797fd8
Call Trace:
 [<ffffffff8104dc65>] ? __wake_up_common+0x55/0x90
 [<ffffffff8109b31e>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa0434ea3>] dlm_posix_lock+0x193/0x360 [dlm]
 [<ffffffff8109b350>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa04f6599>] gfs2_lock+0x79/0xf0 [gfs2]
 [<ffffffff811f2123>] vfs_lock_file+0x23/0x40
 [<ffffffff811f27c3>] fcntl_setlk+0x143/0x2f0
 [<ffffffff8153ff4c>] ? thread_return+0xbc/0x870
 [<ffffffff811b2e07>] sys_fcntl+0xc7/0x550
 [<ffffffff81543115>] ? page_fault+0x25/0x30
 [<ffffffff8100b182>] system_call_fastpath+0x16/0x1b
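To make the trace easier to read, here is a minimal Python sketch that pulls the unfrozen task and its reliable stack frames out of dmesg text like the above. The function name `parse_unfrozen_tasks` and the reading of the task line columns (command, state, task pointer, free stack, PID, PPID, CTID, flags) are my assumptions based on this one sample, not a documented format. Frames marked with `?` are skipped because the kernel flags them as unreliable.

```python
import re

# Task header line, e.g.
# "saslauthd     D ffff88061609e2c0     0  9553   9551  111 0x00800004"
# Column meaning is an assumption inferred from this single sample.
TASK_RE = re.compile(
    r"^(?P<comm>\S+)\s+(?P<state>[A-Z])\s+[0-9a-f]{16}\s+\d+\s+"
    r"(?P<pid>\d+)\s+(?P<ppid>\d+)\s+(?P<ctid>\d+)\s+0x[0-9a-f]+$"
)
# Stack frame line, e.g. "[<ffffffffa04f6599>] gfs2_lock+0x79/0xf0 [gfs2]"
FRAME_RE = re.compile(
    r"^\[<[0-9a-f]+>\]\s+(?P<unreliable>\?\s+)?(?P<sym>[\w.]+)\+0x"
)

# A few lines copied from the dmesg output in this post.
SAMPLE = """\
CPT ERR: ffff88033b65c000,111 :timed out (10 seconds).
saslauthd     D ffff88061609e2c0     0  9553   9551  111 0x00800004
Call Trace:
 [<ffffffff8109b31e>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa0434ea3>] dlm_posix_lock+0x193/0x360 [dlm]
 [<ffffffffa04f6599>] gfs2_lock+0x79/0xf0 [gfs2]
 [<ffffffff811f2123>] vfs_lock_file+0x23/0x40
 [<ffffffff811f27c3>] fcntl_setlk+0x143/0x2f0
 [<ffffffff811b2e07>] sys_fcntl+0xc7/0x550
"""


def parse_unfrozen_tasks(dmesg_text):
    """Return one dict per task header, with its reliable frame symbols."""
    tasks = []
    for raw in dmesg_text.splitlines():
        line = raw.strip()
        task = TASK_RE.match(line)
        if task:
            tasks.append({
                "comm": task["comm"],
                "state": task["state"],
                "pid": int(task["pid"]),
                "ctid": int(task["ctid"]),
                "frames": [],
            })
            continue
        frame = FRAME_RE.match(line)
        # Attach only frames not marked "?" to the most recent task.
        if frame and tasks and not frame["unreliable"]:
            tasks[-1]["frames"].append(frame["sym"])
    return tasks


if __name__ == "__main__":
    for t in parse_unfrozen_tasks(SAMPLE):
        print(t["comm"], t["state"], t["pid"], " <- ".join(t["frames"]))
```

Read this way, the task is in state D inside a file lock request that goes from `sys_fcntl` through `gfs2_lock` into `dlm_posix_lock`.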
I tried to reproduce the issue and found that any CT with a running saslauthd process inside cannot be suspended.

The file system used for the CTs is GFS2. The Proxmox version is:
Code:
root@app1:~# pveversion -v
proxmox-ve-2.6.32: 3.1-111 (running kernel: 2.6.32-24-pve)
pve-manager: 3.1-13 (running version: 3.1-13/262cf0b8)
pve-kernel-2.6.32-24-pve: 2.6.32-111
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-2
pve-cluster: 3.0-7
qemu-server: 3.1-4
pve-firmware: 1.0-23
libpve-common-perl: 3.0-6
libpve-access-control: 3.0-6
libpve-storage-perl: 3.0-13
pve-libspice-server1: 0.12.4-2
vncterm: 1.1-4
vzctl: 4.0-1pve3
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-17
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.0-2
root@app1:~# dpkg -l | grep gfs2
ii  gfs2-utils                       3.1.3-1                       amd64        Global file system 2 tools
 
This may be a GFS2-related problem, because the same CT is checkpointed successfully when its private area resides on a local ext3-formatted partition.
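Since the call trace shows the task waiting in a POSIX lock request (`fcntl_setlk`), the lock pattern itself can be sketched in a few lines of Python. This is only an illustration of a blocking `fcntl` lock between two processes, not saslauthd's actual code. The function name `demo_blocking_lock` and its `directory` parameter are hypothetical. Pointing `directory` at a GFS2 mount versus an ext3 mount might help compare the two, but I have not verified that it reproduces the checkpoint failure.

```python
import fcntl
import os
import select
import tempfile


def demo_blocking_lock(directory=None, wait_seconds=0.5):
    """Return (blocked_while_held, acquired_after_release).

    The parent holds an exclusive POSIX lock on a temp file while a forked
    child issues a blocking request for the same lock.
    """
    with tempfile.NamedTemporaryFile(dir=directory) as lock_file:
        # Parent takes an exclusive lock on the whole file.
        fcntl.lockf(lock_file, fcntl.LOCK_EX)

        read_end, write_end = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: POSIX locks are per process, so it does not inherit
            # the parent's lock and has to wait for it.
            try:
                os.close(read_end)
                fd = os.open(lock_file.name, os.O_RDWR)
                fcntl.lockf(fd, fcntl.LOCK_EX)  # blocks until released
                os.write(write_end, b"x")       # report success to parent
            finally:
                os._exit(0)

        os.close(write_end)
        # While the parent holds the lock, the child should stay silent.
        ready, _, _ = select.select([read_end], [], [], wait_seconds)
        blocked_while_held = not ready

        # Release the lock; the child should now acquire it and report.
        fcntl.lockf(lock_file, fcntl.LOCK_UN)
        acquired_after_release = os.read(read_end, 1) == b"x"

        os.waitpid(pid, 0)
        os.close(read_end)
        return blocked_while_held, acquired_after_release


if __name__ == "__main__":
    print(demo_blocking_lock())
```

While the child is waiting, it is the kind of task the checkpoint has to freeze. On a local file system that wait is handled entirely by the local kernel, whereas in the trace above it passes through `gfs2_lock` and `dlm_posix_lock`.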
 
