Slow restart of OpenVZ guests

zzhjkrqlne

Hi,

I'm seeing abnormal restart times on OpenVZ guests after upgrading to Proxmox 1.1 the other day, and was wondering if anyone else is seeing the same.

I followed the steps at http://pve.proxmox.com/wiki/Downloads to upgrade from 1.0, except that I recompiled the kernel from the source available at ftp://pve.proxmox.com/sources/pve-kernel-2.6.24_2009-01-15.tar.gz with a patch from http://openamt.svn.sourceforge.net/viewvc/openamt/amt-rescue-cd/patches/linux-2.6.25.rc8-ider.patch for IDE redirection support on the AMT platform.

On 1.0, a standard OpenVZ CentOS 5 guest takes an average of 10 seconds to restart, whereas on 1.1 the same restart takes 2 minutes.
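
For anyone who wants to compare restart times in a repeatable way, here is a minimal Python sketch that runs a command several times and averages the wall-clock duration. The helper name `time_command` and the container ID in the usage note are hypothetical, not something taken from this thread.

```python
import statistics
import subprocess
import time


def time_command(argv, runs=3):
    """Run argv `runs` times; return (mean seconds, list of per-run seconds)."""
    durations = []
    for _ in range(runs):
        start = time.monotonic()
        # check=True raises CalledProcessError if the command exits non-zero
        subprocess.run(argv, check=True)
        durations.append(time.monotonic() - start)
    return statistics.mean(durations), durations
```

On the host this could be called as `time_command(["vzctl", "restart", "101"])`, where 101 stands in for the real container ID.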

Running strace on the restart process on the Proxmox 1.1 installation shows the following (trimmed to the relevant parts only):

# strace -tt vzctl restart
...
10:12:54.169016 write(1, "Stopping container ...\n", 23) = 23
10:12:54.169085 gettimeofday({1233184374, 169104}, NULL) = 0
10:12:54.169120 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=785, ...}) = 0
10:12:54.169176 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=785, ...}) = 0
10:12:54.169233 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=785, ...}) = 0
10:12:54.169301 write(3, "2009-01-29T10:12:54+1100 vzctl :"..., 65) = 65
10:12:54.169379 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f19eb11d760) = 8148
10:12:54.169676 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
10:12:54.169739 rt_sigaction(SIGCHLD, NULL, {SIG_IGN}, 8) = 0
10:12:54.169795 nanosleep({1, 0}, {1, 0}) = 0
10:12:55.169914 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
10:12:55.169990 ioctl(4, 0x400c2e05, 0x7ffff3127360) = 0
10:12:55.170070 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
10:12:55.170127 rt_sigaction(SIGCHLD, NULL, {SIG_IGN}, 8) = 0
10:12:55.170182 nanosleep({1, 0}, {1, 0}) = 0

[the last 5 lines above keep repeating]

10:14:53.211823 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
10:14:53.211877 ioctl(4, 0x400c2e05, 0x7ffff3127360) = 0
10:14:53.211925 rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
10:14:53.211965 rt_sigaction(SIGCHLD, NULL, {SIG_IGN}, 8) = 0
10:14:53.212017 nanosleep({1, 0}, {1, 0}) = 0
10:14:54.212128 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
10:14:54.212199 ioctl(4, 0x400c2e05, 0x7ffff3127360) = 0
10:14:54.212279 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f19eb11d760) = 8371
10:14:54.212689 nanosleep({0, 500000000}, NULL) = 0
10:14:54.712771 ioctl(4, 0x400c2e05, 0x7ffff3127360) = -1 ESRCH (No such process)
10:14:54.712871 write(1, "Container was stopped\n", 22) = 22
10:14:54.713008 gettimeofday({1233184494, 713026}, NULL) = 0

As you can see above, it takes almost 2 minutes to shut down the guest, and vzctl spends that time sleeping one second at a time while it waits for something.
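
To put a number on that wait, a short Python sketch along these lines can read `strace -tt` output and report the time between the first and last timestamped lines, plus how many `nanosleep` calls it saw. The function names are made up for this example, and since the trace above is trimmed, the sleep count only covers lines actually present in the input.

```python
import re
from datetime import datetime, timedelta

# Matches the "HH:MM:SS.microseconds syscall(" prefix that strace -tt emits
TS = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6}) (\w+)\(")


def parse(lines):
    """Yield (timestamp, syscall name) for each line starting with a -tt timestamp."""
    for line in lines:
        m = TS.match(line)
        if m:
            yield datetime.strptime(m.group(1), "%H:%M:%S.%f"), m.group(2)


def summarize(lines):
    """Return (elapsed seconds, nanosleep count), or None if nothing parsed."""
    events = list(parse(lines))
    if not events:
        return None
    elapsed = events[-1][0] - events[0][0]
    if elapsed < timedelta(0):
        # -tt carries no date, so assume the trace crossed midnight once
        elapsed += timedelta(days=1)
    sleeps = sum(1 for _, name in events if name == "nanosleep")
    return elapsed.total_seconds(), sleeps
```

Saving the trace with `strace -tt -o trace.txt ...` and calling `summarize(open("trace.txt"))` would give both figures for a full, untrimmed run.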

Any suggestions as to what might be causing such a delay?
 
I guess that is related to the new init-logger implementation. Do you observe that on all templates, or only on CentOS?
 
I've only tested out CentOS 4 & CentOS 5 templates, as those are the only ones I'm using at this time. Tried restarting the CentOS 4 guest after reading your post, and it restarts much quicker at 10 seconds or so.

How does the new implementation of init-logger have such an impact on Proxmox 1.1 and not on 1.0 (and only on the CentOS 5 template), and is there any way around it that you can think of?

Thank you for your help.
 
What I found out so far is that it happens when one service does not correctly daemonize itself on startup. What services/programs do you run? Is there a way to reproduce that behaviour?
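
One way a badly daemonized service can stall something that logs its output is by keeping an inherited pipe open. Whether init-logger works exactly like this is an assumption on my part, so treat the following Python sketch only as an illustration of the general effect: a reader waiting for end-of-file on a pipe does not get it until every process holding the write end has closed it. The function name is hypothetical, and the example needs a POSIX `sh` and `sleep`.

```python
import subprocess
import time


def seconds_until_eof(shell_command):
    """Run shell_command with stdout on a pipe; time how long the reader waits for EOF."""
    start = time.monotonic()
    proc = subprocess.Popen(["sh", "-c", shell_command], stdout=subprocess.PIPE)
    # read() returns only once every holder of the pipe's write end has closed it,
    # including background children that outlive the shell itself
    proc.stdout.read()
    proc.wait()
    return time.monotonic() - start
```

Calling `seconds_until_eof("sleep 1 &")` blocks for about a second because the background `sleep` still holds the pipe, while `seconds_until_eof("sleep 1 >/dev/null 2>&1 &")` returns almost immediately because the background job has redirected its output away from the pipe.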
 
It's just a pretty plain CentOS 5 guest, with dnscache running on it via daemontools.

I haven't had a chance to look into this further yet, as I've reverted to the previous version for now. I'll try downloading a fresh copy of the appliance from http://download.proxmox.com/appliances/system/centos-5-standard_5.2-1_i386.tar.gz and see if I can reproduce this with it.
 
I tested a little more and found that daemontools was causing the CentOS 5 guest to restart slowly. Once daemontools was turned off, the CentOS 5 guest restarted in a matter of seconds.

No idea why, but I guess it's easier to find an equivalent program to replace daemontools on my setup.
 
