Unable to stop container: operation timed out

Well, I now have 2.6.18 running on our servers. As we use KVM and OpenVZ, we had to run:

Code:
aptitude install pve-qemu-kvm-2.6.18 pve-kernel-2.6.18-4-pve
and then check /boot/grub/menu.lst to make sure the 2.6.18 kernel is the default.

So I'll re-enable backups for all containers. I had been backing up all but the LDAP one.

If there are any problems, I'll post the details.

Also, this is our current config info:
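
For anyone who wants to script that check, here is a minimal sketch. It assumes a GRUB legacy menu.lst with a numeric "default N" line and entries introduced by "title"; the function name default_boot_title is made up for illustration.

```shell
#!/bin/sh
# Print the title of the default boot entry in a GRUB legacy menu.lst.
# Assumption: a numeric "default N" line (not "default saved").
default_boot_title() {
    menu="${1:-/boot/grub/menu.lst}"
    awk '
        $1 == "default" { def = $2 }           # e.g. "default 0"
        $1 == "title"   { titles[n++] = $0 }   # entries are counted from 0
        END {
            # Fail if there is no numeric default or it points past the last entry
            if (def !~ /^[0-9]+$/ || def + 0 >= n) exit 1
            t = titles[def + 0]
            sub(/^[ \t]*title[ \t]+/, "", t)
            print t
        }
    ' "$menu"
}
```

Running default_boot_title with no argument reads /boot/grub/menu.lst. If the printed title does not mention 2.6.18, the "default" line still points at another kernel.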
Code:
pveversion -v
pve-manager: 1.7-10 (pve-manager/1.7/5323)
running kernel: 2.6.18-4-pve
pve-kernel-2.6.18-4-pve: 2.6.18-10
qemu-server: 1.1-25
pve-firmware: 1.0-9
libpve-storage-perl: 1.0-16
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-9
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
 
Argh... I updated to Proxmox 1.7 only 6 days ago on my 10 servers, and I can't afford to take them down again :/

But at least we will find out whether it solves the problem.

regards
 
Yes, I know, but I can't afford to cut service for 300 clients. You can imagine how loudly they would scream at support ;)

I'm waiting for either a patch or an update; otherwise I will downgrade the servers on my side.

I especially want to know whether 2.6.18 is really better, and whether the Intel 82576 controller driver works well on the node and in the VMs (download bug at 5K/20K).
 
We have been running for over a week without the problem, so the 2.6.18 kernel seems to be more stable for us.
 
Hello,
We just updated to 2.6.32 and pve-manager 1.7, and an OpenVZ container running only httpd froze. We were unable to stop it cleanly, so we rebooted the hardware node.

The ways we tried to stop it:

Method 1:
- run "pstree -nup | grep init" to look up the PIDs of the init processes
- for each PID, run "vzpid pid" to find the one that belongs to your container
- once you have that PID, use pstree -nup to find and kill all the children of that init
For us, the frozen container's init had no children.
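
The last step of Method 1 can be sketched as below. This is only an illustration, assuming a ps that accepts "-e -o pid= -o ppid=" (procps does); the function name list_descendants is made up. It reads "PID PPID" pairs on stdin and prints every process below the given PID, not just the direct children.

```shell
#!/bin/sh
# Print all descendants of the PID given as $1.
# Input on stdin: one "PID PPID" pair per line.
list_descendants() {
    awk -v root="$1" '
        { parent[$1] = $2 }
        END {
            for (pid in parent) {
                p = pid
                depth = 0
                # Climb towards the top of the tree; the depth cap guards against loops
                while ((p in parent) && depth++ < 1000) {
                    p = parent[p]
                    if (p == root) { print pid; break }
                }
            }
        }
    ' | sort -n
}
```

Usage on the host, with the init PID found via vzpid: ps -e -o pid= -o ppid= | list_descendants 12345 (12345 is a placeholder). An empty result matches what we saw: nothing left to kill under the container's init.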

Method 2:
- remove ctid.lck from /var/lib/vz/lock
- run vzctl chkpnt ctid --kill
This did not work for us.

We will go back to 2.6.18 or try KVM for more stability...
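
A cautious sketch of the first step of Method 2, for reference. The default lock directory is the one quoted above; the function name remove_ct_lock and the numeric-CTID check are my own additions.

```shell
#!/bin/sh
# Remove a container lock file if it exists.
# Usage: remove_ct_lock CTID [LOCKDIR]
remove_ct_lock() {
    ctid="$1"
    lockdir="${2:-/var/lib/vz/lock}"
    # Refuse anything that is not a plain numeric container ID
    case "$ctid" in
        ''|*[!0-9]*) echo "usage: remove_ct_lock CTID [LOCKDIR]" >&2; return 2 ;;
    esac
    lock="$lockdir/$ctid.lck"
    if [ -e "$lock" ]; then
        rm -f -- "$lock" && echo "removed $lock"
    else
        echo "no lock file at $lock"
    fi
}
```

After remove_ct_lock 2152, the second step is the vzctl chkpnt 2152 --kill call from the list above (2152 is just an example CTID).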
 
Our system is still running normally; Proxmox is super stable.

Have you switched back to 2.6.18?


Yes, one node as a test.

But the other nodes show no bug :/

Kernel 2.6.18 doesn't run Arch Linux :/

Is a patch going to fix the Proxmox kernel?
 
I had this problem on 2.6.24 a couple of weeks ago, and now on 2.6.32 as well.

- On 2.6.24 everything was fine until the snapshot backup started. After that, none of the VEs could be stopped, and I wasn't even able to log in to the web interface (timeout).
- On 2.6.32 the web interface works, but none of the VEs can be stopped or restarted (timeout). The 2.6.32 system also had a load of at least 10, apparently caused by the kernel, since no task was showing high CPU usage.

Neither of them can be restarted via normal init process, only shutdown -n is able to reset the host ("do not go through 'init' but go down real fast.")

I reckon it's somehow connected to LVM and snapshots, because it only happens after snapshot backups.
 
Is that for OpenVZ or KVM? For us, OpenVZ restarts have worked. However, I have not tested all our VMs.

Also, I'm using the 2.6.18 series, as we mainly use OpenVZ. Here is init 6 in a container newly installed from the debian-6.0-standard_6.0-4 template:


root@fbc152 ~ # date
Thu Jun 2 12:31:16 EDT 2011
root@fbc152 ~ # init 6
root@fbc152 ~ # Connection to fbc152 closed by remote host.
Connection to fbc152 closed.
proxmox4: ~ # ssh fbc152
Linux fbc152 2.6.32-4-pve #1 SMP Wed Nov 24 05:32:29 CET 2010 i686
------------------------------------
vm 2152 fbc152 ldap slave server

fresh squeeze install
-----------------------------------
Last login: Thu Jun 2 12:31:08 2011 from 10.0.7.4
root@fbc152 ~ # date
Thu Jun 2 12:31:28 EDT 2011
 
I got the same result using vzctl. I wanted to check in case ssh and vzctl somehow behaved differently.


proxmox4: ~ # vzctl enter 2152
entered into CT 2152
root@fbc152 / # date
Thu Jun 2 12:35:30 EDT 2011
root@fbc152 / # init 6
root@fbc152 / # got signal 15
exited from CT 2152
proxmox4: ~ # ssh fbc152
Linux fbc152 2.6.32-4-pve #1 SMP Wed Nov 24 05:32:29 CET 2010 i686
------------------------------------
vm 2152 fbc152 ldap slave server

fresh squeeze install
-----------------------------------
Last login: Thu Jun 2 12:31:27 2011 from 10.0.7.4
root@fbc152 ~ # date
Thu Jun 2 12:35:40 EDT 2011
 
Here is a nice little script to quickly do method 1


Code:
#!/bin/sh
# Kill every direct child of the given parent PID.
echo "Enter the parent process ID"
read ppid
# Pass the PID to awk as a variable instead of splicing it into the program text.
for i in $(ps -ef | awk -v ppid="$ppid" '$3 == ppid { print $2 }')
do
    echo "killing $i"
    kill -9 "$i"
done

sadly .. I am unable to kill any child processes in my containers!!
 
