High loads and freezes with latest 2.6.32-7-pve kernel

rahman

Renowned Member
Nov 1, 2010
Hi,

I have a cluster of 4 nodes with FC-SAN shared storage on LVM, which worked perfectly for 2 years. We have 29 VMs, all KVM (7 Windows 2008 R2 Server, 3 Windows 2003 Server, and the rest Debian). Yesterday I upgraded the nodes. After that, most of the VMs started to show high loads, and 3 of them (Debian) froze at 100% load. Stopping and then starting the VMs clears the freeze, but they freeze again after running for 10-15 minutes. These VMs have very low load and are idle most of the time.

Booting the nodes with 2.6.32-4-pve kernel solved the problems immediately.

kvm-44:~# pveversion -v
pve-manager: 1.9-26 (pve-manager/1.9/6567)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.9-55+ovzfix-2
pve-kernel-2.6.32-4-pve: 2.6.32-33
pve-kernel-2.6.32-7-pve: 2.6.32-55+ovzfix-2
qemu-server: 1.1-32
pve-firmware: 1.0-15
libpve-storage-perl: 1.0-19
vncterm: 0.9-2
vzctl: 3.0.29-3pve1
vzdump: 1.2-16
vzprocps: 2.0.11-2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.15.0-2
ksm-control-daemon: 1.0-6
 
It seems other people have also had trouble with this kernel: http://forum.proxmox.com/threads/7023-Proxmox-VE-1-9-released!/page2?p=39824

I ran "top" inside the VM and on the host at the same time. On the host, the VM process eats CPU and its usage changes rapidly (12%, 30%, 69%, 36%, 45%, 90%) while the VM is completely idle.

So how can I debug further? What should I look for? Are we really stuck with 2.6.32-4-pve? It would be really good to have a KVM-only kernel based on the Debian kernel, without the OpenVZ bits, for those who use KVM only, because both of the Red Hat based kernels, 2.6.32-6-pve and 2.6.32-7-pve, cause problems for some of us.
 
You need to find a way to reproduce that bug. We can only debug if we can reproduce the behavior here.

Well, after a few tens of minutes, one of the VMs froze again. I can't even ping it. "top" on the host shows a "kvm" process with constant CPU usage above 200% (it sometimes maxes out at 294% but never drops below 200%). There are no errors in the dmesg output on the host. I suspected KSM, but I disabled it and the behavior is still the same.
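For anyone who wants to record the host-side numbers rather than watch "top" interactively, here is a minimal sketch. It assumes the procps versions of pgrep and top are installed and that the VM processes are named "kvm", as in the output described above. The function name sample_kvm_cpu is made up for this example.

```shell
# Print a few batch-mode "top" snapshots restricted to the kvm processes.
# Assumption: VM processes are named exactly "kvm" (hypothetical helper name).
sample_kvm_cpu() {
    samples=${1:-5}     # number of snapshots to take
    interval=${2:-2}    # seconds between snapshots
    # pgrep -x matches the exact process name, -d, joins the PIDs with commas
    pids=$(pgrep -d, -x kvm) || { echo "no kvm process found" >&2; return 1; }
    # top accepts at most 20 PIDs in a -p list
    top -b -d "$interval" -n "$samples" -p "$pids"
}
```

Something like `sample_kvm_cpu 10 3 > kvm-cpu.log` would capture ten snapshots three seconds apart, which can then be compared with the load shown inside the guest.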

I will try to find a way to reproduce it, but this is a production setup, so I will need to switch back to the 2.6.32-4-pve kernel soon.

Edit: This time it froze 1-2 minutes after I stopped and started the VM. As you can see, the guest was showing no load until it froze. After it froze, the Java VNC terminal froze as well, and the guest no longer responds to ping or ssh. The other VMs also show random peaks in CPU load although they are idle too.

Here is the config file:
Code:
kvm47:~# cat /etc/qemu-server/113.conf
name: Giga-Debian
ostype: other
memory: 4096
onboot: 1
sockets: 2
boot: c
freeze: 0
cpuunits: 1000
acpi: 1
kvm: 1
vlan0: e1000=xx:xx:xx:xx:xx:xx
cores: 1
bootdisk: ide0
ide0: LVM6x450:vm-113-disk-1
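The number of virtual CPUs the guest sees is sockets multiplied by cores. A small sketch for reading that out of a config file like the one above follows. The helper name vcpu_count is hypothetical, and the sketch assumes each key appears at most once and defaults to 1 when missing.

```shell
# Print sockets x cores for a qemu-server style config file.
# Assumption: "sockets:" and "cores:" each appear at most once;
# a missing key is treated as 1.
vcpu_count() {
    conf=$1
    sockets=$(sed -n 's/^sockets:[[:space:]]*//p' "$conf")
    cores=$(sed -n 's/^cores:[[:space:]]*//p' "$conf")
    echo $(( ${sockets:-1} * ${cores:-1} ))
}
```

For the config above, `vcpu_count /etc/qemu-server/113.conf` prints 2 (sockets: 2, cores: 1), so this guest runs as an SMP guest.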

froze.png
 
Looks like it's about SMP in the guest; with one vCPU it runs without a problem. The guest is Debian 5 with the latest updates, serving apache2+php5+mysql. I just shut it down and restarted it with 2 vCPUs, and this time it froze again within a few seconds. I can reproduce it every time. I found this while googling: http://wp.libpf.com/?p=373 . I will try it and report back.

Edit: Yes, setting "clocksource=acpi_pm" in the guest did the trick. It is now running with two vCPUs without any abnormal CPU load. Hope this helps you to debug it.
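For reference, a guest's clocksource can be inspected and switched at runtime through sysfs. The sketch below assumes the standard sysfs location. The function names and the CLOCKSOURCE_DIR override are made up for this example; the override exists only so the functions can be tried against a copy of the files.

```shell
# Directory holding the kernel's clocksource files (override is hypothetical,
# intended for dry runs against a scratch directory).
CLOCKSOURCE_DIR=${CLOCKSOURCE_DIR:-/sys/devices/system/clocksource/clocksource0}

# Show what the guest uses now and what it could use.
show_clocksource() {
    echo "current:   $(cat "$CLOCKSOURCE_DIR/current_clocksource")"
    echo "available: $(cat "$CLOCKSOURCE_DIR/available_clocksource")"
}

# Switch the clocksource at runtime (needs root, lost on reboot).
set_clocksource() {
    name=$1
    # Refuse names the kernel does not list as available.
    case " $(cat "$CLOCKSOURCE_DIR/available_clocksource") " in
        *" $name "*) ;;
        *) echo "clocksource '$name' not available" >&2; return 1 ;;
    esac
    echo "$name" > "$CLOCKSOURCE_DIR/current_clocksource"
}
```

Run inside the guest as root, `set_clocksource acpi_pm` followed by `show_clocksource` should confirm the switch. To keep it across reboots, "clocksource=acpi_pm" has to go on the guest's kernel command line. On a stock Lenny guest that is presumably the kopt line in /boot/grub/menu.lst, followed by update-grub.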
 

Do you run a 2.6.32 kernel inside Debian Lenny? If not, can you try it? (I remember some issues with 2.6.26 in Lenny.)
 
The 2.6.32 kernel from lenny-backports fixed the freeze issue. Thanks.
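To check several guests quickly for an older kernel, a small version comparison can help. This is only a sketch: kernel_at_least is a made-up helper, it compares just the leading x.y.z numbers of the release string, and the release strings in the comments are illustrative.

```shell
# Succeed if kernel release $1 (e.g. "2.6.26-2-amd64") is at least
# version $2 (e.g. "2.6.32"). Only the first three numeric fields count.
kernel_at_least() {
    have=${1%%-*}    # drop everything from the first "-" onward
    want=${2%%-*}
    i=1
    while [ "$i" -le 3 ]; do
        # the appended "." guarantees cut always finds a delimiter
        h=$(echo "$have." | cut -d. -f"$i")
        w=$(echo "$want." | cut -d. -f"$i")
        h=${h%%[!0-9]*}; w=${w%%[!0-9]*}   # keep leading digits only
        h=${h:-0}; w=${w:-0}               # missing field counts as 0
        [ "$h" -gt "$w" ] && return 0
        [ "$h" -lt "$w" ] && return 1
        i=$((i + 1))
    done
    return 0    # all three fields equal
}
```

Inside a guest, `kernel_at_least "$(uname -r)" 2.6.32 || echo "kernel older than 2.6.32"` flags the ones still running an older kernel.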
 
great, thanks for feedback!
 