100% CPU on host, VM hang every night

joolsr

New Member
Jun 23, 2010
Hi

I have Proxmox 1.6 running with the 2.6.35 kernel. I have 5 VMs that run fine, but one is proving really problematic. Almost every day it hangs, i.e. it is not reachable over the network or via ssh. The only way to deal with it is to kill the KVM process, which usually ends up taking 100% of the CPU.

The problem tends to happen some time after a snapshot has taken place, but sometimes it's OK for a couple of days. I have not yet been able to find out what is going on at the time of the problem, and nothing useful is stored in the logs.

The guest uses Ubuntu 10.04 with Ubuntu desktop on top and uses 2 cpu sockets, and 1 cpu.

load average: 0.64, 0.72, 0.74
CPU(s) 4 x Intel(R) Xeon(R) CPU X3430 @ 2.40GHz
CPU Utilization 13.80%
IO Delays 0.00%
Version (package/version/build) pve-manager/1.6/5261
Kernel Version Linux 2.6.35-1-pve #1 SMP Tue Oct 26 11:05:44 CEST 2010

I thought I was getting somewhere when I saw

Nov 15 10:25:40 openerp kernel: [ 0.000000] ACPI Error: A valid RSDP was not found (20090903/tbxfroot-219)
Nov 15 10:25:40 openerp kernel: [ 0.000000] No NUMA configuration found
Nov 15 10:25:40 openerp kernel: [ 0.000000] Faking a node at 0000000000000000-00000000bb7fd000

in the VM logs, which is related to a bug I read about, but switching off ACPI in the guest VM config has made no difference.

I have also tried kernel 2.6.18, where it did run fine (and which I might have to return to), but 2.6.32 was problematic (i.e. it has the same issue). I'm only using KVM at the moment, but would like to stick to 2.6.32 or 2.6.35 if possible. I'm currently using 2.6.35, which again hasn't solved the issue. I could try kvm 0.13, but doubt it will fix the problem.

I would post pveversion but interestingly I get this error:-

vm:/var/log# pveversion
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LANG = "en_GB.utf8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
pve-manager/1.6/5261

Surely this isn't to blame? And how can I fix the locales?

If anyone has any ideas about the hanging guest please get in touch. It's been faulty for 6 weeks now, and I need to find a resolution.
 
can you try the latest 2.6.32 with kvm 0.13 (from pvetest repo)?

and post the full output of 'pveversion -v'
 
I will try kvm 0.13 later; the box is more a dev box at the moment.

Any idea why you believe kvm 0.13 might help? After all, it's a fairly plain Ubuntu server that has the problem ...

pveversion is currently

vm:/var/log# pveversion -v
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LANG = "en_GB.utf8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
pve-manager: 1.6-5 (pve-manager/1.6/5261)
running kernel: 2.6.35-1-pve
proxmox-ve-2.6.35: 1.6-7
pve-kernel-2.6.35-1-pve: 2.6.35-7
pve-kernel-2.6.18-2-pve: 2.6.18-5
qemu-server: 1.1-22
pve-firmware: 1.0-9
libpve-storage-perl: 1.0-14
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-8
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.12.5-2
ksm-control-daemon: 1.0-4
 
As I do not know your specific problem, I cannot point you to the solution. But each new kernel and each KVM version contains a lot of fixes, so I am pretty sure it makes sense to go for the latest version anyway.

You need to dig deeper: is there any high load, and can you reproduce/trigger the issue?

what do you get with pveperf?
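
One way to dig deeper is to log what the kvm processes are doing around the time of a hang. A minimal sketch, assuming the guest processes are named `kvm` on the host and that 90% is a sensible cut-off; both are assumptions, and the helper name `hot_procs` is made up for illustration:

```shell
# Filter "pid %cpu" lines, keeping those at or above the threshold in $1.
# Note: ps reports %CPU averaged over the process lifetime, not an
# instantaneous value, so a long-running guest reacts slowly.
hot_procs() {
    awk -v t="$1" '$2 + 0 >= t + 0 { print $1, $2 }'
}

# List all processes named "kvm" (assumed name) and log the busy ones
# with a timestamp. "|| true" keeps going when no kvm process exists.
{ ps -C kvm -o pid=,pcpu= 2>/dev/null || true; } | hot_procs 90 |
while read -r pid cpu; do
    echo "$(date '+%F %T') kvm pid=$pid cpu=$cpu%"
done
```

Run from cron every minute with the output appended to a file, this would leave a timestamped trail that can be compared against the snapshot schedule.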
 
Without upgrading to kvm 0.13, I get

vm:~# pveperf
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LANG = "en_GB.utf8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
CPU BOGOMIPS: 19151.36
REGEX/SECOND: 735113
HD SIZE: 94.49 GB (/dev/mapper/pve-root)
BUFFERED READS: 148.45 MB/sec
AVERAGE SEEK TIME: 9.27 ms
FSYNCS/SECOND: 1318.18
DNS EXT: 168.22 ms
DNS INT: 36.68 ms (q-par.com)

I hope I'm not asking a stupid question, but how do I actually upgrade to kvm 0.13? I thought I just had to wget a file, but as I don't yet see kvm 0.13 installed, I'm guessing I have to enable your testing repo? I did check the forums but couldn't see the answer.
 
OK

I have upgraded with the pvetest repo and am now running with:-

vm:/etc/apt# pveversion -v
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LANG = "en_GB.utf8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
pve-manager: 1.6-9 (pve-manager/1.6/5307)
running kernel: 2.6.35-1-pve
proxmox-ve-2.6.35: 1.6-7
pve-kernel-2.6.35-1-pve: 2.6.35-7
pve-kernel-2.6.18-2-pve: 2.6.18-5
qemu-server: 1.1-25
pve-firmware: 1.0-9
libpve-storage-perl: 1.0-16
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-9
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.13.0-2
ksm-control-daemon: 1.0-4

and now get :-

vm:/etc/apt# pveperf
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LANG = "en_GB.utf8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
CPU BOGOMIPS: 19151.36
REGEX/SECOND: 652895
HD SIZE: 94.49 GB (/dev/mapper/pve-root)
BUFFERED READS: 148.08 MB/sec
AVERAGE SEEK TIME: 8.75 ms
FSYNCS/SECOND: 1649.20
DNS EXT: 47.74 ms
DNS INT: 36.49 ms

some improvements it seems here ...
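
To put numbers on the difference between the two pveperf runs, here is a small sketch. The helper name `pct_change` is made up for illustration; the values are copied from the two outputs in this thread:

```shell
# Percentage change from an old value ($1) to a new value ($2).
pct_change() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%+.1f%%\n", (b - a) / a * 100 }'
}

pct_change 1318.18 1649.20   # FSYNCS/SECOND -> +25.1%
pct_change 735113 652895     # REGEX/SECOND  -> -11.2%
pct_change 168.22 47.74      # DNS EXT (ms)  -> -71.6%
```

These are single runs, so some of the difference is probably noise, and the DNS figure likely says more about the network at that moment than about KVM.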
 
Sorry, I misread your post. Did you mean upgrade to 2.6.32 with kvm 0.13, or did you mean 2.6.35 (which is what I ended up doing)?
 
Hi,
run "dpkg-reconfigure locales" and select the right locales.
This won't help with the KVM problem, but the command output looks better ;-)

What kind of display type do you use in the VM? I had a similar problem with vga=vmware: after some days the machine hung. With the normal setting it runs for months.

Udo
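
Before and after reconfiguring, it may help to check whether the locale named in LANG is actually generated on the host. A rough sketch; the helper names `norm` and `has_locale` are made up for illustration, and the comparison ignores case and hyphens so that `en_GB.UTF-8` and `en_GB.utf8` count as the same locale:

```shell
# Lower-case a locale name and drop hyphens (en_GB.UTF-8 -> en_gb.utf8).
norm() { printf '%s\n' "$1" | tr 'A-Z' 'a-z' | sed 's/-//g'; }

# Read a list of locale names on stdin; succeed if $1 is among them.
has_locale() {
    want=$(norm "$1")
    while IFS= read -r line; do
        [ "$(norm "$line")" = "$want" ] && return 0
    done
    return 1
}

# `locale -a` lists the locales installed on the system.
if locale -a 2>/dev/null | has_locale "${LANG:-C}"; then
    echo "${LANG:-C} is installed"
else
    echo "${LANG:-C} is missing; try: dpkg-reconfigure locales"
fi
```

If the perl warnings are still there after the locale shows up as installed, the shell session probably needs to be restarted to pick up the change.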
 
I've had similar problems with several linux guests. Solved basically with:

- "args: -no-hpet" in /etc/qemu/XXX.conf on the host
- appending "clocksource=acpi_pm" on the kernel line in grub on the guest

After those mods I had no freezes anymore.
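
To confirm the guest really picked up the new clock source after a reboot, something like the following could be run inside the guest. It is only a sketch: it assumes the standard Linux sysfs location for clocksource0, and the helper name `current_clocksource` and the `CS_DIR` override are made up for illustration:

```shell
# Standard sysfs directory for the first clocksource device (assumed present).
cs_dir="${CS_DIR:-/sys/devices/system/clocksource/clocksource0}"

# Print the active clock source found under directory $1, or "unknown".
current_clocksource() {
    cat "$1/current_clocksource" 2>/dev/null || echo unknown
}

echo "current:   $(current_clocksource "$cs_dir")"
echo "available: $(cat "$cs_dir/available_clocksource" 2>/dev/null || echo unknown)"

# Show whether clocksource= was actually passed on the kernel command line.
grep -o 'clocksource=[^ ]*' /proc/cmdline 2>/dev/null ||
    echo "no clocksource= option on the kernel command line"
```

With the grub change applied, the first line would be expected to report acpi_pm.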
 
Udo, you may have it !

I AM using vga=vmware

I changed to this when I upgraded to 2.6.32, as the default graphics aren't very good for me.

I will try this and keep my fingers crossed !

Many thanks for your help. Whether I keep running KVM 0.13 and the pvetest repo, I don't know.
 
Well, uptime is now nearly 4 days, far longer than the usual one day or so before the VM would hang. Looks like changing back to the normal VGA driver has fixed this - thanks Udo ;-)
 
