CPU usage > 90% and vm unresponsive

Thread starter: Geoff (Guest)

Hi,

We're using the latest Proxmox 1.7 on Debian Lenny (2.6.32 bpo kernel) with KVM-based VMs. We have several two-node clusters.

The problem is that from time to time (every few days) a VM's CPU usage will spike above 90% and the VM becomes completely unresponsive, requiring a reset to recover. All the VMs have LVM-based disks and use virtio disks and NICs. Tracing the kvm process with strace gives (repeating):


read(18, "\16\0\0\0\0\0\0\0\376\377\377\377\0\0\0\0\0\0\0\0\0\0\0\0\3\0\0\0\0\0\0\0\0"..., 128) = 128
rt_sigaction(SIGALRM, NULL, {0x4bf7f0, ~[KILL STOP RTMIN RT_1], SA_RESTORER, 0x7f755ca41a80}, 8) = 0
write(17, "\1\0\0\0\0\0\0\0"..., 8) = 8
read(18, 0x7fffd986fa00, 128) = -1 EAGAIN (Resource temporarily unavailable)
timer_gettime(0x3, {it_interval={0, 0}, it_value={0, 0}}) = 0
timer_settime(0x3, 0, {it_interval={0, 0}, it_value={0, 250000}}, NULL) = 0
timer_gettime(0x3, {it_interval={0, 0}, it_value={0, 231582}}) = 0
select(19, [3 8 12 15 16 18], [], [], {1, 0}) = 1 (in [16], left {0, 999997})
read(16, "\1\0\0\0\0\0\0\0"..., 512) = 8
select(19, [3 8 12 15 16 18], [], [], {1, 0}) = 1 (in [18], left {0, 999837})
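
To see at a glance which calls dominate a repeating trace like the one above, the lines can be tallied by syscall name. This is only a rough Python sketch: the function name `tally_syscalls` is made up for illustration, and the sample lines are shortened copies of the trace in this post.

```python
import re
from collections import Counter

# Matches the syscall name at the start of an strace line, e.g. "read(18, ...".
SYSCALL_RE = re.compile(r"^(\w+)\(")


def tally_syscalls(lines):
    """Count how often each syscall name appears in strace output lines."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# Shortened copies of the trace lines above (arguments abbreviated).
SAMPLE_TRACE = [
    r'read(18, "\16\0\0\0"..., 128) = 128',
    r'rt_sigaction(SIGALRM, NULL, {...}, 8) = 0',
    r'write(17, "\1\0\0\0\0\0\0\0"..., 8) = 8',
    r'read(18, 0x7fffd986fa00, 128) = -1 EAGAIN (Resource temporarily unavailable)',
    r'timer_gettime(0x3, {...}) = 0',
    r'timer_settime(0x3, 0, {...}, NULL) = 0',
    r'timer_gettime(0x3, {...}) = 0',
    r'select(19, [3 8 12 15 16 18], [], [], {1, 0}) = 1 (in [16], left {0, 999997})',
    r'read(16, "\1\0\0\0\0\0\0\0"..., 512) = 8',
    r'select(19, [3 8 12 15 16 18], [], [], {1, 0}) = 1 (in [18], left {0, 999837})',
]

for name, count in tally_syscalls(SAMPLE_TRACE).most_common():
    print(name, count)
```

On a longer capture, a tally like this would show whether the process is mostly cycling through the same timer and select calls or doing real disk and network I/O.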


The version of the Proxmox software:

pve-manager: 1.7-11 (pve-manager/1.7/5470)
running kernel: 2.6.32-bpo.5-amd64
qemu-server: 1.1-28
pve-firmware: 1.0-10
libpve-storage-perl: 1.0-16
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: not correctly installed
vzprocps: not correctly installed
vzquota: 3.0.11-1


Any suggestions?

Thanks
Geoff
 
You do not run a Proxmox kernel, therefore no one here can help. Use a Proxmox kernel and install all packages needed for a fully functional Proxmox VE system.
 
To clarify:

1.) You do not run a kernel provided by proxmox
2.) You do not run the kvm binary provided by proxmox
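
Both points can be checked mechanically against the version listing. The following Python sketch makes two assumptions that come only from the listings in this thread: that Proxmox kernels carry a "-pve" suffix in their release string (as in "2.6.32-4-pve"), and that missing packages are reported with the text "not correctly installed". The function names are hypothetical.

```python
def is_pve_kernel(release):
    """Return True if a kernel release string looks like a Proxmox VE kernel.

    Assumption: PVE kernels end in "-pve", as in "2.6.32-4-pve".
    """
    return release.strip().endswith("-pve")


def broken_packages(version_output):
    """Return the package names reported as 'not correctly installed'."""
    broken = []
    for line in version_output.splitlines():
        name, sep, status = line.partition(":")
        if sep and "not correctly installed" in status:
            broken.append(name.strip())
    return broken


# Excerpt of the version listing from the first post.
FIRST_LISTING = """\
pve-manager: 1.7-11 (pve-manager/1.7/5470)
running kernel: 2.6.32-bpo.5-amd64
qemu-server: 1.1-28
vzdump: not correctly installed
vzprocps: not correctly installed
vzquota: 3.0.11-1
"""

print(is_pve_kernel("2.6.32-bpo.5-amd64"))
print(broken_packages(FIRST_LISTING))
```

On a live host the release string could be taken from `platform.release()` in the standard library instead of being typed in by hand.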
 
This often seems to happen when IO load is high on the Proxmox server (for example during a DRBD resync), but LVM doesn't report any issues. I can try with e1000 and IDE instead of virtio; unfortunately the freeze is unpredictable, so it may take some time before it happens again.

Thanks
Geoff
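
Since the freeze is unpredictable, one option is to sample the CPU usage of the kvm process and note the time when it crosses 90%, so the moment can be compared with IO load or a DRBD resync afterwards. Below is a minimal Python sketch that reads `/proc/<pid>/stat` on the host. The helper names are invented for illustration, and the PID of the kvm process has to be supplied by whoever runs it.

```python
import os
import time


def parse_cpu_ticks(stat_text):
    """Return utime + stime (in clock ticks) from the text of /proc/<pid>/stat."""
    # The command name (field 2) is wrapped in parentheses and may contain
    # spaces, so split on the last ')' before splitting on whitespace.
    fields = stat_text.rsplit(")", 1)[1].split()
    # fields[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(fields[11]) + int(fields[12])


def cpu_percent(ticks_before, ticks_after, seconds, clk_tck):
    """Convert a tick delta over an interval into a CPU percentage."""
    return 100.0 * (ticks_after - ticks_before) / (clk_tck * seconds)


def sample_cpu_percent(pid, interval=5.0):
    """Measure the CPU usage of one process over `interval` seconds."""
    clk_tck = os.sysconf("SC_CLK_TCK")
    path = "/proc/%d/stat" % pid
    with open(path) as f:
        before = parse_cpu_ticks(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_ticks(f.read())
    return cpu_percent(before, after, interval, clk_tck)
```

Calling `sample_cpu_percent(pid)` in a loop and logging a timestamp whenever the result exceeds 90 would give a record of when each spike started. The figure covers all threads of the process, so a guest with several virtual CPUs can legitimately exceed 100%.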
 
Thanks, I will use all the Proxmox-provided packages. Just as a side note, the reason for staying with the Debian kernel was Ksplice support.

Geoff
 
Geoff said: "Thanks I will use all the Proxmox provided packages, just as a side note the reason for staying with the Debian kernel was Ksplice support."

Or simply report the bugs to the right people. If you use the kvm binary from Debian, you should report the bugs to them.
 
Could do, but I must say using (most of) the Proxmox VE suite has been a great experience and I would rather stay with it. If that means using all the packages, I certainly have no problem doing that :)
 
Good morning,

Well, we've changed the packages to use Proxmox throughout:

pve-manager: 1.7-11 (pve-manager/1.7/5470)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.7-30
pve-kernel-2.6.32-4-pve: 2.6.32-30
qemu-server: 1.1-28
pve-firmware: 1.0-10
libpve-storage-perl: 1.0-16
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-10
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.13.0-3
ksm-control-daemon: 1.0-4


But when I arrived this morning, a VM (KVM-based) was frozen at 96% CPU and unresponsive via the console.

More detail on the VM itself: it's on an LV on DRBD (two-node Proxmox cluster), running Debian Lenny. Here is its config:

name: puppetmaster1.xxxxxxxxxxxxxxx
ide2: cdrom,media=cdrom
vlan0: virtio=0A:FF:B4:9E:87:B2
bootdisk: virtio0
virtio0: VM_USE:vm-126-disk-1
ostype: l26
memory: 512
onboot: 1
sockets: 1
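
For anyone comparing several guests, a config in this `key: value` layout is easy to read programmatically, for example to list which devices currently use virtio before trying e1000 or IDE. The Python sketch below is illustrative only: the helper names are invented, and the idea that a device counts as virtio when its key starts with "virtio" or its value starts with "virtio=" is an assumption drawn from this one config.

```python
def parse_vm_config(text):
    """Parse 'key: value' lines into a dict, splitting on the first colon only."""
    config = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            config[key.strip()] = value.strip()
    return config


def virtio_devices(config):
    """Return the config keys whose device appears to use virtio (assumed rule)."""
    return sorted(
        key
        for key, value in config.items()
        if key.startswith("virtio") or value.startswith("virtio=")
    )


# The config as posted above.
VM_CONFIG = """\
name: puppetmaster1.xxxxxxxxxxxxxxx
ide2: cdrom,media=cdrom
vlan0: virtio=0A:FF:B4:9E:87:B2
bootdisk: virtio0
virtio0: VM_USE:vm-126-disk-1
ostype: l26
memory: 512
onboot: 1
sockets: 1
"""

print(virtio_devices(parse_vm_config(VM_CONFIG)))
```

Splitting on the first colon only matters here, because values such as the MAC address and the volume name contain colons themselves.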

Any suggestions?

Thanks
Geoff
 
BTW, the KVM IRC channel recommends not using stock Debian Lenny as a guest with its 2.6.26 kernel, so I'm going to try the 2.6.32 backport. I will post if it solves the issue.