VMs Hanging

  • Thread starter: james.cookie

Hi,

I am having a problem with a Proxmox server and its VMs. The VMs stop responding after a while: sometimes they stay up for a couple of days, sometimes only a few hours. I have a cron job that records the time every minute so that I can see how long each VM stays up, plus external monitoring to let me know when one goes down.
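For reference, a minimal sketch of what such a per-minute timestamp job could look like. The function name, the script path in the crontab comment and the log path are all assumptions made for illustration; the actual job used on these VMs is not shown in this thread.

```shell
#!/bin/sh
# Hypothetical heartbeat helper; names and paths are illustrative only.
# Example crontab entry (runs once a minute), assuming the function below
# is wrapped in a script at this made-up path:
#   * * * * * /usr/local/bin/heartbeat.sh

# Append one line "epoch-seconds YYYY-MM-DD HH:MM:SS TZ" to the given log file.
# A single date call is used so both fields describe the same instant.
record_heartbeat() {
    log_file="$1"
    date '+%s %Y-%m-%d %H:%M:%S %Z' >> "$log_file"
}
```

Calling `record_heartbeat /var/log/heartbeat.log` (path assumed) from the cron script leaves one line per minute, so the last line before a gap shows roughly when the VM stopped responding.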

Every time a VM enters this state and I SSH to it to see what's going on, I notice that the system clock has jumped forward several centuries, as this is among the messages when I log on:

Code:
System information as of Mon May 30 04:39:20 BST 2596

Then it's asking me to change my password as it has (unsurprisingly) expired. A quick "Reset" through the UI and everything is back to normal.
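A clock jump like this can also be spotted after the fact in a timestamp log. The following is only a sketch, assuming a log in which every line starts with the time in epoch seconds; the function name and the 120-second default threshold are assumptions, not part of the setup described here.

```shell
# Report every pair of consecutive log lines whose leading epoch timestamps
# differ by more than max_gap seconds, or where time goes backwards.
# Assumes the first whitespace-separated field of each line is epoch seconds.
find_clock_jumps() {
    log_file="$1"
    max_gap="${2:-120}"   # threshold in seconds; 120 is an arbitrary default
    awk -v max="$max_gap" '
        NR > 1 {
            gap = $1 - prev
            if (gap > max || gap < 0)
                printf "jump of %.0f s between line %d and line %d\n", gap, NR - 1, NR
        }
        { prev = $1 }
    ' "$log_file"
}
```

With a job that writes once a minute, any reported gap much larger than 60 seconds marks either downtime or a clock jump; a jump of several centuries shows up as a gap of billions of seconds.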

Obviously, having to reset these VMs is getting annoying and I want to find a solution.

To see whether it was a Proxmox software problem, I set up a test machine with the same OS as the others (Ubuntu Server 10.04) and accessed it only via the VNC console. I did not install anything (so just a vanilla server) and added the same cron job to record a timestamp. This machine has never had the problem, which leads me to conclude that it must be something I have installed on the other machines. However, I have another Proxmox server running a similar set of VMs and it has never had a problem.

The only things I have installed on the VMs are as follows (plus all their dependencies):

VM1
openssh-server
openjdk-6-jre-headless
tomcat6
apache2
libapache2-mod-jk
mailutils

VM2
openssh-server
openjdk-6-jre-headless
tomcat6
postgresql

The Proxmox version is:

Code:
# pveversion -v
pve-manager: 1.9-24 (pve-manager/1.9/6542)
running kernel: 2.6.32-6-pve
pve-kernel-2.6.32-6-pve: 2.6.32-43
qemu-server: 1.1-32
pve-firmware: 1.0-13
libpve-storage-perl: 1.0-19
vncterm: 0.9-2
vzctl: 3.0.28-1pve5
vzdump: 1.2-15
vzprocps: 2.0.11-2
vzquota: 3.0.11-1dso1



I'm hoping someone out there has had a similar problem and can point the way to some help!

Regards
James

PS. I have already checked my clock source: cat /sys/devices/system/clocksource/clocksource0/current_clocksource returns kvm-clock.
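For anyone repeating this check inside a guest, here is a small sketch that prints both the active clock source and the alternatives the kernel offers. The function name is made up; `current_clocksource` and `available_clocksource` are the standard Linux sysfs files in that directory. It only reads the values and is not meant to suggest that switching the source fixes the hang.

```shell
# Print the current and available clock sources found under the given sysfs
# directory (defaults to the standard Linux location).
show_clocksources() {
    dir="${1:-/sys/devices/system/clocksource/clocksource0}"
    echo "current:   $(cat "$dir/current_clocksource")"
    echo "available: $(cat "$dir/available_clocksource")"
}
```

Running `show_clocksources` with no argument on the guest reads the real sysfs files; the optional directory argument exists only so the function can be tried against a copy.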
 
Your pveversion -v output shows that you are missing some important packages, e.g. kvm (and some of the ones you do have are old).

Fix it by updating to the latest stable version, see http://pve.proxmox.com/wiki/Downloads#Update_a_running_Proxmox_Virtual_Environment_to_1.9

Do you run KVM guests or containers?
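A quick way to check for missing packages like this is to look for their names in the pveversion -v output. The sketch below assumes the "name: version" line format seen in this thread; the function name is hypothetical, and the package names in the usage comment are simply the ones that appear only in the updated output further down, not an official list of required packages.

```shell
# Read `pveversion -v` style output on stdin and report which of the named
# packages have no "name: version" line in it.
missing_packages() {
    input=$(cat)
    for pkg in "$@"; do
        # Exact match on the text before the first colon of each line.
        printf '%s\n' "$input" |
            awk -F: -v p="$pkg" '$1 == p { found = 1 } END { exit !found }' ||
            echo "missing: $pkg"
    done
}
# Usage on a Proxmox host (package list is an assumption taken from this thread):
#   pveversion -v | missing_packages pve-qemu-kvm ksm-control-daemon
```

No output means every named package has a line in the listing.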


Thanks for the reply. I have just done what you suggested and will wait to see if that improves the situation. My pveversion output now reads:

Code:
# pveversion -v
pve-manager: 1.9-26 (pve-manager/1.9/6567)
running kernel: 2.6.32-6-pve
proxmox-ve-2.6.32: 1.9-50
pve-kernel-2.6.32-6-pve: 2.6.32-50
qemu-server: 1.1-32
pve-firmware: 1.0-14
libpve-storage-perl: 1.0-19
vncterm: 0.9-2
vzctl: 3.0.29-3pve1
vzdump: 1.2-16
vzprocps: 2.0.11-2
vzquota: 3.0.11-1dso1
pve-qemu-kvm: 0.15.0-1
ksm-control-daemon: 1.0-6

I run all my VMs as KVM guests.

After your post I checked my other server and it does have the 'pve-qemu-kvm' package that you pointed out was missing, so fingers crossed that by adding it the problem will now go away.

Regards
James