KVM suddenly not working on a node.

drrighteous

New Member
Dec 14, 2009
Hey all,

I have a cluster running Proxmox VE 1.5 (originally on the 2.6.18 kernel) on 6 physical nodes. Each node has a single Xeon E5520.

I had a small KVM guest (single socket/core, 256 MB RAM, 8 GB raw disk) running on one of the nodes when it suddenly crashed. I couldn't open a VNC connection to it, nor could I stop it. I had to kill the kvm process manually, and I rebooted the node for good measure.

After the reboot, the KVM guest gets stuck at the "GRUB loading, please wait." message, using 100% of its CPU. Yet if I migrate this guest to another node, it boots fine. Moving it back results in the same problem. I have already tried upgrading to proxmox-ve-2.6.24, with no change.

All the nodes are running the same setup, including:


virt-nyc02:~/var/lib/vz/images/111# pveversion -v
pve-manager: 1.5-9 (pve-manager/1.5/4728)
running kernel: 2.6.24-11-pve
proxmox-ve-2.6.24: 1.5-23
pve-kernel-2.6.24-11-pve: 2.6.24-23
pve-kernel-2.6.18-1-pve: 2.6.18-4
qemu-server: 1.1-14
pve-firmware: 1.0-5
libpve-storage-perl: 1.0-13
vncterm: 0.9-2
vzctl: 3.0.23-1pve11
vzdump: 1.2-5
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.12.4-1
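
Since the claim is that all nodes run the same setup, one way to double-check it is to diff the `pveversion -v` output of a working node against the problematic one. The following is only a sketch: the helper names `parse_pveversion` and `diff_versions` are made up for illustration, and the second node's values are hypothetical, not taken from this cluster.

```python
def parse_pveversion(text):
    """Parse `pveversion -v` output into a {name: version} dict."""
    versions = {}
    for line in text.splitlines():
        # Lines look like "qemu-server: 1.1-14"; anything else is skipped.
        name, sep, version = line.partition(": ")
        if sep:
            versions[name.strip()] = version.strip()
    return versions


def diff_versions(a, b):
    """Return {name: (version_a, version_b)} for every entry that differs."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# First sample is copied from the output above.
node_a = parse_pveversion("""\
pve-manager: 1.5-9 (pve-manager/1.5/4728)
running kernel: 2.6.24-11-pve
pve-qemu-kvm: 0.12.4-1
""")
# Hypothetical second node that is still on the older kernel.
node_b = dict(node_a, **{"running kernel": "2.6.18-1-pve"})

print(diff_versions(node_a, node_b))
```

An empty result would mean the package versions and running kernel really do match, which would point away from a version mismatch as the cause.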

And here is the kvm instance running on the node:

root 1427 98.7 0.2 485128 15796 ? R 15:18 3:28 /usr/bin/kvm -monitor unix:/var/run/qemu-server/104.mon,server,nowait -vnc unix:/var/run/qemu-server/104.vnc,password -pidfile /var/run/qemu-server/104.pid -daemonize -usbdevice tablet -name ipsec.nyc1.voipjet.com -smp sockets=1,cores=1 -nodefaults -boot menu=on -vga cirrus -tdf -k en-us -drive file=/var/lib/vz/images/104/vm-104-disk-2.raw,if=ide,index=0,boot=on -m 256 -net tap,vlan=0,ifname=vmtab104i0,script=/var/lib/qemu-server/bridge-vlan -net nic,vlan=0,model=rtl8139,macaddr=16:49:4F:9D:D1:F6 -id 104 -cpuunits 1000
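
Because the same guest boots on another node, it may help to confirm that the kvm process is started with identical options on both nodes. A rough sketch for breaking such a command line into comparable pieces is below. The function name `kvm_options` is hypothetical, the input is assumed to start at the binary path (the leading `ps` columns removed), and the sample is a shortened copy of the command above.

```python
import shlex


def kvm_options(cmdline):
    """Group a kvm command line into {flag: [values]}; bare flags map to [True]."""
    tokens = shlex.split(cmdline)[1:]  # drop the binary path
    options = {}
    i = 0
    while i < len(tokens):
        flag = tokens[i]
        # A flag takes a value if the next token does not itself look like a flag.
        has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("-")
        options.setdefault(flag, []).append(tokens[i + 1] if has_value else True)
        i += 2 if has_value else 1
    return options


# Shortened copy of the command line shown above.
sample = (
    "/usr/bin/kvm -daemonize -usbdevice tablet -smp sockets=1,cores=1 "
    "-nodefaults -vga cirrus "
    "-drive file=/var/lib/vz/images/104/vm-104-disk-2.raw,if=ide,index=0,boot=on "
    "-m 256 "
    "-net tap,vlan=0,ifname=vmtab104i0,script=/var/lib/qemu-server/bridge-vlan "
    "-net nic,vlan=0,model=rtl8139,macaddr=16:49:4F:9D:D1:F6 -id 104"
)

opts = kvm_options(sample)
print(opts["-m"], opts["-smp"], opts["-drive"])
```

Running this on the command line captured from each node and comparing the two dictionaries would show whether the guest is launched differently anywhere; if not, the difference is more likely on the host side.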

Can anyone advise?

Thanks, David
 
Do other KVM guests run without issues on this problematic host?

If you downgrade to 2.6.18, does it work again?
(apt-get install proxmox-ve-2.6.18)
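
After installing the older kernel package and rebooting, it is worth confirming which kernel the node actually came up on. A small sketch, assuming `pveversion -v` output in the format posted earlier in this thread; the function name `kernel_summary` is invented for illustration, and whether the node boots the older kernel by default depends on the bootloader configuration, which this does not check.

```python
def kernel_summary(pveversion_text):
    """Return (running_kernel, [installed pve kernels]) from `pveversion -v` output."""
    running = None
    installed = []
    for line in pveversion_text.splitlines():
        name, sep, value = line.partition(": ")
        if not sep:
            continue
        if name == "running kernel":
            running = value.strip()
        elif name.startswith("pve-kernel-"):
            # "pve-kernel-2.6.18-1-pve" -> "2.6.18-1-pve"
            installed.append(name[len("pve-kernel-"):])
    return running, installed


# Lines copied from the output posted in this thread.
sample = """\
running kernel: 2.6.24-11-pve
proxmox-ve-2.6.24: 1.5-23
pve-kernel-2.6.24-11-pve: 2.6.24-23
pve-kernel-2.6.18-1-pve: 2.6.18-4
"""

running, installed = kernel_summary(sample)
has_old_kernel = any(k.startswith("2.6.18") for k in installed)
on_old_kernel = running is not None and running.startswith("2.6.18")
print(running, installed, has_old_kernel, on_old_kernel)
```

For the output posted above, this reports that a 2.6.18 kernel is installed but the node is still running 2.6.24-11-pve.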
 
Hi
Today, after a server reboot, I had the very same issue: the KVM guest was hanging at "Press F12...". After downgrading the kernel to 2.6.18, it works fine again.

best
hk
 
Do other KVM guests run without issues on this problematic host?

If you downgrade to 2.6.18, does it work again?
(apt-get install proxmox-ve-2.6.18)

How safe is that to do? I have the same issue with a KVM hanging at the "hit F12" point, and I'm running 2.6.24-8-pve ... If one "downgrades" as you suggest, won't that break other things? And where does that leave you for the longer term ... you can't update/upgrade without hurting your ability to run KVMs?
 
Apparently it is safe enough ... but I'm feeling kind of dumb. I routinely do updates/upgrades thinking that is the safest way to keep things working, but I find out that 1) aptitude and apt-get never upgraded/updated my kernel 2.6.24-8-pve to 2.6.24-11-pve and 2) if it had upgraded it to 2.6.32, some other things would end up broken. As in, VMs will not run. Maybe should not do upgrades ....
 
