VMs crashing

Miktash

Mar 6, 2015
I have a couple of Proxmox nodes using shared storage (a NAS) over NFS.

I have been running Proxmox 3.3 for months without many problems. The only problem I had is that one VM, running Kerio Operator (a phone system running on Linux), crashes or freezes randomly, usually about once a week, and I couldn't find out why. It was the only VM crashing, so investigating it wasn't a priority.

Yesterday I upgraded all Proxmox nodes from 3.3 to 3.4. To do that, I live-migrated all VMs off each node in turn, so I could upgrade the nodes one by one without shutting down any VMs.

Today a different VM crashed. This VM had never crashed before: it ran for months on Proxmox 3.3, but it crashed on 3.4 within 24 hours.

Now I'm wondering: could this crash have been caused by the live migration between two different host versions? Maybe there's an incompatibility that I didn't know of?
Or is it more likely a bug in the 3.4 release, and should I expect more crashes now?

When this VM crashed, I couldn't SSH to it anymore, but it still replied to ping, so it didn't crash completely. The console showed just the Linux login prompt, with no kernel messages or anything similar, but it was totally unresponsive. I tried the "shutdown" button, but it failed with a timeout. Then I hit the "reset" button. It said "ok" but didn't do anything. So I used the "stop" button, which shut down the VM, and then hit the "start" button. Now the VM is running again.


Does anyone have any ideas about what could be wrong?
 
Just some more information about the Kerio Operator VM that usually crashes every week:

When this VM crashes, it does not respond to ping and the console is totally unresponsive too. Proxmox reports 100% CPU usage at that time. Hitting the "reset" button has always worked.

Are there any methods to debug this? I would like to have it fixed too.
 
Kerio Operator is running Linux kernel version 3.2.0-k4-kerio-686-pae.

My vmid.conf:

bootdisk: ide0
cores: 1
ide0: NAS:104/vm-104-disk-1.qcow2,format=qcow2,size=8001M
ide2: none,media=cdrom
memory: 1024
name: KerioOp1
net0: e1000=00:0C:29:2C:53:1D,bridge=vmbr1
ostype: l26
smbios1: uuid=1301c747-986a-4453-a87c-e8b41f02b31b
sockets: 1


I already tried both IDE and SCSI disks. Both have the same problem, so it doesn't seem to be related to the disk driver.

Kerio Operator's kernel doesn't support VirtIO...
 
Hmm, I think it must be a kernel bug, but as it's a custom kernel, you can't change it.


Maybe you can try adding this to your vmid.conf:


machine: pc-i440fx-2.0 (or 1.7, 1.8, ...)

This pins the VM to an older machine type, which hides the newer QEMU features from the guest.
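As a rough illustration of what that change amounts to, here is a minimal Python sketch that adds (or replaces) the `machine:` line in the text of a VM config. The function name `set_machine_type` is hypothetical, and the sketch assumes the config is plain `key: value` lines like the one posted above; it only transforms a string and does not touch any file on the host.

```python
def set_machine_type(config_text: str, machine: str) -> str:
    """Return config_text with exactly one 'machine: <machine>' line.

    Assumption: config_text is a flat list of 'key: value' lines,
    as in the vmid.conf shown earlier in this thread.
    """
    # Drop any existing machine line so the result has only one.
    kept = [line for line in config_text.splitlines()
            if not line.startswith("machine:")]
    # Put the new line first, ahead of every other key.
    return "\n".join([f"machine: {machine}"] + kept) + "\n"


if __name__ == "__main__":
    sample = "bootdisk: ide0\ncores: 1\nmemory: 1024\n"
    print(set_machine_type(sample, "pc-i440fx-2.0"), end="")
```

The VM has to be stopped and started again (not just reset from inside the guest) for a new machine type to take effect, since it is part of the QEMU command line.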
 
If it's a kernel bug, any idea why it doesn't crash on VMware ESXi then?

I could add machine: pc-i440fx-2.0 to the vmid.conf. But would you recommend 2.0, 1.7 or 1.8?

 
If it's a kernel bug, any idea why it doesn't crash on VMware ESXi then?

It could be a KVM-specific bug, for example.

I could add machine: pc-i440fx-2.0 to the vmid.conf. But would you recommend 2.0, 1.7 or 1.8?

Try the higher versions first. (Proxmox 3.4 ships QEMU 2.1, so you can use values below that.)

Available values are:

pc-i440fx-2.1
pc-i440fx-2.0
pc-i440fx-1.7
pc-i440fx-1.6
pc-i440fx-1.5
pc-i440fx-1.4
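To make the suggested order explicit, here is a small Python sketch that takes the list above and returns the machine types older than the host's QEMU version, newest first. The names `AVAILABLE`, `version_of` and `candidates` are made up for this example, and the default host version of 2.1 is taken from the note about Proxmox 3.4 above.

```python
AVAILABLE = [
    "pc-i440fx-2.1", "pc-i440fx-2.0", "pc-i440fx-1.7",
    "pc-i440fx-1.6", "pc-i440fx-1.5", "pc-i440fx-1.4",
]


def version_of(machine: str) -> tuple:
    """Turn 'pc-i440fx-2.0' into (2, 0) so versions compare numerically."""
    major, minor = machine.rsplit("-", 1)[1].split(".")
    return int(major), int(minor)


def candidates(available, host_qemu=(2, 1)):
    """Machine types strictly older than host_qemu, newest first."""
    older = [m for m in available if version_of(m) < host_qemu]
    return sorted(older, key=version_of, reverse=True)


if __name__ == "__main__":
    # Try these one at a time, keeping the first one that stays stable.
    for machine in candidates(AVAILABLE):
        print(machine)
```

Starting with the newest of the older types keeps the guest as close as possible to the current default, so each step back removes as few features as possible.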
 
