MP-BIOS bug: 8254 timer not connected to IO-APIC ???????

Nikole

I am getting this message on the console when I start some guests:

ACPI: Core revision 20060707
Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using the 'noapic' kernel parameter.


I have transferred (backup-restore) two KVMs from one server to another, and on the "new" server I get the above error when I start each of the KVMs.
I have searched the forums; some discussions relate to the speed at which KVMs are started when the node server boots, but here we are talking about an already running node. I never had this problem with those guests when they were on the other node.


If I STOP and START the KVMs (each one individually) it happens again, but after a few tries it works.
What is causing this? :confused:
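
For reference, the VM config check and the stop/start cycle from the node's shell look like this (VM ID 100 is just an example, not the real ID):

Code:
# VM ID 100 is an example - substitute the real ID.
# Show the CPU type and ACPI setting the guest is started with
# (no output from grep means the defaults are in use):
qm config 100 | grep -E 'cpu|acpi'
# Stop and start the guest, same as the GUI buttons:
qm stop 100
qm start 100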


--
root@prx-a:~# pveversion -v
proxmox-ve-2.6.32: 3.1-121 (running kernel: 2.6.32-27-pve)
pve-manager: 3.1-43 (running version: 3.1-43/1d4b0dfb)
pve-kernel-2.6.32-27-pve: 2.6.32-121
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.5-1
pve-cluster: 3.0-12
qemu-server: 3.1-15
pve-firmware: 1.1-2
libpve-common-perl: 3.0-13
libpve-access-control: 3.0-11
libpve-storage-perl: 3.0-19
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-6
vzctl: 4.0-1pve4
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.7-4
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.2-1
root@prx-a:~#
--


 
Hi,

Did you get this solved?

I am facing the same problem:

a) Back up the VM on host1.
b) Import the VM on host2.
c) The VM works perfectly on host1.
d) The VM fails to boot on host2 with the error: Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using the 'noapic' kernel parameter.
e) If I STOP and START the VM, after a few tries it boots correctly.


VM info:

Code:
[root@vm~]# cat /etc/redhat-release 
CentOS release 5.7 (Final)
[root@vm~]# uname -a
Linux vm 2.6.18-238.12.1.el5 #1 SMP Tue May 31 13:22:04 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
[root@vm~]#

Host1 info:

Code:
root@pve1:/etc/pve# pveversion -v
proxmox-ve-2.6.32: 3.2-124 (running kernel: 2.6.32-28-pve)
pve-manager: 3.2-2 (running version: 3.2-2/82599a65)
pve-kernel-2.6.32-28-pve: 2.6.32-124
pve-kernel-2.6.32-26-pve: 2.6.32-114
pve-kernel-2.6.32-23-pve: 2.6.32-109
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.5-1
pve-cluster: 3.0-12
qemu-server: 3.1-15
pve-firmware: 1.1-2
libpve-common-perl: 3.0-14
libpve-access-control: 3.0-11
libpve-storage-perl: 3.0-19
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-6
vzctl: 4.0-1pve5
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.7-6
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.2-1
root@pve1:/etc/pve#


Host2 info:

Code:
root@pve2:~# pveversion -v
proxmox-ve-2.6.32: 3.2-124 (running kernel: 2.6.32-28-pve)
pve-manager: 3.2-2 (running version: 3.2-2/82599a65)
pve-kernel-2.6.32-28-pve: 2.6.32-124
pve-kernel-2.6.32-26-pve: 2.6.32-114
pve-kernel-2.6.32-23-pve: 2.6.32-109
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.5-1
pve-cluster: 3.0-12
qemu-server: 3.1-15
pve-firmware: 1.1-2
libpve-common-perl: 3.0-14
libpve-access-control: 3.0-11
libpve-storage-perl: 3.0-19
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-6
vzctl: 4.0-1pve5
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.7-6
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.2-1
root@pve2:~#



Thanks in advance.

Best regards,
Joel.
 
Hi,

Just to give more info.

So far, we have tried:

a) tried different kernel boot parameters (noapic, etc.) --> no luck.
b) changed the CPU from the default (kvm64) to qemu64 --> no luck.

c) disabled ACPI in the VM options in the Proxmox GUI (CLI equivalent sketched below) --> so far this seems to fix the issue.
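
For reference, the same option can also be set from the node's shell with qm; this is only a sketch, and VM ID 100 is a placeholder:

Code:
# VM ID 100 is a placeholder - substitute the real ID.
# Disable ACPI for the guest (the same switch we toggled in the GUI):
qm set 100 --acpi 0
# Re-enable it later if needed:
qm set 100 --acpi 1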


Does anyone know the downsides of disabling ACPI in the VM options? How does this affect VM performance?

Best regards,
Joel.
 
If you disable ACPI you cannot use the Stop and Shutdown buttons in the GUI. Moreover, HA will not work since it relies on ACPI.
 
Hi mir,

Thanks for your response! Any clue how to get around this?

I really think it has to be something related to the guest OS (specifically CentOS 5.x, in my case 5.6), because VMs with newer OSes (CentOS 6.x, Ubuntu 12/14, Debian 6/7) never have this problem.

I don't understand why, after 3-5 tries of STOP/START, it normally ends up working.

I am afraid the moment will come when a VM reboots and doesn't manage to boot up again because the kernel panic appears every time. :(

Any ideas are more than welcome.

Best regards,
Joel.
 
Just to add extra info:

CPUs are not the same across the nodes... Our cluster has:

pve1: 2x Intel(R) Xeon(R) CPU E5-2667 0 @ 2.90GHz
pve2: 2x Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
pve3: 2x Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
pve4: 2x Intel(R) Xeon(R) CPU X5670 @ 2.93GHz
pve5: 2x Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
 
Since you are using different CPUs on your nodes, I am quite certain this is a problem with CPU flags. Your task is then to find the lowest common denominator CPU for your VMs. The lowest common denominator is qemu(32|64), so start by assigning qemu as the CPU for the troubled VM and see whether this solves your problem. If it does, you can try kvm(32|64) and so forth.
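
A rough sketch of those steps from the CLI (VM ID 100 is just an example):

Code:
# VM ID 100 is an example - use the troubled VM's ID.
# Start with the most generic CPU type:
qm set 100 --cpu qemu64
# If the VM then boots reliably, try the next, more featureful type:
qm set 100 --cpu kvm64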
 
Hi mir,

I understand, that would explain problems when doing live migrations between PVE nodes. But once you have actually "stopped" and "started" the VM, why does it keep happening?

Anyway, I've tried both qemu64 and kvm64 as the CPU and it still happens... (in fact, I hardly ever use "host" as the CPU setting, so that shouldn't be the problem in the first place).

Best regards,
Joel.
 
More info:

pve1: 2x Intel(R) Xeon(R) CPU E5-2667 0 @ 2.90GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid

pve2: 2x Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm ida arat dts tpr_shadow vnmi flexpriority ept vpid

pve3: 2x Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm ida arat dts tpr_shadow vnmi flexpriority ept vpid

pve4: 2x Intel(R) Xeon(R) CPU X5670 @ 2.93GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm ida arat dts tpr_shadow vnmi flexpriority ept vpid

pve5: 2x Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm ida arat dts tpr_shadow vnmi flexpriority ept vpid


Flags obtained from /proc/cpuinfo
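
To spot which flags pve1 exposes that the other nodes lack, the sorted flag lists can be compared; a sketch, assuming the temporary file names below (run the first command on each node, then copy the files to one place):

Code:
# Run on each node; the output file name is just an example.
grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2- | tr ' ' '\n' | grep -v '^$' | sort > /tmp/flags-pve1.txt
# With the files gathered on one node, list flags present on pve1 but missing on pve2:
comm -23 /tmp/flags-pve1.txt /tmp/flags-pve2.txt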


Best regards,
Joel.
 
I think your problem is that the new CPU, which is based on the Ivy Bridge architecture, is incompatible with the other CPUs, which are based on the Westmere architecture.
 
I have the same problem, but I think it is not related to the CPU model. The kernel panic started when I upgraded Proxmox from v3.1 to the latest v3.2.

In some VMs (RedHat 5.x), when starting from power-off, I get the kernel panic error. No changes were made to the VM configurations after the Proxmox upgrade. I have a two-node cluster and I haven't upgraded the other node; if I transfer the VMs to the old Proxmox node they boot without problems. I tried changing the vCPU from kvm64 to qemu64, but the problem remains.

My host is octa-core:
processor : 0
vendor_id : AuthenticAMD
cpu family : 16
model : 2
model name : Quad-Core AMD Opteron(tm) Processor 2356
stepping : 3
cpu MHz : 2299.942
cache size : 512 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs npt lbrv svm_lock
bogomips : 4599.88
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate
 
It is a CPU model problem. The physical CPU determines which CPU flags will be activated, and some of these CPU flags are not backwards compatible, resulting in kernel panics. Qemu/KVM, VMware, Xen and Hyper-V give no guarantees when migrating from a newer to an older CPU model either; migrating from older to newer is supported by all of them. Consider it this way: a newer CPU will, feature-wise, always be a superset of an older CPU, and you only have a migration guarantee if the source CPU is a proper subset of the destination CPU.
 
Thanks for your answer, but if the problem is related to the physical CPU, why did the behaviour change on the same hardware when I upgraded Proxmox from 3.1 to 3.2?
With Proxmox 3.1, same VMs and same host, I never had this problem, and no changes were made to the physical CPUs or any other hardware-related component.
 
