pve-kernel-5.0.21-4-pve causes Debian guests to reboot loop on older Intel CPUs

model : 15
model name : Intel(R) Core(TM)2 Quad CPU @ 2.40GHz

That one is indeed a bit over 10 years old... Have to see if I can get my hands on something a bit older here..

@Chriswiss what's your lscpu output?
 
cpuinfo (the microcode update is applied through the intel-microcode deb package, version 3.20190618.1):
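For reference, the microcode revision actually in use can be checked on the host; a minimal sketch, assuming a standard Proxmox/Debian setup:

Code:
# microcode revision the kernel reports per CPU
grep -m1 microcode /proc/cpuinfo
# whether/when the update was applied during boot
dmesg | grep -i microcode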

There's no way Intel has included microcode updates for your specific model in their releases of the last 5+ years, IMO.

But there are two things you could try temporarily:
  1. Install the 5.3-based kernel we're currently evaluating for the next Proxmox VE release, see here (that kernel is now also available on pve-no-subscription).
  2. Add the following to your kernel boot command line: noibrs noibpb nopti nospectre_v2 nospectre_v1 l1tf=off nospec_store_bypass_disable no_stf_barrier mds=off mitigations=off, e.g., in the GRUB_CMDLINE_LINUX="<flags here>" variable in /etc/default/grub, then run update-grub and reboot; a sketch follows below.
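A minimal sketch of option 2 (note: these flags disable the CPU vulnerability mitigations, so treat this as a temporary test only):

Code:
# /etc/default/grub (only the relevant line shown; back up the file first)
GRUB_CMDLINE_LINUX="noibrs noibpb nopti nospectre_v2 nospectre_v1 l1tf=off nospec_store_bypass_disable no_stf_barrier mds=off mitigations=off"

# afterwards, regenerate the GRUB config and reboot
update-grub
reboot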
 
The problem is not limited to Linux guests. After updating the kernel to 5.0.21-4-pve, all my VMs have boot problems: several Linux distros, FreeBSD, Windows 10.
The boot process starts and then crashes very soon, I think when the VM tries to switch to protected mode.
With 5.0.21-3-pve everything works fine.
I found that with kernel 5.0.21-4-pve, when I set KVM hardware virtualization to No, everything works, but, as expected, very slowly.
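For completeness, a sketch of how that same toggle can be done on the CLI; the VMID 100 below is just a placeholder:

Code:
# disable KVM hardware virtualization for VM 100 (pure emulation, slow)
qm set 100 --kvm 0
# re-enable it again later
qm set 100 --kvm 1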
Hope this will help to find the problem.
Best regards, Igor.

My machine is:
Code:
HP ProLiant ML150 G5, 2 x Intel Xeon E5420@2.50GHz, 16GB RAM

proxmox-ve: 6.0-2 (running kernel: 5.0.21-3-pve)
pve-manager: 6.0-11 (running version: 6.0-11/2140ef37)
pve-kernel-helper: 6.0-11
pve-kernel-5.0: 6.0-10
pve-kernel-5.0.21-4-pve: 5.0.21-8
pve-kernel-5.0.21-3-pve: 5.0.21-7
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.2-pve4
criu: 3.11-3
glusterfs-client: 5.5-3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.13-pve1
libpve-access-control: 6.0-3
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-6
libpve-guest-common-perl: 3.0-2
libpve-http-server-perl: 3.0-3
libpve-storage-perl: 6.0-9
libqb0: 1.0.5-1
lvm2: 2.03.02-pve3
lxc-pve: 3.2.1-1
lxcfs: 3.0.3-pve60
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.0-8
pve-cluster: 6.0-7
pve-container: 3.0-10
pve-docs: 6.0-8
pve-edk2-firmware: 2.20190614-1
pve-firewall: 4.0-7
pve-firmware: 3.0-4
pve-ha-manager: 3.0-2
pve-i18n: 2.0-3
pve-qemu-kvm: 4.0.1-4
pve-xtermjs: 3.13.2-1
qemu-server: 6.0-13
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.2-pve2
 
HP ProLiant ML150 G5, 2 x Intel Xeon E5420@2.50GHz, 16GB RAM

Also over 10 years old (both in terms of CPU release date and its support EOL)...

Can one of you please try the pve-kernel-5.3 package? It'd be really interesting to see whether things get worse with it or not.
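For anyone willing to test: a minimal sketch of pulling in that kernel, assuming the pve-kernel-5.3 meta-package from pve-no-subscription:

Code:
apt update
apt install pve-kernel-5.3
reboot
# after the reboot, confirm the running kernel
uname -r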
 
That one is indeed a bit over 10 years old... Have to see if I can get my hands on something a bit older here..

@Chriswiss what's your lscpu output?
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 38 bits physical, 48 bits virtual
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 2
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 23
Model name: Intel(R) Xeon(R) CPU E5420 @ 2.50GHz
Stepping: 10
CPU MHz: 2499.991
BogoMIPS: 4999.98
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 6144K
NUMA node0 CPU(s): 0-7
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 xsave lahf_lm pti tpr_shadow vnmi flexpriority dtherm
 
OK, fitting the pattern: decade-old HW :)

So, one thing that would also be nice to try (until I get the 10-year-old workstation we dusted off installed) would be to set a VM CPU model other than kvm64, e.g., core2duo or Conroe.
The guess here is that a change in the KVM kernel module now exposes something to the guest OS that previously wasn't exposed (e.g., a CPU instruction); once triggered, it crashes the guest kernel because it does not work on very old CPUs (as said, just a guess).
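A minimal sketch of switching the CPU model on the CLI (the VMID 100 is just a placeholder; stop/start the VM afterwards so the new model takes effect):

Code:
qm set 100 --cpu core2duo
# or
qm set 100 --cpu Conroe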
 
OK, fitting the pattern: decade-old HW :)

So, one thing that would also be nice to try (until I get the 10-year-old workstation we dusted off installed) would be to set a VM CPU model other than kvm64, e.g., core2duo or Conroe.
The guess here is that a change in the KVM kernel module now exposes something to the guest OS that previously wasn't exposed (e.g., a CPU instruction); once triggered, it crashes the guest kernel because it does not work on very old CPUs (as said, just a guess).

Same result with core2duo or Conroe.

I will try with a 32-bit setup.
 
OK, I can reproduce it now; it just required actually burning a DVD with the ISO to get it installed on an Intel Core2Duo CPU E8500 host..

Now, I'm going to do some bisecting, let's see what the culprit is :)

btw.: Could also be related to this kernel.org bug report: https://bugzilla.kernel.org/show_bug.cgi?id=205441
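For context, a kernel bisection roughly follows this pattern (the commit placeholders below are hypothetical, not the actual ones used here):

Code:
# in a checkout of the kernel sources used for the pve kernel builds
git bisect start
git bisect bad <commit-that-misbehaves>      # hypothetical placeholder
git bisect good <commit-that-still-works>    # hypothetical placeholder
# build, boot and test the commit git checks out, then mark it:
git bisect good    # or: git bisect bad
# repeat until git reports the first bad commit, then clean up:
git bisect reset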
 
OK, I can reproduce it now; it just required actually burning a DVD with the ISO to get it installed on an Intel Core2Duo CPU E8500 host..

Now, I'm going to do some bisecting, let's see what the culprit is :)

btw.: Could also be related to this kernel.org bug report: https://bugzilla.kernel.org/show_bug.cgi?id=205441

Perfect!

Thank you. Just in case you haven't seen it: according to my tests, the problem only occurs on 64-bit VMs.
 
I would like to report a similar problem with a pfSense VM, showing the same behavior when using pve-kernel-5.0.21-4-pve.
The VM stalls at a kernel panic.
I reverted to the previous kernel and the VM worked fine.
The VM is using the "host" CPU type.
The Proxmox host is a Dell OptiPlex 780 with a Core2Duo E8500.
 
Core2Duo E8500

The same model I'm currently using to reproduce and track down this issue :) My bisecting is advancing nicely, so maybe I'll know what the underlying issue is soon.
 
There's no way Intel has included microcode updates for your specific model in their releases of the last 5+ years, IMO.

But there are two things you could try temporarily:
  1. Install the 5.3-based kernel we're currently evaluating for the next Proxmox VE release, see here (that kernel is now also available on pve-no-subscription).
  2. Add the following to your kernel boot command line: noibrs noibpb nopti nospectre_v2 nospectre_v1 l1tf=off nospec_store_bypass_disable no_stf_barrier mds=off mitigations=off, e.g., in the GRUB_CMDLINE_LINUX="<flags here>" variable in /etc/default/grub, then run update-grub and reboot.

I didn't say the microcode update was recently added to that package; I'm just noting the version for completeness. Because yes, it does get updated at boot (to something most likely more than 5 years old, as you said).

Thanks for the workarounds. Regardless of it being a 10-year-old CPU, like most people my bottleneck is storage. It runs everything just as fast as my more recent processors as long as AES isn't involved, so I'd like to get an extra decade out of it if possible.
 
Thanks for the workarounds. Regardless of it being a 10-year-old CPU, like most people my bottleneck is storage. It runs everything just as fast as my more recent processors as long as AES isn't involved, so I'd like to get an extra decade out of it if possible.

A CPU from back then with a 90 W TDP can be replaced with one with ~6 W TDP nowadays, just saying..
Also, newer features like VT-d and AMD-Vi can speed up VM IO, as can other recent processor features, indirectly.
Further, my test host showing these issues barely supports AHCI, SSDs work so-so, and PCIe 3? Forget it, so how do you even put fast storage in there? My newer workstation's Optane and NVMe devices are pretty damn fast; the CPU needs to hurry to move data down the line AND do some other useful (VM CPU) work at the same time. So there may be some workloads where it doesn't matter much, but I'm not really sure that holds for the majority.

My workloads, mainly compiling and running lots of nested VMs, profit from core count, core speed, and all of the newer CPU features (SIMD, instructions to mitigate the performance impact of Meltdown, Spectre, ...).

But hey, if you're happy, the electric bill isn't too big, and it is hopefully at least powered by renewable energy: good for you.

But as always, once things fade out of popular use, prepare to run into more issues using them in modern environments and to get your hands dirty more often yourself; nobody wants to break old things (we do not profit from planned obsolescence ;) ), but if nobody tests them anymore, bitrot is unavoidable sooner or later - just my 2 cents.
 
I do live in a place that has one of the highest percentages of renewable energy on the planet. My carbon footprint is likely lower than yours if your energy relies on fossil fuels. I would suggest not running your workloads on my hardware if that's your concern, as I have none of your needs. I have VT-d available when I need it, and I know what these features do...

Regardless, the only issue I've run into during the whole ten years is my Debian 9 guests failing to boot after a kernel upgrade, but I appreciate the support. I don't have NVMe storage, and it would cost hundreds of dollars to upgrade just so I could brag on forums to no purpose. I don't really need it.

Here are the tests I was planning to do, unless they are no longer necessary:

  • Debian 10 guests
  • GRUB argument list you provided
  • Setting the processor type to particular values (I had already tried kvm64 and host, IIRC)
 