pve-kernel-5.0.21-4-pve causes Debian guests to reboot loop on older Intel CPUs

TerminalAddict

Hi all,
I upgraded to the pve-kernel-5.0.21-4-pve kernel today, and now none of my Linux guests will boot.
The guests load grub, then immediately reboot after the grub timeout.

If I reboot the host into pve-kernel-5.0.21-3-pve, everything works well.

I can't seem to uninstall this kernel either (I thought maybe I would just stay on the pve-kernel-5.0.21-3-pve kernel):

aptitude -s purge pve-kernel-5.0.21-4-pve
The following packages will be REMOVED:
pve-kernel-5.0.21-4-pve{p}
0 packages upgraded, 0 newly installed, 1 to remove and 0 not upgraded.
Need to get 0 B of archives. After unpacking 269 MB will be freed.
The following packages have unmet dependencies:
pve-kernel-5.0 : Depends: pve-kernel-5.0.21-4-pve but it is not going to be installed
The following actions will resolve these dependencies:

Remove the following packages:
1) proxmox-ve [6.0-2 (now, stable)]
2) pve-kernel-5.0 [6.0-10 (now, stable)]
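
The aptitude output above shows that the pve-kernel-5.0 meta-package depends on the newest kernel, so purging it would also pull out proxmox-ve. An alternative that might work is to leave the package installed and make GRUB boot the older kernel by default. The helper below is only a sketch: the function name list_grub_entries is made up for illustration, and it assumes the usual Debian path /boot/grub/grub.cfg. It prints the menu titles that could then be used for GRUB_DEFAULT.

```shell
# List menu entry and submenu titles from a GRUB config file.
# Argument: path to grub.cfg (assumed default: /boot/grub/grub.cfg).
list_grub_entries() {
    cfg="${1:-/boot/grub/grub.cfg}"
    # The title is the first single-quoted string on a menuentry/submenu line.
    sed -nE "s/^[[:space:]]*(menuentry|submenu) '([^']*)'.*/\2/p" "$cfg"
}

# Usage (on the host):
#   list_grub_entries
```

With the titles known, setting GRUB_DEFAULT="<submenu title>><entry title>" in /etc/default/grub and running update-grub should select the 5.0.21-3-pve entry. The exact titles differ per system, so copy them from the output rather than guessing.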
 
The guests load grub, then immediately reboot after the grub timeout.

Seems like maybe the kernel wasn't correctly installed, I'd guess.

Can you do:
Code:
apt install --reinstall pve-kernel-5.0.21-4-pve

and post the output? Maybe your /boot is too full to store new kernel images? Although in that case the grub entry shouldn't have been created.
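
A quick way to check the "kernel wasn't correctly installed" theory might be to confirm that both the kernel image and its initrd exist and are non-empty. This is a minimal sketch; the function name kernel_files_present is hypothetical, and the file naming (vmlinuz-VERSION, initrd.img-VERSION under /boot) is taken from the usual Debian layout.

```shell
# Return success if both the kernel image and the initrd for a given
# version exist and are non-empty.
# Arguments: kernel version, optional boot directory (default /boot).
kernel_files_present() {
    ver="$1"
    bootdir="${2:-/boot}"
    [ -s "$bootdir/vmlinuz-$ver" ] && [ -s "$bootdir/initrd.img-$ver" ]
}

# Usage (on the host):
#   kernel_files_present 5.0.21-4-pve && echo "files look complete"
#   df -h /boot    # free space on the filesystem holding /boot
```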
 
Plenty of disk space on the host:
Code:
df -h
Filesystem                             Size  Used Avail Use% Mounted on
udev                                   4.9G     0  4.9G   0% /dev
tmpfs                                  999M   97M  903M  10% /run
/dev/sda1                              191G  2.5G  179G   2% /
tmpfs                                  4.9G   63M  4.9G   2% /dev/shm
tmpfs                                  5.0M     0  5.0M   0% /run/lock
tmpfs                                  4.9G     0  4.9G   0% /sys/fs/cgroup
/dev/fuse                               30M   56K   30M   1% /etc/pve
172.16.252.100:/raid0/data/containers  3.6T  319G  3.3T   9% /mnt/pve/thecus1U4500
172.16.252.100:/raid0/data/backup      3.6T  319G  3.3T   9% /mnt/pve/thecus1u4500-backups
tmpfs                                  999M     0  999M   0% /run/user/0

and the reinstall of the kernel:

Code:
apt install --reinstall  pve-kernel-5.0.21-4-pve
Reading package lists... Done
Building dependency tree
Reading state information... Done
0 upgraded, 0 newly installed, 1 reinstalled, 0 to remove and 0 not upgraded.
Need to get 54.6 MB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://download.proxmox.com/debian/pve buster/pvetest amd64 pve-kernel-5.0.21-4-pve amd64 5.0.21-8 [54.6 MB]
Fetched 54.6 MB in 19s (2,861 kB/s)
(Reading database ... 56713 files and directories currently installed.)
Preparing to unpack .../pve-kernel-5.0.21-4-pve_5.0.21-8_amd64.deb ...
Unpacking pve-kernel-5.0.21-4-pve (5.0.21-8) over (5.0.21-8) ...
Setting up pve-kernel-5.0.21-4-pve (5.0.21-8) ...
Examining /etc/kernel/postinst.d.
run-parts: executing /etc/kernel/postinst.d/apt-auto-removal 5.0.21-4-pve /boot/vmlinuz-5.0.21-4-pve
run-parts: executing /etc/kernel/postinst.d/initramfs-tools 5.0.21-4-pve /boot/vmlinuz-5.0.21-4-pve
update-initramfs: Generating /boot/initrd.img-5.0.21-4-pve
run-parts: executing /etc/kernel/postinst.d/pve-auto-removal 5.0.21-4-pve /boot/vmlinuz-5.0.21-4-pve
run-parts: executing /etc/kernel/postinst.d/zz-pve-efiboot 5.0.21-4-pve /boot/vmlinuz-5.0.21-4-pve
Re-executing '/etc/kernel/postinst.d/zz-pve-efiboot' in new private mount namespace..
No /etc/kernel/pve-efiboot-uuids found, skipping ESP sync.
run-parts: executing /etc/kernel/postinst.d/zz-update-grub 5.0.21-4-pve /boot/vmlinuz-5.0.21-4-pve
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.0.21-4-pve
Found initrd image: /boot/initrd.img-5.0.21-4-pve
Found linux image: /boot/vmlinuz-5.0.21-3-pve
Found initrd image: /boot/initrd.img-5.0.21-3-pve
done

I still have guests in a reboot loop :(
see here:
https://www.youtube.com/watch?v=1VhrWo5gtdg
 
Hi,

Same problem here since the kernel update to pve-kernel-5.0.21-4-pve.

Reinstalling the kernel doesn't help.

Chris
 
Hi,

Same problem here since the kernel update to pve-kernel-5.0.21-4-pve.

Reinstalling the kernel doesn't help.

More specific information about host HW and storage setup + guest config and OS would be really appreciated..
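
For anyone else hitting this, a small script along these lines could gather the requested details in one go. It is only a sketch: collect_report is a made-up name, and it assumes the standard Proxmox VE tools pveversion and qm are on the PATH (they are skipped if not found).

```shell
# Print host and guest details useful for a bug report.
# Argument: optional VM ID whose config should be included.
collect_report() {
    vmid="${1:-}"
    echo "== running kernel =="
    uname -r
    echo "== CPU model =="
    grep -m1 'model name' /proc/cpuinfo 2>/dev/null || echo "(not available)"
    # Package versions, only if this is a Proxmox VE host.
    if command -v pveversion >/dev/null 2>&1; then
        echo "== pveversion -v =="
        pveversion -v
    fi
    # Guest configuration, only if a VM ID was given and qm exists.
    if [ -n "$vmid" ] && command -v qm >/dev/null 2>&1; then
        echo "== qm config $vmid =="
        qm config "$vmid"
    fi
}

# Usage (on the host):
#   collect_report 1001 > report.txt
```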
 
With pleasure.

I have carried out several tests:
- Deleting all hardware (net, ide, disk)
- Backup and restore VM
- Migration to another node

Information about a newly created test VM:
Code:
bootdisk: scsi0
cores: 1
ide2: local:iso/gparted-live-1.0.0-5-amd64.iso,media=cdrom
memory: 512
name: test
net0: virtio=FA:25:F5:4A:04:77,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsi0: local-lvm:vm-1001-disk-0,size=32G
scsihw: virtio-scsi-pci
smbios1: uuid=d22080bf-b80b-4e23-8175-88beddb5d059
sockets: 1
vmgenid: b4971282-baa7-4b12-a731-066e76aae73e

PVE information:
Code:
proxmox-ve: 6.0-2 (running kernel: 5.0.21-4-pve)
pve-manager: 6.0-11 (running version: 6.0-11/2140ef37)
pve-kernel-helper: 6.0-11
pve-kernel-5.0: 6.0-10
pve-kernel-4.15: 5.4-6
pve-kernel-5.0.21-4-pve: 5.0.21-8
pve-kernel-5.0.21-3-pve: 5.0.21-7
pve-kernel-4.15.18-18-pve: 4.15.18-44
pve-kernel-4.15.17-1-pve: 4.15.17-9
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.2-pve4
criu: 3.11-3
glusterfs-client: 5.5-3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.13-pve1
libpve-access-control: 6.0-3
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-6
libpve-guest-common-perl: 3.0-2
libpve-http-server-perl: 3.0-3
libpve-storage-perl: 6.0-9
libqb0: 1.0.5-1
lvm2: 2.03.02-pve3
lxc-pve: 3.2.1-1
lxcfs: 3.0.3-pve60
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.0-8
pve-cluster: 6.0-7
pve-container: 3.0-10
pve-docs: 6.0-8
pve-edk2-firmware: 2.20190614-1
pve-firewall: 4.0-7
pve-firmware: 3.0-4
pve-ha-manager: 3.0-2
pve-i18n: 2.0-3
pve-qemu-kvm: 4.0.1-4
pve-xtermjs: 3.13.2-1
qemu-server: 6.0-13
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.2-pve2

Node information:
Code:
Test server: HP ProLiant DL380 G5, dual-processor, 8 GB RAM

For information, the VMs were working well before this morning's update.

Thanks
 
Getting the same issue here but only on my Intel host (Q6600). My AMD host (Phenom II X4 955) does not have the problem.

Both are running Debian Buster with Proxmox 6 using ZFS storage for the VMs.
 
Nothing..

After clearing dmesg, only this shows up during the startup and reboot of the VM:

Code:
[22838.095401] device tap1001i0 entered promiscuous mode
[22838.156978] fwbr1001i0: port 1(fwln1001i0) entered blocking state
[22838.156982] fwbr1001i0: port 1(fwln1001i0) entered disabled state
[22838.157103] device fwln1001i0 entered promiscuous mode
[22838.157166] fwbr1001i0: port 1(fwln1001i0) entered blocking state
[22838.157169] fwbr1001i0: port 1(fwln1001i0) entered forwarding state
[22838.165330] vmbr0: port 2(fwpr1001p0) entered blocking state
[22838.165335] vmbr0: port 2(fwpr1001p0) entered disabled state
[22838.166114] device fwpr1001p0 entered promiscuous mode
[22838.166175] vmbr0: port 2(fwpr1001p0) entered blocking state
[22838.166178] vmbr0: port 2(fwpr1001p0) entered forwarding state
[22838.177046] fwbr1001i0: port 2(tap1001i0) entered blocking state
[22838.177051] fwbr1001i0: port 2(tap1001i0) entered disabled state
[22838.177208] fwbr1001i0: port 2(tap1001i0) entered blocking state
[22838.177211] fwbr1001i0: port 2(tap1001i0) entered forwarding state
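
The lines above are all bridge and tap state changes from the VM's network setup. To make anything else stand out (KVM messages, for example), they could be filtered away. A minimal sketch, with filter_bridge_noise being a made-up helper name:

```shell
# Read kernel log lines on stdin and drop bridge/tap state-change noise,
# so that any remaining messages are easier to spot.
filter_bridge_noise() {
    grep -Ev 'entered (promiscuous|blocking|disabled|forwarding|learning) (mode|state)'
}

# Usage (on the host, as root):
#   dmesg -C                       # clear the kernel ring buffer
#   qm start 1001                  # start the failing VM
#   dmesg | filter_bridge_noise    # prints nothing if only bridge lines exist
```

Note that grep exits with status 1 when nothing is left after filtering, which here simply means the log contained nothing but bridge messages.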

Thanks
 
On the VM, the last message I managed to capture at the boot before the reboot:
Code:
Probing EDD (edd=off to disable)... ok
 

A test with an Ubuntu VM (kernel 4.15.0-66-generic) works normally on pve-kernel-5.0.21-4-pve.

VM Info:
Code:
bootdisk: scsi0
cores: 2
ide2: none,media=cdrom
memory: 2048
name: Lubuntu
net0: virtio=02:48:4C:D3:7A:1A,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
protection: 1
scsi0: nas1-lvm:vm-810-disk-0,size=8G
scsihw: virtio-scsi-pci
smbios1: uuid=7956b596-4f83-4b15-b3c0-f1c41a7e6faa
sockets: 2
vmgenid: 1dcf1d04-8aec-48d2-9beb-b3f2e3b1ac36
 
On the VM, the last message I managed to capture at the boot before the reboot:
Code:
Probing EDD (edd=off to disable)... ok

That's a red herring; it shouldn't be problematic.

A test with an Ubuntu VM (kernel 4.15.0-66-generic) works normally on pve-kernel-5.0.21-4-pve.

Ah great, so my Ubuntu Linux VM tests are not of much use for this...

Sorry if I just overlooked it: which distro and version is problematic for you?
(I only saw gparted mounted in the first config, and that probably isn't the one installed there :) )
 
Thank you very much for your support

Sorry if I just overlooked it: which distro and version is problematic for you?

The problem is on Debian 9 Stretch
 
I was also using a Debian 9 guest. amd64 specifically. I've not tested with Debian 10 guests yet, but I can try tonight.

Code:
agent: 1
boot: c
bootdisk: scsi0
ciuser: debian
cores: 2
ipconfig0: ip=192.168.0.10/24,gw=192.168.0.1
memory: 2048
name: debian
nameserver: 192.168.0.1
net0: virtio=36:06:0E:06:8A:2B,bridge=vmbr0,tag=192
numa: 0
onboot: 1
ostype: l26
scsi0: local-zfs:vm-200-disk-0,discard=on,size=4G
scsi1: local-zfs:vm-200-disk-1,discard=on,size=10G
scsihw: virtio-scsi-pci
serial0: socket
smbios1: uuid=b408265e-1b83-401e-ba8e-1bc96da8f865
sockets: 1
tablet: 0
vmgenid: 5f3b0ef7-bb1c-4b29-9e07-d199602c51e8

cpuinfo (microcode update being applied through intel-microcode deb package version 3.20190618.1):

Code:
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 Quad CPU           @ 2.40GHz
stepping        : 7
microcode       : 0x6a
cpu MHz         : 2083.877
cache size      : 4096 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm pti tpr_shadow dtherm
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs
bogomips        : 4799.62
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 Quad CPU           @ 2.40GHz
stepping        : 7
microcode       : 0x6a
cpu MHz         : 1913.303
cache size      : 4096 KB
physical id     : 0
siblings        : 4
core id         : 3
cpu cores       : 4
apicid          : 3
initial apicid  : 3
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm pti tpr_shadow dtherm
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs
bogomips        : 4799.62
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 2
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 Quad CPU           @ 2.40GHz
stepping        : 7
microcode       : 0x6a
cpu MHz         : 2056.412
cache size      : 4096 KB
physical id     : 0
siblings        : 4
core id         : 1
cpu cores       : 4
apicid          : 1
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm pti tpr_shadow dtherm
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs
bogomips        : 4799.62
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 Quad CPU           @ 2.40GHz
stepping        : 7
microcode       : 0x6a
cpu MHz         : 2013.662
cache size      : 4096 KB
physical id     : 0
siblings        : 4
core id         : 2
cpu cores       : 4
apicid          : 2
initial apicid  : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm pti tpr_shadow dtherm
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs
bogomips        : 4799.62
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

Motherboard is a Gigabyte G41MT-ES2L (rev 1.0), BIOS version F5 (2010/06/02).
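
To compare affected and unaffected hosts, it may help to summarise the virtualization-related CPU flags rather than paste the whole cpuinfo. The sketch below uses a made-up function name, vmx_features, and checks a handful of Intel VT-x flag names as they appear in /proc/cpuinfo. Which of them (if any) matters for this bug is not established in this thread, so the list is an assumption.

```shell
# Report which Intel virtualization feature flags appear in a cpuinfo file.
# Argument: path to a cpuinfo dump (default /proc/cpuinfo).
vmx_features() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # Take the flags of the first logical CPU only.
    flags="$(grep -m1 '^flags' "$cpuinfo" | cut -d: -f2)"
    for f in vmx tpr_shadow vnmi flexpriority ept vpid; do
        case " $flags " in
            *" $f "*) echo "$f: yes" ;;
            *)        echo "$f: no" ;;
        esac
    done
}

# Usage (on the host):
#   vmx_features
```

Run against the Q6600 flags posted above, this would report vmx and tpr_shadow as present and vnmi, flexpriority, ept and vpid as absent.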
 
