[SOLVED] Kernel 5.4.98-1 & Kernel 5.4.124-1 fail to boot

Desmodue

Jul 9, 2021
Hi

Having updated through the GUI, one of my cluster servers now fails to boot and is stuck at this prompt:

"Found vol group 'pve' using meta data lvm2".

It does not progress from this prompt; I have left it for over an hour.

I haven't tried the other server.

Reverting to 5.4.78-2 works.

I have looked at many similar issues but none quite fit.

I checked lvm.conf against a fresh install that works, and there is no difference.

I am booting non-UEFI, as that was how it was originally installed (Proxmox v5).

I have tried changing the boot mode to UEFI but get to the same prompt.

I am now a bit stuck as to what to do next. This node is part of a cluster; do I remove it from the cluster, rebuild from scratch, and then re-add it?
 
Try removing 'quiet' from the kernel command line (in the boot loader, hit 'e' to edit the command line and remove 'quiet'). That should show more output, which might give you a hint where the problem originates.
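
As a small illustration of that edit, here is a sketch in Python of what removing 'quiet' does to a command line string. The example command line is hypothetical (the root device is an assumption, not taken from this thread), and the helper name drop_quiet is made up for this sketch; at the boot loader you make the same change by hand.

```python
def drop_quiet(cmdline: str) -> str:
    """Return the kernel command line with every standalone 'quiet' token removed."""
    return " ".join(tok for tok in cmdline.split() if tok != "quiet")


# Hypothetical command line for illustration only; yours will differ.
example = "root=/dev/mapper/pve-root ro quiet"
print(drop_quiet(example))  # root=/dev/mapper/pve-root ro
```

Only the exact token 'quiet' is dropped, so other options that merely contain that word are left alone.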

I hope this helps!
 
OK, I removed quiet from the kernel command line and left it for 30 minutes; it stays stuck at mounting the root filesystem. This is what I have:

[ 5.322722] <mlx4_ib> mlx4_ib_add: counter index 1 for port 1 allocated 1
Begin: Loading essential drivers ... done.
Begin: Running /scripts/init-premount ... done
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... Reading all physical volumes. This may take a while...
Found volume group "pve" using metadata type lvm2
[ 7.9753221] mlx4_en: ens3: Link Up

Nothing really jumps out as an issue.

One thing to note: when booting with 5.4.78-2 there are some errors:

blk_update_request: critical medium error, dev sda, sector 130932256 op 0x0: (READ) flags 0x80700 phys_seg 20 prio class 0
blk_update_request: critical medium error, dev sda, sector 130932400 op 0x0: (READ) flags 0x0 phys_seg 1 prio class 0

It looks like I have a bad boot SSD and the later kernels are not tolerant of this error?
Is there a best practice for replacing the boot drive? I am running a converged cluster with Ceph.
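
For anyone hitting the same symptom, here is a minimal Python sketch for pulling the affected device and sectors out of medium-error lines like the ones quoted above. The regular expression assumes the line shape seen in this thread (other kernel versions may format the message differently), and the function name bad_sectors is made up for this example.

```python
import re
from collections import defaultdict

# Assumed line shape, based on the messages quoted in this thread.
MEDIUM_ERROR = re.compile(
    r"blk_update_request: critical medium error, "
    r"dev (?P<dev>[^,\s]+), sector (?P<sector>\d+)"
)


def bad_sectors(log_text: str) -> dict:
    """Return {device: sorted list of sectors} for every medium-error line found."""
    found = defaultdict(set)
    for line in log_text.splitlines():
        match = MEDIUM_ERROR.search(line)
        if match:
            found[match.group("dev")].add(int(match.group("sector")))
    return {dev: sorted(sectors) for dev, sectors in found.items()}


sample = """\
blk_update_request: critical medium error, dev sda, sector 130932256 op 0x0: (READ) flags 0x80700 phys_seg 20 prio class 0
blk_update_request: critical medium error, dev sda, sector 130932400 op 0x0: (READ) flags 0x0 phys_seg 1 prio class 0
"""
print(bad_sectors(sample))  # {'sda': [130932256, 130932400]}
```

Feeding it saved kernel log text gives a quick view of whether the errors cluster on one device, which is what points at a failing drive here.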
 

Looks like I have a bad boot SSD and the later kernels are not tolerant of this error?
You're right, that sounds like a bad drive.

Is there a best practice for replacing the boot drive? I am running a converged cluster with Ceph.
This depends on how your boot drive was set up. If it's in a RAID (ZFS), follow the reference documentation:
https://pve.proxmox.com/pve-docs/chapter-sysadmin.html#_zfs_administration (there's a section on that)

If it's installed without any RAID, I'd suggest creating backups of everything (always), freshly setting up the node on a new disk,
and replacing it in the cluster (also covered in the reference documentation: https://pve.proxmox.com/pve-docs/).

I hope this helps!
 
Hi
It was installed without RAID; I had found a number of "guides" in the forum.
I have managed to remove and rebuild the node, so everything now functions correctly.

It wasn't too painful. I had a ghost OSD that took some removing, but other than that it went well.
 
Glad that worked out well :)
Please mark the thread as 'SOLVED' so that other users know what to expect. Thanks!
 
Just to add: my ghost OSD in Ceph was due to me installing Ceph and starting to configure it before joining the cluster... patience, grasshopper :)

How do I mark this as 'SOLVED'?
 
How do I mark this as 'SOLVED' ?
Click 'Edit Thread' above your first post and select the 'SOLVED' prefix for the thread. That's for next time; I went ahead and marked this one as solved for you :)
 
