Hi, small fun question I wonder if anyone has any thoughts on.
I've got a reasonably decent box at a client office. It is not the latest hardware; it was deployed as a 'refurb' Lenovo tower server: dual-socket Xeon, 96 GB RAM, MegaRAID hardware RAID.
It was originally deployed on Proxmox VE 6 when that was current, then updated to v7, and a few days ago I finally moved it up to the latest v8 during some slightly overdue scheduled maintenance. I am running the free/no-subscription community repository of PVE here, no paid subscription. (Yes, I know the usual disclaimer about running that in prod, etc.)
So, the weird drama. Things seemed fine after the upgrade: rebooted into the new kernel, everything came up happy, and it looked like a clean upgrade.
Today (over the weekend) a user asked 'hey, is the server down?' and I checked; indeed it was. Over the remote IPMI console I could see Proxmox stuck in a failed early boot with sad messages, among them "ALERT! /dev/mapper/pve-root does not exist", dropped to the (initramfs) prompt.
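For anyone who hits the same thing, the sort of checks that are relevant at that (initramfs) prompt would be roughly these (a sketch, not a transcript of what I actually typed):
Code:
# does the kernel see the RAID controller / disks at all?
cat /proc/modules | grep -i megaraid
ls /dev/sd* /dev/mapper
# try to activate the LVM volume group by hand
lvm vgscan
lvm vgchange -ay
# if /dev/mapper/pve-root shows up after that, 'exit' continues the boot
exit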
After chasing a few red herrings, I found that if I reboot "warm" Ctrl-Alt-Del style, go into the GRUB boot menu, and choose the last good kernel before the latest one, then things boot just fine and the server is up and running in no time.
The newer kernel that makes me sad is this one:
/boot/vmlinuz-6.8.12-8-pve
and the older one that makes me happy is this one:
/boot/vmlinuz-5.15.158-2-pve
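For reference, this is how I check which kernels are installed on the box (and, I believe, any pin):
Code:
uname -r
proxmox-boot-tool kernel list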
While digging around, I came across another thread that felt kind of similar:
Hello, please help, because I don't know what to do anymore; I'm very frustrated and have tried various things without success.
I have 3 Proxmox servers with HA and Ceph; 2 of the servers are dead and can't boot (as in the picture in that thread).
Things I tried from an Ubuntu live CD, to no avail:
- Set rootdelay=15
- Reinstall GRUB (grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=grub) and update-grub
- update-initramfs -u
I would also appreciate advice on how to back up my VMs in this situation so I can restore them to a new Proxmox install...
(thread by kamrang, 13 replies, in Proxmox VE: Installation and configuration)
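If this box ever needs the same live-CD rescue treatment, my reading of those steps, condensed, would be roughly the following (untested on this particular machine, and the device names are placeholders only):
Code:
# from an Ubuntu live CD / rescue shell - adjust device names to the real ones
vgchange -ay
mount /dev/mapper/pve-root /mnt
mount /dev/sda2 /mnt/boot/efi        # placeholder: use the actual EFI system partition
mount --bind /dev  /mnt/dev
mount --bind /proc /mnt/proc
mount --bind /sys  /mnt/sys
chroot /mnt
grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=grub
update-grub
update-initramfs -u
exit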
For the moment I have given up on debugging so I can go enjoy a bit of the weekend, and the Proxmox host is up and operational on the older 5.15 kernel.
The workaround for now was to pin the kernel:
Code:
proxmox-boot-tool kernel pin 5.15.158-2-pve
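Mostly as a note to self: once there is a kernel I trust again, undoing that should (as I understand the tool) just be:
Code:
proxmox-boot-tool kernel unpin
proxmox-boot-tool refresh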
I am wondering if this sounds familiar to anyone, or if there is an easy explanation for why something like this breaks. Is the hardware RAID driver/module borked on this new kernel, for example?
I am using a pretty standard hardware RAID card here, as per:
Code:
root@pve:/opt/bin# megaclisas-status
-- Controller information --
-- ID | H/W Model | RAM | Temp | BBU | Firmware
c0 | LSI MegaRAID SAS 9240-8i | 0MB | N/A | Absent | FW: 20.13.1-0203
...truncated....
and
Code:
root@pve:/opt/bin# lsmod | grep -i mega
megaraid_sas 184320 3
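When I next get a maintenance window I want to verify that megaraid_sas actually exists for the 6.8 kernel and got packed into its initramfs; I believe the checks are along these lines:
Code:
# is the driver built for the new kernel at all?
modinfo -k 6.8.12-8-pve megaraid_sas | head -n 3
# and did it get included in that kernel's initramfs?
lsinitramfs /boot/initrd.img-6.8.12-8-pve | grep -i megaraid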
I believe I had some drama a couple of years ago with a (different) similar-vintage Dell with a PERC controller (also an LSI MegaRAID in disguise) that gave me pain with a newer Proxmox, and I am kind of wondering if this might be a vaguely related, new-and-improved version of that old drama. Maybe.
Or possibly something else is going on.
If anyone has any ideas or suggestions, or has seen this before and can say "yes, there is an easy fix", that would be lovely.
Clearly it is "OK" for now with a working kernel pinned but I am not sure I am entirely thrilled to leave this in place as my long term 'forever' fix.
Any comments, suggestions, etc. are greatly appreciated.
Thank you,
Tim