Trying to recover from grub failure but /dev/pve/root not found

After a power outage yesterday, my system was stuck at the grub rescue> prompt.

Following the steps here https://pve.proxmox.com/wiki/Recover_From_Grub_Failure
I created a flash drive with the Proxmox Installer ISO and changed my BIOS to boot from the USB.

When I run `vgscan` there is no output.
When I run `vgchange -ay`, there is also no output.

When I try to `mount /dev/pve/root /media/RESCUE/` I get "special device /dev/pve/root does not exist".
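
For reference, this is roughly the sequence from that wiki page that I was following (as I understand it, it assumes an LVM root; I get stuck at the mount step):

```
vgscan
vgchange -ay
mkdir /media/RESCUE
mount /dev/pve/root /media/RESCUE/
mount -t proc proc /media/RESCUE/proc
mount -t sysfs sys /media/RESCUE/sys
mount -o bind /dev /media/RESCUE/dev
chroot /media/RESCUE
grub-install /dev/sda   # then update-grub and reboot
```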

Thanks, in advance, for your advice.
(Attachment: IMG_20220805_113221964_resized.jpeg, fdisk output of the drives)
 
I am hesitant to try anything that could cause damage or complicate the problem. But of course I am fixated on watching and hoping for someone to offer a course of action I can take to get back to a working system. Please.
 
You might consider getting a UPS. A reasonable one starts at around 50€, and that way your server will do a graceful shutdown on a power outage instead of corrupting your disks in the first place...

The first step when rescuing data from a corrupted disk should always be creating a backup of that disk, so you can always restore it in case something goes wrong. You could, for example, boot a Clonezilla stick and save an image of that disk to an external disk or a NAS.
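
If you prefer to stay on the command line rather than use Clonezilla, a raw dd image works too. A rough sketch, where /dev/sdX and the mount point are placeholders for your actual source disk and backup target:

```
# image the whole disk to a file on an already-mounted external disk or NAS share
dd if=/dev/sdX of=/mnt/backup/sdX.img bs=4M status=progress conv=noerror,sync
```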

After that you can boot the PVE ISO in rescue mode and try to write a new grub bootloader.

`vgscan` and so on won't show anything because you are using ZFS (probably a raidz1) and not LVM.
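
You can verify that yourself from the installer's debug/rescue shell with a few read-only checks, for example:

```
lsblk -o NAME,SIZE,FSTYPE     # ZFS partitions show up as zfs_member, not LVM2_member
pvs; vgs                      # empty output means there is no LVM at all
zpool import                  # lists importable pools (e.g. rpool) without actually importing them
```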
 
Following the steps here https://pve.proxmox.com/wiki/Recover_From_Grub_Failure
I created a flash drive with the Proxmox Installer ISO and changed my BIOS to boot from the USB.

When I run `vgscan` there is no output.
When I run `vgchange -ay`, there is also no output.
According to the fdisk output it does not look like you are using LVM, so that old wiki page won't fix this for you. (Please use gdisk instead of fdisk for GPT drives next time.)

It appears that you are using ZFS (a 3-way mirror or raidz1) and have 3 ESPs (EFI System Partitions), which you might be able to boot from. Can you try booting from each of those 3 drives to see if one of them boots normally? Your BIOS boot drive selector might show "Linux boot manager" and/or "UEFI boot"; try both for each drive.

If this does not work, try to find a guide (or a forum post) on recovering from a boot failure with root on ZFS.
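
If one of them does boot normally, it is probably also worth checking the bootloader copies on all three ESPs from the running system afterwards. On current PVE versions something like this should work (older versions used pve-efiboot-tool instead):

```
proxmox-boot-tool status    # shows which ESPs are configured and whether they boot via GRUB or systemd-boot
proxmox-boot-tool refresh   # re-copies kernels and bootloader files to all configured ESPs
```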
 
You might consider getting a UPS. A reasonable one starts at around 50€, and that way your server will do a graceful shutdown on a power outage instead of corrupting your disks in the first place...

The first step when rescuing data from a corrupted disk should always be creating a backup of that disk, so you can always restore it in case something goes wrong. You could, for example, boot a Clonezilla stick and save an image of that disk to an external disk or a NAS.

After that you can boot the PVE ISO in rescue mode and try to write a new grub bootloader.

`vgscan` and so on won't show anything because you are using ZFS (probably a raidz1) and not LVM.
The failure of the UPS is what started the adventure. I was able to use the battery time to properly shut down my Fedora VM, and then stupidly forgot that there was a Proxmox host node behind it that would probably also like to be shut down nicely.

Thank you for the info about vgscan and explaining the distinction between ZFS and LVM.
 
The failure of the UPS is what started the adventure. I was able to use the battery time to properly shut down my Fedora VM, and then stupidly forgot that there was a Proxmox host node behind it that would probably also like to be shut down nicely.

Thank you for the info about vgscan and explaining the distinction between ZFS and LVM.
You should monitor your UPS with something like NUT. I don't need to shut down anything manually: the nut-server installed on the PVE host monitors the UPS over USB and automatically triggers a shutdown of the PVE host. If you have several hosts, you can also use nut-client to shut down multiple hosts. And you don't need to care about shutting down VMs; as soon as you shut down your PVE host, it will automatically shut down all guests first before shutting down itself.
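
A minimal standalone setup on the PVE host looks roughly like this (a sketch; the UPS name, user and password are placeholders you should change):

```
apt install nut

# /etc/nut/nut.conf
MODE=standalone

# /etc/nut/ups.conf
[myups]
    driver = usbhid-ups
    port = auto

# /etc/nut/upsd.users
[upsmon]
    password = secret
    upsmon master

# /etc/nut/upsmon.conf
MONITOR myups@localhost 1 upsmon secret master
```

After restarting the NUT services, `upsc myups` should show the UPS status, and upsmon will shut the host down once the UPS reports it is on battery and low.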

And as leesteken already said, you have GRUB on sda1/sdb1/sdc1 or an ESP on sda2/sdb2/sdc2 to boot from, so one of the disks probably still contains a healthy bootloader.
 
I tried leesteken's suggestion about booting from the EFI partitions, and things are so much better already: I can get to the Proxmox web UI now (and can ssh to the host node).

In the BIOS, I set the boot priority to `UEFI OS (P3: SanDisk)`; previously I had only tried the non-UEFI entries.

I immediately began a backup, since I would be sad to have only my most recent one (...from June... :eek:), so I haven't gone far enough to know if I am out of the woods yet. But I am very relieved. Thank you both.

Update: all's well. Reunited with my Fedora work VM, same as it ever was.

Thank you for the time you took to steer me in a successful direction.
 
