Grub2 recovery on ZFS Proxmox VE 3.4

onlime

Before we switch to ZFS in production, we would like to know exactly how to recover from a failed Grub2 boot.
Previously, with ext3/ext4, we could simply boot from a standard Debian netinst USB stick, select "Rescue mode" and boot into the rescue system. The Debian netinst even auto-scans the devices and finds all md devices and LVM volumes. We could then simply recover Grub2 like this:

Code:
$ mount -t ext4 /dev/pve/root /mnt
$ mount -t proc /proc /mnt/proc
$ mount --rbind /dev /mnt/dev
$ mount --rbind /sys /mnt/sys

$ chroot /mnt /bin/bash
(chroot)$ source /etc/profile
(chroot)$ grub-install /dev/sdb
(chroot)$ grub-install /dev/sdc
(chroot)$ grub-install /dev/sdd
(chroot)$ grub-install /dev/sde
(chroot)$ update-grub2
(chroot)$ update-initramfs -u
Ctrl-D

$ umount /mnt/sys
$ umount /mnt/dev
$ umount /mnt/proc
$ umount /mnt
$ reboot

What's the recommended procedure with Proxmox VE 3.4 for a ZFS RAID1 setup?
Running debug mode from the Proxmox VE installer, I am able to detect the drives /dev/sda and /dev/sdb. But I cannot get any further as the 'zfs' command is missing.
Thanks for your advice.

Best regards, Philip
 
as a debian netinstall does not support ZFS, you cannot use Debian ISOs.

instead, just use the Proxmox VE ISO (or USB) and start the installation in debug mode.

on the first stop, press CTRL-D to continue and the installer switches to graphic mode - not abort the installation, as you do not want to format your server!

the "abort" will give you a shell and access to the zfs root file system

> zpool import -a

and you can continue as described by you.
 
Great, Tom! Thank you very much for the fast response and working solution.
I would just like to correct a terribly important word in your sentence (don't want any other readers to format their disks by mistake...):

on the first stop, press CTRL-D to continue and the installer switches to graphic mode - NOW abort the installation, as you do not want to format your server!

Once I was at the shell after pressing the "Abort" button, I managed to complete the whole Grub2 recovery as follows:

Code:
$ zpool import -a
cannot mount '/': directory is not empty
 
$ zfs set mountpoint=/mnt rpool/ROOT/pve-1
$ zfs mount rpool/ROOT/pve-1
 
$ mount -t proc /proc /mnt/proc
$ mount --rbind /dev /mnt/dev
$ mount --rbind /sys /mnt/sys
 
$ chroot /mnt /bin/bash
(chroot)$ source /etc/profile
(chroot)$ grub-install /dev/sda
(chroot)$ grub-install /dev/sdb
(chroot)$ update-grub2
(chroot)$ update-initramfs -u
Ctrl-D
 
$ umount /mnt/sys
$ umount /mnt/dev
$ umount /mnt/proc
$ zfs unmount rpool/ROOT/pve-1
Ctrl-D
Ctrl-D

Best regards,
Philip
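
The steps in the post above can be collected into a small helper that only prints the commands for a given set of boot disks, so the plan can be reviewed before anything is run. This is a minimal sketch, not a tested rescue tool: the function name `print_grub_recovery_plan` is made up for illustration, it assumes the pool `rpool` and root dataset `rpool/ROOT/pve-1` seen in this thread, and it uses `zpool import -R /mnt` (the altroot variant quoted later in the thread) instead of changing the dataset's `mountpoint` property.

```shell
#!/bin/sh
# Hypothetical helper: prints (does NOT execute) the GRUB recovery commands.
# Assumes pool "rpool" and root dataset "rpool/ROOT/pve-1", as in this thread.
print_grub_recovery_plan() {
    if [ "$#" -eq 0 ]; then
        echo "usage: print_grub_recovery_plan DISK..." >&2
        return 1
    fi
    # Import with an alternate root so the root dataset is mounted on /mnt
    # without modifying its persistent mountpoint property.
    echo "zpool import -R /mnt rpool"
    echo "mount -t proc /proc /mnt/proc"
    echo "mount --rbind /dev /mnt/dev"
    echo "mount --rbind /sys /mnt/sys"
    # Reinstall GRUB on every disk passed as an argument.
    # "bash -l" reads /etc/profile, like "source /etc/profile" in the post.
    for disk in "$@"; do
        echo "chroot /mnt /bin/bash -lc 'grub-install $disk'"
    done
    echo "chroot /mnt /bin/bash -lc 'update-grub2'"
    echo "chroot /mnt /bin/bash -lc 'update-initramfs -u'"
    # Tear down in reverse order, as in the original procedure.
    echo "umount /mnt/sys"
    echo "umount /mnt/dev"
    echo "umount /mnt/proc"
    echo "zfs unmount rpool/ROOT/pve-1"
}

# Example: a two-disk ZFS RAID1 setup.
print_grub_recovery_plan /dev/sda /dev/sdb
```

Printing instead of executing is a deliberate choice here: in a rescue shell it is easy to target the wrong disk, and the printed plan can be checked line by line first.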
 
Can an admin delete these two posts (and maybe help with the formatting of the good post)? Thank you.
 
Hello :) I have exactly the problem this thread is about, and this is the second time it has happened to me. At first I was very happy with ZFS's features, but now I am beginning to feel insecure about using it in a production environment...

I have a basic two-node cluster with the same hardware on both hosts. I used the latest installer to install my system, chose ZFS for my disk during the install, and that's all!

I don't think my problem is related to this one: http://forum.proxmox.com/threads/21642-New-ZFS-3-4-install-booting-to-grub-rescue-prompt since I have a hardware RAID card and the system sees my disks as a single disk.

I already followed the instructions given by tom, onlime and Nemesiz:
Code:
$ zpool import -R /mnt rpool
 
$ mount -t proc /proc /mnt/proc
$ mount --rbind /dev /mnt/dev
$ mount --rbind /sys /mnt/sys
 
$ chroot /mnt /bin/bash
(chroot)$ source /etc/profile
(chroot)$ grub-install /dev/sda
(chroot)$ update-grub2
(chroot)$ update-initramfs -u
Ctrl-D
 
$ umount /mnt/sys
$ umount /mnt/dev
$ umount /mnt/proc
$ zfs unmount rpool/ROOT/pve-1
Ctrl-D
Everything went smoothly: no error messages, GRUB installed successfully, boot images found... but on reboot I always get the same damn error, "unknown filesystem". I tried changing the boot order in every way, even though I don't think it's relevant in my case (only one virtual disk behind the RAID card). Sadly, this didn't help either... I have spent hours trying to make this work and searching the web, but found nothing really helpful.

By chance I had created a script which keeps the hosts perfectly in sync, so I could restart all my important VMs. But now I am working without a safety net, and if my current host fails I could lose data (I have a backup somewhere else, but it's only taken once a day, not in real time).

I could reinstall one more time and ask Brigit one more time to reissue the key (lol, I will drive her crazy), but I really need to understand this and be able to repair it in order to feel calm and safe and go back into production with ZFS.

To be exhaustive and give you all the information: the server is in a datacenter, and when I did the installation I was using some sort of KVM (virtual keyboard and CD-ROM). As I have read, boot order can sometimes be important in this kind of situation. The system is set to boot from the virtual CD-ROM when I connect one, and I suppose the boot order changes when I disconnect the KVM. However, after reinstalling I rebooted many times to be sure it was stable... and it was. There was no important update that I remember since my last reboot (only 8 days ago).

If someone has an idea, please, please help.
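
One diagnostic that may help narrow down an "unknown filesystem" report is to ask GRUB's own tooling, from inside the chroot, which filesystem it detects on `/`. `grub-probe /` normally prints the name of the detected filesystem. The sketch below only interprets that output; the function name `interpret_grub_probe` is hypothetical, and a successful probe in the chroot is a hint rather than proof, because the boot-time GRUB image may still see the disks differently (for example behind a RAID controller).

```shell
#!/bin/sh
# Hypothetical helper: interpret the output of `grub-probe /`
# captured inside the chroot. It does not call grub-probe itself.
interpret_grub_probe() {
    case "$1" in
        zfs)
            echo "grub-probe detects ZFS on /" ;;
        "")
            echo "no output: grub-probe failed, read its error message" ;;
        *)
            echo "unexpected filesystem '$1': is the right dataset mounted on /?" ;;
    esac
}

# Example with the value one would hope to see on a ZFS root.
interpret_grub_probe zfs
```

Inside the chroot this could be called as `interpret_grub_probe "$(grub-probe / 2>/dev/null)"`.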
 
I am sorry for the spaces and line breaks; the forum seems to eat every line break I make...
 
Hi, this is probably not related to your problem, but as far as I know, if you use ZFS you absolutely should not use a hardware RAID controller: ZFS needs direct control over the disks! You need an HBA controller or a controller that does "true" JBOD well (they seem to be rare).
Areca has some HBAs (I've ordered one to test), or you have to reflash an LSI controller that supports it into "IT" mode (e.g. I think the SAS 9211-8i; have a look at http://www.avagotech.com/products/server-storage/host-bus-adapters/sas-9211-8i where you will find the "IT" firmware in the downloads section, but I have never tried it myself).
Best regards
 
Hi !
Thank you for your answer. Like you, I don't think it's related to my problem, but you never know...
Anyway, my server is a dedicated server from OVH and I am not allowed to make changes to the RAID controller. Maybe I could ask them, but I'm not sure it can be done, and changing the controller is certainly not possible...
I read that you don't need a RAID controller when you use ZFS, but since I have one and it has a battery-backed cache, I thought it might not be a bad idea to use it.

For information, I have another server with exactly the same configuration and it seems to work correctly, but I can't try a reboot to be sure, because my VMs are on it and if it doesn't come back up it would be catastrophic for me :(

Best regards,
 
Hello Tom,
I followed your advice, but when I click the "Abort" button in the Proxmox graphical installer, I don't get a shell. The system reboots automatically and no action is possible.
How do I get a shell to try the Grub2 recovery?
Best regards,
Bruno
 
I've solved this step: the installation should be aborted after accepting the license.

I've recovered grub, but after reboot I get an error like this:

filesystem 'rpool/ROOT/pve-1' cannot be mounted at '/root//mnt' due to canonicalization error 2.

Do you have some hints?

Bruno
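
One plausible explanation, judging only from the path in the message: the procedure earlier in this thread runs `zfs set mountpoint=/mnt rpool/ROOT/pve-1`, and that property is persistent, so at boot the root dataset may still be configured to mount at `/mnt` instead of `/`. The sketch below is an assumption-laden check, not a confirmed fix: the function name `suggest_mountpoint_fix` is hypothetical, and it only prints the command that would restore the property, given the dataset name and its current mountpoint value.

```shell
#!/bin/sh
# Hypothetical helper: given a dataset and its current mountpoint value,
# print the command that would reset the mountpoint to "/".
# It prints only; nothing is changed on the pool.
suggest_mountpoint_fix() {
    dataset=$1
    current=$2
    if [ "$current" = "/" ]; then
        echo "ok: $dataset already mounts at /"
    else
        echo "zfs set mountpoint=/ $dataset"
    fi
}

# Example: the state left behind by "zfs set mountpoint=/mnt ...".
suggest_mountpoint_fix rpool/ROOT/pve-1 /mnt
```

In the rescue shell the current value can be read with `zfs get -H -o value mountpoint rpool/ROOT/pve-1` and passed as the second argument. Setting the mountpoint back to `/` from the rescue environment may again report "cannot mount '/': directory is not empty", as seen earlier in the thread; the property itself should still be updated, but verify it with `zfs get` before rebooting.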
 
