ZFS grub rescue after reboot

dendi

Hello,
I'm in the same situation as described in this thread:
https://forum.proxmox.com/threads/crashes-with-zfs-root-and-stuck-on-grub-rescue-prompt.34172

There was no kernel upgrade, only a reboot after two weeks of uptime.
The server rebooted fine several times during setup, for tests and for updates to the latest pve-no-subscription packages, about two weeks ago.

But the grub output is different:
grub> insmod zfs
grub> ls (hd0)
Device hd0: No known filesystem detected
grub> ls (hd0,gpt1)
Partition hd0,gpt1: No known filesystem detected
grub> ls (hd0,gpt2)
Partition hd0,gpt2: No known filesystem detected
grub> ls (hd0,gpt9)
Partition hd0,gpt9: No known filesystem detected

The hardware is a Dell with a PERC H700 hardware RAID controller (I know...) and a 12 TB virtual volume (RAID 10 across 6 disks).
ZFS is configured on that virtual volume as RAID 0 with a single disk.

I can import and mount rpool when booting from the Proxmox CD and see all the data; the pool status is OK.
I tried: import, chroot, update-initramfs -u, update-grub and grub-install /dev/sda. All ran without errors, but nothing changed.
grub-probe -v (in the chroot) shows the correct size: 23xxxxxx sectors
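
For anyone following along, the sequence above can be sketched as a small script. This is only an illustration under assumptions: the alternate root /mnt and the helper names `run` and `rescue_reinstall_grub` are made up for the example, and the pool name and disk default to the rpool and /dev/sda mentioned in the post. It prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of the rescue sequence: import the pool, chroot, rebuild boot files.
# Assumed values: altroot /mnt; pool and disk default to rpool and /dev/sda.

# With DRY_RUN=1 (the default) commands are only printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rescue_reinstall_grub() {
    pool="${1:-rpool}"
    disk="${2:-/dev/sda}"
    target="${3:-/mnt}"

    # Import under an alternate root so the datasets mount below $target.
    run zpool import -f -R "$target" "$pool"

    # Bind the live system's virtual filesystems into the chroot.
    for fs in dev proc sys; do
        run mount --rbind "/$fs" "$target/$fs"
    done

    # Regenerate the initramfs and GRUB config, then reinstall the bootloader.
    run chroot "$target" update-initramfs -u
    run chroot "$target" update-grub
    run chroot "$target" grub-install "$disk"

    # Ask GRUB's own probe code what it detects for the root filesystem.
    run chroot "$target" grub-probe -v /

    # Tear everything down again before rebooting.
    for fs in sys proc dev; do
        run umount -R "$target/$fs"
    done
    run zpool export "$pool"
}
```

Calling `rescue_reinstall_grub` with no arguments prints the command list for review; run it with DRY_RUN=0 as root from the rescue environment only after checking that the pool and disk names match the machine.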

This server is not in production yet, so I can run some tests for now, and I'd like to solve this without reinstalling.

Please reply if you need more information or have any suggestions. Thank you very much.
 

can you repeat the "ls" test after running "set debug=zfs" first? it is likely the HW RAID that messes things up though..
 
Ok

ls shows 3 + 3 partitions:

partitions.jpg



This is the log for gpt1:

gpt1.jpg



then gpt2:

gpt2.jpg



and finally gpt9:

gpt9.jpg
 
Hello,

I tried to save the serial output; I hope the blank lines are not missing lines.
I hope you can get something useful out of it.

In the meantime, I need to bring this server up. Can you suggest a way to boot from a USB key, or some alternative method, so I can continue debugging without reformatting?

Thank you for your support
 

Attachment: boot700.log (7.2 KB)

there are obviously lines missing and/or corrupted..

you can work around the issue by moving /boot to a non-ZFS partition on a drive which is bootable, and installing grub there.
 
OK, this is another capture.

Please take a look and let me know.

Meanwhile, I tried to prepare a USB stick with an ext2 partition holding the boot directory, chrooted into the non-booting system and ran grub-install against the USB key, but it does not work. Can you explain in more detail?

Then I installed a fresh Proxmox with the same updates on a USB stick with LVM, and changed the kernel parameters in /boot/grub/grub.cfg from
Code:
linux   /boot/vmlinuz-4.13.13-6-pve root=/dev/mapper/pve-root ro  quiet
to
Code:
linux   /boot/vmlinuz-4.13.13-6-pve root=ZFS=rpool/ROOT/pve-1 boot=zfs quiet
and then ran grub-install. The system now seems to boot into the old installation, but I need to check thoroughly that everything is OK. Do you think this is a good solution?
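
The grub.cfg change itself can be written as a small filter, shown here only as a sketch. The function name `to_zfs_root` is made up for this example, and the pattern assumes the kernel line looks exactly like the one quoted (root=/dev/mapper/pve-root followed by ro). Keep in mind that grub.cfg is a generated file, so a hand edit like this is expected to be overwritten the next time update-grub runs.

```shell
#!/bin/sh
# Sketch: rewrite an LVM root kernel line into a ZFS root kernel line.
# Reads grub.cfg text on stdin and writes the modified text to stdout.
# Assumption: the root dataset defaults to rpool/ROOT/pve-1 as in the post.

to_zfs_root() {
    dataset="${1:-rpool/ROOT/pve-1}"
    # Replace "root=/dev/mapper/pve-root ro" (any spacing) with the ZFS form.
    sed -E "s#root=/dev/mapper/pve-root[[:space:]]+ro[[:space:]]+#root=ZFS=${dataset} boot=zfs #"
}
```

Usage would be something like `to_zfs_root < /boot/grub/grub.cfg > /tmp/grub.cfg.new`, followed by a diff against the original before copying anything into place.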

Thank you for your support.
 

Attachment: zfs.txt (6.9 KB)

that output unfortunately does not tell us much (except that Grub does not find a valid ZFS). you basically need to move all the /boot content into a separate partition that you mount as /boot, then run grub-install.
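
A rough sketch of what moving /boot could look like, to be run as root inside the installed system (or a chroot of it). Everything specific here is an assumption for illustration: the partition and disk are passed in by the caller (for example a hypothetical /dev/sdb1 on /dev/sdb), ext2 is used because that is what was tried earlier in the thread, and the helper names `run` and `move_boot_to_partition` are invented. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: copy /boot onto a separate non-ZFS partition and install GRUB there.
# With DRY_RUN=1 (the default) commands are only printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

move_boot_to_partition() {
    part="${1:-}"                  # target partition, supplied by the caller
    disk="${2:-}"                  # disk that the firmware boots from
    staging="${3:-/mnt/newboot}"   # temporary mount point (assumed path)

    if [ -z "$part" ] || [ -z "$disk" ]; then
        echo "usage: move_boot_to_partition PARTITION DISK [STAGING_DIR]" >&2
        return 1
    fi

    # Create the filesystem and copy the current /boot content onto it.
    run mkfs.ext2 "$part"
    run mkdir -p "$staging"
    run mount "$part" "$staging"
    run cp -a /boot/. "$staging/"
    run umount "$staging"

    # Mount the new partition as /boot, then reinstall and reconfigure GRUB.
    run mount "$part" /boot
    run grub-install "$disk"
    run update-grub

    echo "# remember: add an /etc/fstab entry that mounts $part on /boot"
}
```

The fstab entry is left as a reminder rather than written automatically, since referring to the partition by UUID is usually safer than by device name on a machine where disk ordering may change.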
 
