[SOLVED] Proxmox 6 EFI Boot ZFS Rpool Race to Import Fail

armouredking
Hey all,

Running into some problems getting the system to boot properly now that the release is out and I want to use a ZFS mirror instead of just a single drive. For clarity, the system showed no issues booting from ext4 or xfs on NVMe or SATA SSDs (when used by themselves), but now that I'm attempting to use two SATA SSDs as a mirrored rpool for PVE, I'm hitting a race condition on boot that I seemingly cannot correct.

In the old days you added rootdelay=x to your Grub file (sketched below, after the screenshot), but that does not seem to have any effect for me. I'm unsure how Grub works with EFI boot, and this system is set up for EFI boot - the initial boot loader looks like this:

upload_2019-7-23_9-44-5.png
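For reference, here is what I mean by the Grub file approach - a sketch of what I tried in /etc/default/grub ("quiet" is the stock default; rootdelay is the addition):

Code:
# /etc/default/grub - append rootdelay to the kernel command line:
GRUB_CMDLINE_LINUX_DEFAULT="quiet rootdelay=10"

# then regenerate the grub config:
update-grub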

Here's the problem occurring (looks familiar):
upload_2019-7-23_9-37-29.png

As one can clearly see, the necessary drives are not connecting/loading fast enough. This is repeatable: every single boot fails this way regardless of the Grub rootdelay setting, and no matter how ridiculous the value, boot just zips along as fast as it can.

Manually typing in the required import command at the prompt and then `exit` allows a proper boot sequence most of the time:

upload_2019-7-23_9-39-44.png

upload_2019-7-23_9-40-9.png

But sometimes you get shenanigans:

upload_2019-7-23_9-54-59.png

Changing the command to `zpool import -f -N rpool` will correct this annoyance.
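In case the screenshots disappear, the manual recovery at the busybox prompt boils down to this (pool name rpool as in my setup):

Code:
# at the initramfs/busybox prompt:
zpool import -N rpool       # import the root pool without mounting its datasets
zpool import -f -N rpool    # force the import if the first attempt complains
exit                        # leave the shell and let the boot continue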

This is obviously undesirable in an intended remote setup. I have attempted a fresh reinstall using different disks just to see if it would make a difference, but no luck. I have also tried the "rootwait" setting instead of "rootdelay", but there is no change in behaviour. What setting needs to be configured to give the system enough time to load the disks before attempting the zpool import at boot?

System Info:
CPU(s): 64 x AMD EPYC 7551P 32-Core Processor (1 Socket)
Kernel Version: Linux 5.0.15-1-pve #1 SMP PVE 5.0.15-1 (Wed, 03 Jul 2019 10:51:57 +0200)
PVE Manager Version: pve-manager/6.0-4/2a719255
 
So I found this article (link) on StackExchange that talks about changing the ZFS_INITRD_POST_MODPROBE_SLEEP='0' line. I attempted to edit both it and the preceding setting:

Code:
# Wait for this many seconds in the initrd pre_mountroot?
# This delays startup and should be '0' on most systems.
# Only applicable for Debian GNU/Linux {dkms,initramfs}.
ZFS_INITRD_PRE_MOUNTROOT_SLEEP='10'

# Wait for this many seconds in the initrd mountroot?
# This delays startup and should be '0' on most systems. This might help on
# systems which have their ZFS root on a USB disk that takes just a little
# longer to be available
# Only applicable for Debian GNU/Linux {dkms,initramfs}.
ZFS_INITRD_POST_MODPROBE_SLEEP='10'
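Those lines live in /etc/default/zfs (on my Debian/PVE install, anyway), and after editing I regenerated the boot files:

Code:
# rebuild the initramfs for all installed kernels so the new sleep values are picked up:
update-initramfs -u -k all

# and regenerate the grub config:
update-grub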

Following that, I still wind up with no delay occurring at boot and the import failing:

upload_2019-7-23_11-52-45.png



Think I've exhausted all I know how to do, and Google's not giving me much aside from this. I suppose the next step would be to ditch EFI boot and see what happens. I would prefer a workaround or fix for EFI if one exists, though.
 
Hmm - PVE 6 started supporting booting ZFS on EFI. For this, PVE uses systemd-boot instead of grub - which means that running `update-grub` does not help in that case.

Please try adding 'rootdelay=10' in /etc/kernel/cmdline (at the end of the text that's already there - the file must contain a single line)
and run `pve-efiboot-tool refresh`
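For illustration, the file could end up looking like this (a sketch - the exact root= value depends on your installation; rootdelay=10 just gets appended to the existing single line):

Code:
# contents of /etc/kernel/cmdline after the edit (must remain one single line):
root=ZFS=rpool/ROOT/pve-1 boot=zfs rootdelay=10

# then sync the change to all configured EFI system partitions:
pve-efiboot-tool refresh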

see also 'Troubleshooting and Known issues' -> 'Boot fails and goes into busybox' at https://pve.proxmox.com/wiki/ZFS:_Tips_and_Tricks (this has not yet been updated to include booting ZFS from EFI).

Hope this helps!
 
That did indeed help.

Additionally, for those who find this thread via Google: I was using a Supermicro H11SSL-NC board. Booting from the SATA connectors or the x4 PCIe slot works without issue; the onboard SAS 3008 chipset is the problem here. I'm using the 846XA chassis, so the run between the SSD caddies and the motherboard is very long, and I didn't trust regular SATA cables for that purpose. I'd prefer to use the mini-SAS couplers, as in my experience they are built to higher standards.

I've tried flashing it a few different ways, but it doesn't seem to bring the disks up fast enough for some reason, hence the need for this configuration option. Changing the chipset options in the BIOS had no effect (I tried setting Legacy BIOS and the boot disk). For those like me coming to this board from older tech: the LSI 3000 series has EFI options, and at least on the H11SSL-NC, enabling the EFI BIOS on those cards removes the typical LSI configuration screen during boot; it is replaced by an option inside the motherboard BIOS to manipulate the LSI controller directly. It also boots a lot faster that way.
 
Thanks for the feedback that it worked! It's always good to hear back, especially about rather new features!
Please mark the thread as SOLVED, so that others know what to expect - Thanks!
 
