ZFS - error: no such device, error: unknown filesystem, Entering rescue mode

Code:
pve-kernel-5.4 (6.4-1) pve pmg; urgency=medium
...
* proxmox-boot-tool: also handle legacy-boot ZFS installs
* proxmox-boot: add grub.cfg header snippet to log that the config is managed by this tool
...
-- Proxmox Support Team <support@proxmox.com>  Fri, 23 Apr 2021 11:33:37 +0200
Do the changes above address the issue of this topic?
Great question ... let's see if anyone from Proxmox cares to respond.
 
Yes. New installs (PVE 6.4) that use legacy booting and ZFS will get GRUB installed on all boot disks, using kernels/initrds/stage 2 from the corresponding ESP. The ESP content will be kept in sync and updated when new kernels are installed, when update-grub/update-initramfs is called manually, etc. The /boot directory in the rpool is no longer used for booting, but only as the source for the kernel/initrd images that are copied to the ESPs.
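In practice, a minimal read-only sketch of what that looks like on such an install (standard proxmox-boot-tool subcommands; no output shown here):

Code:
# list the configured ESPs and whether they are set up for UEFI or GRUB/legacy boot
proxmox-boot-tool status
# re-copy kernels/initrds and regenerate the bootloader config on all configured ESPs
# (kernel package hooks trigger this automatically on updates)
proxmox-boot-tool refresh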

A howto for switching to this boot mode will be made available soon.
 
@fabian thanks for the info above. Just to confirm if my understanding is correct:

1. New 6.4 installations with ZFS will, by default, no longer have issues with zpool upgrades or with GRUB on reboot?
2. I'm assuming the howto on switching to this new boot mode will indicate whether this will be seamless on 6.3 systems that are being upgraded and that are RUNNING, and will not require any reboots?

Thanks for your ongoing support and insights - and a great platform :)

Angelo.
 
New 6.4 installations with ZFS will, by default, no longer have issues with zpool upgrades or with GRUB on reboot?
Exactly. In new 6.4-based installations, Proxmox VE makes GRUB boot from the ESP (a vfat partition, which is quite easy to handle) to avoid any issues with booting from ZFS directly. Accessing the ZFS pool is then done by the initial kernel + initramfs, which uses the same upstream OpenZFS code base as the actual system.

One may still want to wait before upgrading the rpool, as sometimes one needs to boot into an older kernel (if the newer one has some independent regression), and said older kernel may not know about all pool features yet. But that is nothing new: you can run into it with any upgrade that adds new features to a filesystem that older versions cannot understand, and it is in general far better than depending on the external ZFS support in GRUB.

So, a generally applicable process to make a bigger upgrade safer could look like the following (a command-level sketch follows the list):
  1. Upgrade to new ZFS + Kernel
  2. See if the new kernel works OK for a week or so; most regressions show up much quicker, so the extended period is just to be safe
  3. Only then do the ZFS pool upgrade.
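A hedged command-level sketch of that sequence, assuming the node already boots via proxmox-boot-tool or UEFI and uses the default pool name rpool:

Code:
# 1. upgrade to the new kernel + ZFS packages, then reboot at a convenient time
apt update && apt dist-upgrade
reboot
# 2. run the new kernel for a while; check the running version and pool health
uname -r
zpool status rpool
# 3. only after that trial period, enable the new pool features
zpool upgrade rpool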

2. I'm assuming the howto on switching to this new boot mode will indicate whether this will be seamless on 6.3 systems that are being upgraded and that are RUNNING, and will not require any reboots?
Yes, it's meant for a live and running system and should not interrupt any guest/storage workload.
A single reboot is required to activate the new way and to ensure the switch worked, but it is not necessary to do that immediately.
 
FYI: https://pve.proxmox.com/wiki/ZFS:_Switch_Legacy-Boot_to_Proxmox_Boot_Tool

It should cover both cases: a graceful switch now, and repair during error recovery.

As always, take care, avoid rushing, and make sure you have the right partitions when doing things like formatting them.
The steps are relatively simple, but if you're unsure in any way I'd recommend testing it through first, either in a test lab or in a virtualized PVE instance - better safe than sorry.
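For reference, a condensed sketch of the switch described in the wiki, assuming a 5.4+ disk layout where the second partition on each boot disk is the unused ~512M ESP (/dev/sdX2 is a placeholder - verify with lsblk first):

Code:
# identify the ~512M partition sitting next to the zfs_member partition
lsblk -o +FSTYPE
# format it as ESP and register it; repeat for every boot disk in the pool
# (format may need --force if the partition already carries a filesystem)
proxmox-boot-tool format /dev/sdX2
proxmox-boot-tool init /dev/sdX2
# verify that kernels were copied and the ESP is configured
proxmox-boot-tool status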
 
Thnx Thomas,

I can confirm the process works just fine.. :)

Just a very small suggestion to add to the notes (it's not immediately obvious): you have to upgrade to 6.4 first before trying to run proxmox-boot-tool, and make sure NOT to do a 'zpool upgrade' yet (last step).

Stating the obvious but I think that makes it idiot (me) proof..

Best,

Angelo.
 
Just want to thank the Proxmox team for finding a solution and fixing legacy BIOS + GRUB booting. I am sure many have older systems that don't have UEFI boot but are using ZFS on the rpool.

This all works great.

I actually fresh installed 6.4 as it was an easy alternative in my scenario.
 
I can confirm the process works just fine.. :)
Thanks for your feedback!

you have to upgrade to 6.4 first before trying to run proxmox-boot-tool
I added that now.

and make sure NOT to do a 'zpool upgrade' yet (last step).
Here it was not clear to me which place you meant; maybe I just overlooked it. Can you please tell me where exactly you would put that so it is most helpful?
 
To clarify: can a fresh install of 6.4 safely boot with features like zstd enabled/active?
 
I've read the doc but I'm not sure which partition to format to switch to proxmox-boot-tool on an old server that was upgraded to 6.4. Can I use sda1?

Code:
root@proxmox-3:~# lsblk -o +FSTYPE
NAME      MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT FSTYPE
sda         8:0    0 372.6G  0 disk
├─sda1      8:1    0  1007K  0 part
├─sda2      8:2    0 372.6G  0 part            zfs_member
└─sda9      8:9    0     8M  0 part

Code:
Disk /dev/sda: 372.6 GiB, 400088457216 bytes, 781422768 sectors
Disk model: X438_1625400MCSG
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 8192 bytes
I/O size (minimum/optimal): 8192 bytes / 8192 bytes
Disklabel type: gpt
Disk identifier: 9D43C1CC-0290-44ED-A5A7-385F9E770DB8

Device         Start       End   Sectors   Size Type
/dev/sda1         34      2047      2014  1007K BIOS boot
/dev/sda2       2048 781406349 781404302 372.6G Solaris /usr & Apple ZFS
/dev/sda9  781406350 781422734     16385     8M Solaris reserved 1

I'm assuming after following the procedure my server will need to boot in UEFI mode from the BIOS ?
 
I've read the doc but I'm not sure which partition to format to switch to proxmox-boot-tool on an old server that was upgraded to 6.4. Can I use sda1?

No - your install is too old and doesn't have an ESP at all. Depending on your level of familiarity with ZFS and lower-level tools like sgdisk (and the redundancy of your rpool), you can either:
- completely reformat one of your vdevs to have an ESP (invasive, potentially dangerous, temporarily reduces redundancy)
- format a spare disk/USB drive/.. as ESP and use that to boot
 
I've read the doc but I'm not sure which partition to format to switch to proxmox-boot-tool on an old server that was upgraded to 6.4. Can I use sda1?
No, that is too small. Your old server was probably installed with a Proxmox VE ISO from before 5.4, which did not create the partition required here.

I'm assuming after following the procedure my server will need to boot in EFI mode from the BIOS ?
No, the whole idea is to keep the currently used boot method, as not all servers can actually boot via UEFI.
 
Note, it's also still a valid option to keep the rpool as is and never run zpool upgrade on it - your setup would boot fine with that.
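If in doubt whether an upgrade already happened, a quick read-only check (it does not change the pool) could look like this:

Code:
# list the feature flags of the root pool and their state (disabled/enabled/active)
zpool get all rpool | grep feature@
# zpool status also hints when supported features are not yet enabled
zpool status rpool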
 
Note, it's also still a valid option to keep the rpool as is and never run zpool upgrade on it - your setup would boot fine with that.
Yeah, unfortunately I did an rpool upgrade already. So I'm assuming my best bet at this point is to reinstall said node using the ISO?
 
Adding another disk and configuring that as the initial boot device is also an option...
The tutorial would be quite similar; instead of searching for an available partition on the ZFS devices, you would create a ~512 MiB partition on a new disk and use that (the rest of the disk can be used for anything).

You could also use a USB pen drive or the like, as Fabian mentioned, but note that the reliability of those may not be so good (although kernel updates are not that frequent), so having a backup drive would really be recommended in that case.
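A rough sketch of that variant, assuming the new disk shows up as /dev/sdX (a placeholder - double-check the device name before running anything destructive) and that the host boots in legacy/BIOS mode; on a UEFI host the BIOS-boot partition would not be needed:

Code:
# GPT + legacy GRUB needs a tiny BIOS-boot partition, plus the ~512 MiB ESP
sgdisk -n1:0:+1M -t1:EF02 /dev/sdX
sgdisk -n2:0:+512M -t2:EF00 /dev/sdX
# format the ESP and register it; proxmox-boot-tool sets up GRUB or systemd-boot
# to match how the host currently boots
proxmox-boot-tool format /dev/sdX2
proxmox-boot-tool init /dev/sdX2
proxmox-boot-tool status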
 
Just want to add that we also hit this problem after doing a recent apt-get update/upgrade on one of our servers and running "zpool upgrade -a" afterwards. I only want to say that this is very serious and somewhat undocumented behavior that can cause a lot of problems for a sysadmin, especially when your server is colocated somewhere outside of your country of residence. I think a serious warning should be placed wherever possible to minimize the number of people hitting this problem.

And my very personal opinion - EFI/UEFI is an absolutely evil technology that should not be used unless you know what you're doing and how to handle it. Unfortunately, even many server-grade systems ship with EFI/UEFI enabled by default in the firmware, and many people don't even notice this until they fall into a trap like the one with GRUB and the ZFS upgrade.

Good luck everyone!
 
Just want to add that we also hit this problem after doing a recent apt-get update/upgrade on one of our servers and running "zpool upgrade -a" afterwards. I only want to say that this is very serious and somewhat undocumented behavior that can cause a lot of problems for a sysadmin, especially when your server is colocated somewhere outside of your country of residence. I think a serious warning should be placed wherever possible to minimize the number of people hitting this problem.
It is documented prominently in the 6.4 known issues section: https://pve.proxmox.com/wiki/Roadmap#6.4-known-issues
which in turn is linked to in our release threads.

See here for the full guide about how to switch away from booting directly via ZFS and note that this is for both UEFI and BIOS:
https://pve.proxmox.com/wiki/ZFS:_Switch_Legacy-Boot_to_Proxmox_Boot_Tool

And my very personal opinion - EFI/UEFI is an absolutely evil technology that should not be used unless you know what you're doing and how to handle it. Unfortunately, even many server-grade systems ship with EFI/UEFI enabled by default in the firmware, and many people don't even notice this until they fall into a trap like the one with GRUB and the ZFS upgrade.
Note that the problem at hand has absolutely nothing to do with UEFI, nor with switching to or from UEFI. The only reason UEFI is mentioned at all is that, on Proxmox VE, ZFS installs were already switched over to this new boot method whenever UEFI was used, employing an intermediate, quite universal FAT partition to boot into ZFS, as ZFS access is not available in any firmware (which is a good thing - no need to bloat firmwares with dozens of filesystem support implementations).

Besides that, both UEFI and legacy BIOSes are just firmware after all, with UEFI being standardized a lot more and providing an actual way of setting boot entries in the firmware, which improves several things like multi-boot and allows avoiding yet another boot stage of the kind most bootloaders add.
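To see which firmware mode a host actually booted in, a simple check is whether the kernel exposes EFI variables:

Code:
# the directory only exists when the system was booted via UEFI
if [ -d /sys/firmware/efi ]; then echo "booted via UEFI"; else echo "booted via legacy BIOS/CSM"; fi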
 
It is documented prominently in the 6.4 known issues section: https://pve.proxmox.com/wiki/Roadmap#6.4-known-issues
which in turn is linked to in our release threads.
Thanks for the pointer. I wanted to say that the warning could also be placed somewhere during apt-get dist-upgrade (from, let's say, Proxmox 6.3 to Proxmox 6.4) - for example, apt-get could stop and require the sysadmin to acknowledge this potential issue.

See here for the full guide about how to switch away from booting directly via ZFS and note that this is for both UEFI and BIOS:
https://pve.proxmox.com/wiki/ZFS:_Switch_Legacy-Boot_to_Proxmox_Boot_Tool
I actually followed this excellent howto to fix the boot loader.

As to EFI itself, perhaps I'm a very conservative person in this area; personally, I switch servers to legacy boot whenever I have access to the OS installation process in the first place.

cheers!
 
