Serious PBS ZFS bug?

yurtesen

Member
Nov 14, 2020
On a BIOS-type system, if one upgrades the `rpool` to support zstd, PBS does not boot anymore.
It says

Code:
error: unknown filesystem.
Entering rescue mode...
grub rescue>

Although I have done this operation on PVE boxes which use UEFI, with no adverse effects.

Furthermore, it is impossible to import the pool using the PBS ISO, because it ships an older kernel which does not support ZSTD.

Right now I am not sure whether UEFI vs. BIOS makes any difference. Is this a known issue?
 
Right now I am not sure whether UEFI vs. BIOS makes any difference. Is this a known issue?
The difference between (legacy) BIOS boot and UEFI boot on ZFS for PBS (and for PVE since 5.4, and PMG) is that in the BIOS case GRUB (and its own ZFS implementation) is used for booting, and GRUB's implementation does not support all ZFS features, especially not new ones like ZSTD.

In the UEFI case, systemd-boot is used and the kernel+initrd are read from the (VFAT-formatted) ESP.

the PVE documentation has a bit of information about that:
https://pve.proxmox.com/pve-docs/chapter-sysadmin.html#sysboot

Put shortly: if at all possible, install ZFS only on UEFI-enabled systems.
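For anyone unsure which mode their box actually booted in, the standard Linux check is whether sysfs exposes EFI variables. A quick sketch, nothing Proxmox-specific:

```shell
# Detect how the running system was booted (standard Linux sysfs check).
if [ -d /sys/firmware/efi ]; then
  boot_mode=UEFI   # systemd-boot path: kernel+initrd are read from the VFAT ESP
else
  boot_mode=BIOS   # legacy path: GRUB has to read /boot from the ZFS pool itself
fi
echo "Detected boot mode: $boot_mode"
```

On a BIOS result, any `zpool upgrade` of the pool holding /boot is risky.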
 
Right now I am not sure whether UEFI vs. BIOS makes any difference. Is this a known issue?
Yes, this is already known for Proxmox VE. Sorry, but there is not much you can do to fix this.
 
@Stoiko Ivanov you are assuming systemd-boot will always support all the ZFS features shipped with the main kernel, in perfect sync (is that guaranteed?). What if systemd-boot is also slightly behind? Effectively this makes it dangerous to enable ZFS features on Proxmox installations, including UEFI installations.

If you try the Ubuntu installer, it creates a bpool in addition to the rpool, and I am guessing this is exactly the kind of eventuality they are trying to avoid... Actually, I did an Ubuntu 20.10 installation, and by default the bpool is created with a bunch of features left disabled.

Code:
Some supported features are not enabled on the following pools. Once a
feature is enabled the pool may become incompatible with software
that does not support the feature. See zpool-features(5) for details.

POOL  FEATURE
---------------
bpool
      multi_vdev_crash_dump
      large_dnode
      sha512
      skein
      edonr
      userobj_accounting
      encryption
      project_quota
      device_removal
      obsolete_counts
      zpool_checkpoint
      spacemap_v2
      allocation_classes
      resilver_defer
      bookmark_v2

Maybe this approach should be considered. It looks like many people have had problems due to this incompatibility between GRUB and ZFS already (and the proof is at @avw 's link). Also, I feel the installer should tell the user about this shortcoming when BIOS is used. It would be nice to have it mentioned in the documentation too, but documentation is not enough. This is a pretty critical issue, I would say...

In my case... unfortunately the server is a DL380p Gen8, so it has no UEFI. In addition, I now have no choice whatsoever, as one cannot downgrade ZFS features and apparently GRUB does not understand the new ones...

So... I wonder what I should do now. It is pretty hard, because there is apparently no way to fix it. I would have liked to re-install using a bpool for /boot, but the Proxmox installer does not give any options for that. I guess one way would be to leave some space at installation time and then create a bpool manually, though that is something I would rather not do :)
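For the record, a manual bpool along the lines of what the Ubuntu installer does could look roughly like the sketch below. The feature list mirrors Ubuntu's GRUB-safe set, and /dev/sdX3 is a hypothetical spare partition; this is untested on Proxmox, so treat it as an outline, not a recipe. The DRY_RUN guard only prints the commands:

```shell
# Sketch: create a /boot pool restricted to GRUB-readable ZFS features,
# similar to the bpool the Ubuntu installer sets up.
# /dev/sdX3 is a hypothetical partition; DRY_RUN=1 prints commands
# instead of executing them (set to 0 on a real system, carefully).
DRY_RUN=1
run() { if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi; }

run zpool create -d -o ashift=12 \
    -o feature@async_destroy=enabled \
    -o feature@bookmarks=enabled \
    -o feature@embedded_data=enabled \
    -o feature@empty_bpobj=enabled \
    -o feature@enabled_txg=enabled \
    -o feature@extensible_dataset=enabled \
    -o feature@filesystem_limits=enabled \
    -o feature@hole_birth=enabled \
    -o feature@large_blocks=enabled \
    -o feature@lz4_compress=enabled \
    -o feature@spacemap_histogram=enabled \
    bpool /dev/sdX3
run zfs create -o mountpoint=/boot bpool/BOOT
```

Since the pool is created with `-d` (all features disabled) and only the listed features enabled, a later `zpool upgrade` on the rpool cannot break GRUB's view of /boot.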
 
It looks like many people had problems due to this incompatibility between GRUB and ZFS already.

Not that many; those without problems obviously do not post.
Since 6.0, systemd-boot is used for ZFS/UEFI; almost all servers support this and it is the recommended way.

If you run totally outdated servers you are faced with many issues; that is why these servers are so cheap.
 
@tom I do not want to start an argument, but your comment is not productive. The hardware did not fail, and hardware is not "totally outdated" just because it does not support UEFI. It is a backup server; it should not have to be the latest high-end machine either.

Yes, sure, I was trying to re-purpose an older machine so it would not have to go to landfill, so I can be green and help save the planet... Can you blame me for that? :)

This is the only issue I have had with that machine so far. I understand it makes little sense for Proxmox to implement a custom pool structure as a fix for old installations. But isn't it possible to at least show a warning to improve the user experience? Maybe refuse to install /boot on ZFS unless the user agrees to take the risk on BIOS systems?

That said, I agree that ZFS on the boot disk seems like a bad idea on this hardware, so I think I may stop using it there. A 3-way RAID1 should be quite enough for all intents and purposes as a boot disk. As I now see that this situation is quite unfixable, I would rather remove any chance of it ever happening again on that server.
 
I recently installed two used Supermicro servers in production, and it looks like they are in legacy mode. I am on the enterprise repository and have not been offered this upgrade yet. But I have been reading that I might be able to switch the BIOS safely to UEFI mode without having to reinstall, so I will try that... fingers crossed.
 
it should not have to be the latest high-end machine either.

That is a complete misunderstanding. To run a fast Proxmox Backup Server, you need a high-end server with SSD storage; otherwise you will see performance issues.

The ZFS/GRUB issues are already solved by using systemd-boot. With the upcoming 7.0 we will re-evaluate whether to improve GRUB or replace it.
 
Reading a bit through the OpenZFS repo, I found this:

Add "compatibility" property for zpool feature sets [2021 version] #11468:
https://github.com/openzfs/zfs/pull/11468

With the advent of zpool features, it is sometimes desirable to specify 'sets' of features to be applied to a zpool - for example, "all features supported by the current version of grub" and to limit the features applied by (eg) zpool create or zpool upgrade to not inadvertently enable unwanted features (in the case of grub incompatibility, this might result in an unbootable system).

It seems to be staged for OpenZFS 2.1.0:
https://github.com/openzfs/zfs/releases/tag/zfs-2.1.0-rc1

Perhaps this could be part of the solution.
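If that PR lands as described, usage would presumably look something like this. This is a guess based on the PR text: `grub2` refers to a shipped compatibility file, and the pool/device names are placeholders. The DRY_RUN guard only prints the commands:

```shell
# Sketch of the proposed "compatibility" property (OpenZFS >= 2.1, PR #11468).
# Pool and device names are placeholders; DRY_RUN=1 prints commands
# instead of executing them.
DRY_RUN=1
run() { if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi; }

run zpool create -o compatibility=grub2 bpool /dev/sdX3  # new pool, GRUB-safe
run zpool set compatibility=grub2 rpool                  # tag an existing pool
run zpool upgrade rpool   # would then enable only GRUB-readable features
```

That would have prevented exactly the accident in this thread: `zpool upgrade` refusing to enable zstd on a pool GRUB has to read.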
 
That is a complete misunderstanding. To run a fast Proxmox Backup Server, you need a high end server with SSD storage, otherwise you will see performance issues.

SSDs can easily be slid into the SFF slots on this machine if needed, and this problem was not related to SSD vs. HDD, so I do not understand your point. But, like I said, there is no point in arguing about this; I already agreed that with BIOS, /boot on ZFS seems to be a bad idea.
But there is literally nothing else wrong with this hardware other than not supporting UEFI. I believe HP simply did not want to introduce UEFI in that series, even though other vendors were already using it...

@janssensm that is an interesting PR. Hopefully this problem is eventually resolved.. Thanks for pointing it out.
 
I believe the Proxmox VE 6.x installers create ESP partitions (which are unused with GRUB) on each (mirrored) drive, and you could store a copy of /boot there, or even make GRUB use those partitions when booting (mount /dev/sdX2 at /mnt/boot, cp /boot /mnt/boot/boot, grub-install /dev/sdX --boot-directory=/mnt/boot/boot, or something like that; please be careful). This will only work if you do it before the rpool becomes unreadable by GRUB.
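The procedure above, written out as a command sketch. Device names are examples, and as noted it only helps while GRUB can still read the pool. The DRY_RUN guard only prints the commands:

```shell
# Sketch: copy /boot to the (VFAT) ESP and point GRUB at it.
# /dev/sdX and /dev/sdX2 are example device names; repeat per mirrored disk.
# DRY_RUN=1 prints commands instead of executing them.
DRY_RUN=1
run() { if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi; }

run mount /dev/sdX2 /mnt/boot
run cp -a /boot /mnt/boot/          # results in /mnt/boot/boot
run grub-install --boot-directory=/mnt/boot/boot /dev/sdX
run umount /mnt/boot
```

After this, GRUB's stage files and the kernel live on VFAT, so a later `zpool upgrade` of the rpool can no longer make the machine unbootable (the copy must be refreshed after kernel updates, though).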
 
@avw thanks, it is a good idea, although I have re-installed the affected machine with RAID1 (mdadm) and ext4 on it.
But I tried to test your idea in a VM now. There is an EFI partition of 512 MB created by the Proxmox installer. I converted that partition to ext4, but grub-install still said "unknown filesystem". In addition, in grub rescue I am not able to do `ls (hd0,2)`; it says the same... I think I need to study it more at some point :)
 
Hi,

Maybe a much simpler idea is to use a dedicated PBS OS boot disk (ext4, for instance): no more GRUB problems! Then use your other disks for the ZFS pool.

Good luck / Bafta !
 