ZFS - error: no such device, error: unknown filesystem, Entering rescue mode

I wonder how the Proxmox installer of the next version with ZFS rpool (and GRUB) will boot without UEFI...
The current situation of two ways to boot Proxmox, each with different configuration files, often comes up in the issues on this forum: Did you change the boot configuration? Which boot configuration do you use? How do you find out which boot configuration is used? This technicality is much more important than most users realize, and it would appear that the documentation and warnings about it are not reaching the (potentially) affected users. Every time ZFS gets new features, people get excited and want to use them, but even older features are dangerous to enable with GRUB, since they cannot be reverted once a single block anywhere on the rpool uses them. Everyone affected only finds out afterwards, and most of the time a reinstall was the way to fix it...
As an alternative, would it be possible to get GRUB to boot from the ESP partitions? Of course, this would require the ESP partitions to exist beforehand, and maybe not everything required is currently copied to those ESP partitions (yet)?
technically yes. once pve-efiboot is set up, kernels+initrds are available on each synced ESP, and you could point grub's stage 1 (which is in the BIOS boot partition) at a stage 2 that you put on the ESPs as well. pve-efiboot would need to handle the grub config and install as well though, similar to how it handles the systemd-boot config now.
 
If you already have one or more ESP partitions (or can add one, possibly on another bootable drive), the following helped me to keep GRUB booting. Do not copy-paste it without understanding the commands, and use it at your own risk!
Code:
mkdir -p /mnt/ESP && mount /dev/sda2 /mnt/ESP   # Use your ESP partition instead of /dev/sda2!
mkdir -p /mnt/ESP/boot && cp -r /boot /mnt/ESP/
mount --bind /boot /mnt/ESP/boot
grub-install /dev/sda2 --boot-directory=/boot   # Use your ESP partition instead of /dev/sda2!
update-grub   # Now boot entries should point to the ESP as well.
Now you have to keep the /boot folder on your ESP partition up to date with the real /boot folder (by re-running the mount and copy commands above) forever...
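That re-copy step could be scripted so it is harder to forget; a minimal sketch, assuming /dev/sda2 is your ESP as in the commands above (adjust the device to your system):

```shell
#!/bin/sh
# Re-sync the ESP copy of /boot after every kernel update.
# Assumption: /dev/sda2 is your ESP partition (change it!).
mkdir -p /mnt/ESP
mount /dev/sda2 /mnt/ESP
rsync -a --delete /boot/ /mnt/ESP/boot/
umount /mnt/ESP
```

Run it manually after each kernel upgrade, or hook it into your update routine.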

EDIT: As @Ansy pointed out, the commands above only help while you still have a running Proxmox, and only if you run them before new features go from enabled to active (or you can move the disk to another up-to-date Proxmox, or maybe boot the next-version Proxmox ISO)! feature@zstd_compress, for example, goes from enabled to active when you set compress=zstd, even if you do so on a subvol of rpool and don't create any files or blocks. And it cannot be deactivated again, according to the error message when you try to undo it.
 
use Proxmox ISO to chroot into the existing rpool
You cannot even import/mount the existing ZFS pool, because the current Proxmox ISO does not support the fresh ZFS features.

And you also cannot roll back new ZFS features, as the action clause below shows:
Code:
root@server ~ # zpool status
  pool: rpool
state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(5) for details.
  scan: scrub repaired 0B in 05:09:24 with 0 errors on Sun Feb 14 05:33:26 2021
config:

        NAME        STATE     READ WRITE CKSUM
        hdd         ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sda     ONLINE       0     0     0
            sdb     ONLINE       0     0     0

errors: No known data errors

GRUB is one of those pieces of software, unfortunately :( And of course, I read the changelog, found the great zstd feature there, and ran zfs set compression=zstd rpool on my ZFS rpool (/ROOT/boot is also there). After that was done, the server kept working for days/weeks/months (actively filling blocks with zstd-compressed data) until a real need to reboot came up... Ooops! Your hardware has no UEFI at all, so how do you roll back that many blocks to lz4 now?

I think only by installing a full Proxmox system on an additional drive without ZFS (lvm+ext4 by default), upgrading it until all new ZFS features are supported, and then doing the rest to save your ZFS pool data. And after zpool import -f rpool you lose access to the freshly installed ext4 /boot for reinstalling GRUB, because of root=ZFS=/ROOT/pve-1 now:
Code:
root@server:~# grep ROOT /boot/grub/grub.cfg
        linux   /ROOT/pve-1@/boot/vmlinuz-5.4.101-1-pve root=ZFS=/ROOT/pve-1 ro root=ZFS=rpool/ROOT/pve-1 boot=zfs quiet
        initrd  /ROOT/pve-1@/boot/initrd.img-5.4.101-1-pve
 
If I install Proxmox without ZFS on a fresh drive, can I additionally use the rpool on the existing drives? Will the import destroy anything on my new installation? Or can/must I migrate the existing VMs to the new drive? Is there a FAQ or something similar where this is explained for people like me?
 
you'd need to import the pool without mounting (-N), and adapt the system-related datasets to be mounted somewhere else (or not at all). otherwise, ZFS will attempt to mount them over your new /, which will not work correctly.
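A rough sketch of what that could look like; the dataset name rpool/ROOT/pve-1 is from a default install and may differ on your system:

```shell
# Import the old pool without mounting any of its datasets (sketch; adjust names!)
zpool import -f -N rpool

# Re-home the old root dataset so it does not mount over the new /
zfs set mountpoint=/mnt/oldroot rpool/ROOT/pve-1
zfs mount rpool/ROOT/pve-1
```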
 
@fabian: when I follow the way you mentioned, can I then copy the VMs and the related config files to the new drive? And will the VMs then run from the new drive? Is there a step-by-step description available?

Many thanks for the support.

Stefan

PS: would support for this problem be included if I had a subscription?
 
you can copy the data stored on ZFS after mounting/importing. but /etc/pve from the old system will not be directly readable (it's stored in a sqlite DB), so the config files are a bit tricky. if you don't have any backups where you can recover the configs from, it might be best to follow https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_recovery inside a VM
 
I've imported the old rpool with the -N option and that's what I see with lsblk -f from the rpool:
Code:
sdd
├─sdd1
├─sdd2     vfat                F514-4718
└─sdd3     zfs_member  rpool   6143860967075433165
zd0
└─zd0p1    ext4                9215f5c1-2295-42e7-ac4d-1c5d1d098a62
zd16
├─zd16p1   ext2                5a52b80b-4566-4814-8b0e-79ecb43a776f
├─zd16p2
└─zd16p5   LVM2_member         SXkNmc-THj6-Jd0P-8DZa-fdWv-DArA-40ev1u
zd32
├─zd32p1
├─zd32p2   ext4                8fcc45e9-121e-4f38-9bee-cdde869ffe17
└─zd32p3   LVM2_member         GPxlus-pZBj-fXJn-nVJI-ZFrv-ePlA-tH6RHV
zd48
zd64
├─zd64p1   xfs                 b682b2ed-47b6-4ec4-bdad-1ddb8913b7f0
└─zd64p2   LVM2_member         jzCL1T-GZqc-BBdI-Vn4v-kDae-WeQo-ReRTNb
zd80
└─zd80p1   LVM2_member         eRWwZb-hnMC-xSsF-icXp-ocel-Jde8-5sOh1c
zd96
└─zd96p1   ext4                ce3682aa-8914-4e60-84ee-461117d1b011

I tried to mount all the partitions; ext4, ext2 and xfs are fine, but I could not find my needed data there. Only zd0p1 looks like a partition of a VM, but this VM has no data I need. I tried to look at the LVM2_member partitions with lvdisplay, but I can only see the LVM volumes of my new installation. How can I get access to the LVM partitions and the partitions without a named filesystem on the rpool? I assume that's where I will find my data...
 
I hope you don't take this the wrong way, but it does sound like you are in a bit over your head. it might make sense to get someone knowledgeable about PVE internals and lower-level Linux administration to guide you through this recovery, e.g. some of our partners might be able to help (https://proxmox.com/en/partners).

that being said, you might want to make a full disk backup first in case anything goes wrong. I will try to provide some more insight into what you are facing.

your rpool consists of filesystem datasets (which can be mounted) and zvols (which are presented as block devices). both can be copied, but the mechanism to do so is different:
- filesystem datasets need to be mounted first, then you can access them like any other filesystem (rsync, cp, ..)
- zvols are available as /dev/zdXX (and /dev/zvol/...), and can be accessed with any tool that works on block devices (dd, qemu-img convert, ...)

VM disks will be stored as zvols; to save them you can copy them to another block device (e.g., an LVM volume, an actual partition, ..) or into a raw image file, or convert them to a QCOW2 image with qemu-img. you don't want to activate the LVM groups and volumes on the partitions!

zfs list should give you an overview of available datasets including zvols. with zfs set you can change dataset properties such as where filesystem datasets get mounted.
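For example, saving one VM disk could look like this (the zvol name vm-100-disk-0 and the target path are hypothetical; check zfs list for your actual names):

```shell
# List all datasets including zvols to find the VM disks
zfs list -t all

# Copy a zvol into a QCOW2 image (names are examples, adjust to yours!)
qemu-img convert -O qcow2 \
    /dev/zvol/rpool/data/vm-100-disk-0 \
    /mnt/backup/vm-100-disk-0.qcow2
```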
 
@fabian: I think you are absolutely right with your assessment of my knowledge and my situation. I'm not so familiar with Linux internals. I've configured a couple of servers in the past, but I learned everything the hard way :(. For the last 5 years my system was running very well. I started with Proxmox when I had to upgrade my server and ran into a problem. And I chose ZFS because all the information I got about ZFS was very positive. That's the reason I started this way. And with Proxmox I had the vision of making the necessary updates for all the different tasks (in separate VMs) of my server without breaking everything. In the beginning everything worked very well and I had 4 VMs running for my needs. I was really happy. Then suddenly I had the booting problem with ZFS. And I found out that this happens rather often when you use GRUB. And I have to use GRUB because of my hardware.
I have installed ZFS with mirrored drives, so there is already a copy :). And getting help from a partner is also a consideration I had. But how can I tell who is the right person for my problem? Every partner will have different experience, and no one will tell me "oh sorry, I've never solved this problem".
So thank you very much for the time you spent on my problem. Let's hope for the best.
 
My system is up again. I got help from a Proxmox partner. He moved GRUB from ZFS to a separate ext4 partition. And I hope I will never run into this fault again. It was a very nice contact and I learned a couple of things. But I'm not able to tell you everything he did; it is too much to remember...

Stefan
 
I had the same issue. I felt it was easier to just reinstall the Proxmox OS, since I had day-old backups of all my VMs and LXCs.

What is the correct command to disable the new features? Which new features should be disabled? I did not see this information in any previous posts, nor anywhere else on the web.
 
If I understood correctly everything I was told by the very competent Proxmox partner I contacted for help, this can/will happen again when GRUB is installed on a ZFS file system. The only way to prevent it for sure is to install GRUB on a non-ZFS partition. And when you use ZFS with mirrored drives, you can mirror the separate boot partition too. You will not have the safety of the ZFS file system, but a mirror is better than nothing. And in case of a corrupted GRUB, it is much easier to repair a separate GRUB partition than the ZFS one. That's what he did with my installation. And I deeply hope he is right... ;).
 
still no response to my question?

i have updated my ZFS and I'm afraid to reboot the system due to not being able to recover. Being colocated at a datacenter I can't access console easily ... anyone care to give some support?

I had the same issue. I felt it was easier to just reinstall the Proxmox OS, since I had day-old backups of all my VMs and LXCs.

What is the correct command to disable the new features? Which new features should be disabled? I did not see this information in any previous posts, nor anywhere else on the web.
 
still no response to my question?

i have updated my ZFS and I'm afraid to reboot the system due to not being able to recover. Being colocated at a datacenter I can't access console easily ... anyone care to give some support?
My experience is that you can usually safely upgrade your pools (but it is even safer not to run zpool upgrade at all, especially with GRUB) and to not enable anything that can break GRUB. The most common culprits: zfs set dnodesize to anything other than legacy, and zfs set compression=zstd on rpool or any file system/subvol/zvol below it.
EDIT: Probably best not to upgrade your pools until there is a new Proxmox installer ISO with a ZFS version new enough to access upgraded pools, so it can work as a rescue disk if the system fails to boot.
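Before rebooting, you could check what state your pool is actually in; a sketch (GRUB only cares about features that are active, not merely enabled):

```shell
# Show pool features that are already active (enabled-only is harmless for GRUB)
zpool get all rpool | grep 'feature@' | grep active

# Check the properties most often responsible for breaking GRUB
zfs get -r -t filesystem compression,dnodesize rpool
```

If neither dnodesize other than legacy nor compression=zstd shows up, and no GRUB-incompatible feature is active, rebooting should be as safe as it was before.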
 
My experience is that you can usually safely upgrade your pools (but it is even safer not to run zpool upgrade at all, especially with GRUB) and to not enable anything that can break GRUB. The most common culprits: zfs set dnodesize to anything other than legacy, and zfs set compression=zstd on rpool or any file system/subvol/zvol below it.
EDIT: Probably best not to upgrade your pools until there is a new Proxmox installer ISO with a ZFS version new enough to access upgraded pools, so it can work as a rescue disk if the system fails to boot.

Thanks. My concern is that my rpool was already upgraded and compression changed to zstd. I wanted to find out how to downgrade... I don't recall which features were upgraded specifically.

So I'm wondering about solutions, since Proxmox didn't include any safety measures to keep users from upgrading when it wasn't safe... I doubt I'm the only user in this situation; I was under the assumption Proxmox wouldn't be negligent in their releases.
 
Thanks. My concern is that my rpool was already upgraded and compression changed to zstd. I wanted to find out how to downgrade... I don't recall which features were upgraded specifically.
Upgrading itself is probably not the problem: feature@zstd_compress becomes enabled and can then be used for file systems/volumes inside the pool. If zstd compression is (or was) actually used for anything under rpool, the feature goes to active and GRUB definitely won't boot anymore. The feature cannot be set back to disabled (that is only possible when creating the pool), and I'm not sure whether deleting all files/blocks that used it is enough to make zpool get all rpool no longer show active for that feature.
As you can see in earlier posts, some have had success with moving the /boot directory to another file system on another disk or partition and having GRUB boot from there. This is easier to do while you still have a working Proxmox.
 
Upgrading itself is probably not the problem: feature@zstd_compress becomes enabled and can then be used for file systems/volumes inside the pool. If zstd compression is (or was) actually used for anything under rpool, the feature goes to active and GRUB definitely won't boot anymore. The feature cannot be set back to disabled (that is only possible when creating the pool), and I'm not sure whether deleting all files/blocks that used it is enough to make zpool get all rpool no longer show active for that feature.
As you can see in earlier posts, some have had success with moving the /boot directory to another file system on another disk or partition and having GRUB boot from there. This is easier to do while you still have a working Proxmox.
It's a colo box; I can't add/remove drives remotely. I need to be able to disable the zstd compression and any upgrades and revert back. Pretty poor form of Proxmox to release upgrades that break existing installs.
 
It's a colo box; I can't add/remove drives remotely. I need to be able to disable the zstd compression and any upgrades and revert back. Pretty poor form of Proxmox to release upgrades that break existing installs.
I understand your need, but disabling or reverting appears not to be supported by ZFS or Proxmox. As far as I know, you can only work around this by moving the /boot directory to another partition. This is the only help I can offer as a fellow user (who had the same problem years ago). If you cannot do this yourself, maybe you can show me the partitions on your drive(s) using gdisk -l?
 
Code:
pve-kernel-5.4 (6.4-1) pve pmg; urgency=medium
...
* proxmox-boot-tool: also handle legacy-boot ZFS installs
* proxmox-boot: add grub.cfg header snippet to log that the config is managed by this tool
...
-- Proxmox Support Team <support@proxmox.com>  Fri, 23 Apr 2021 11:33:37 +0200
Do the changes above address the issue of this topic?
 
