[SOLVED] Grub + dnodesize = auto => grub rescue, unknown filesystem

H25E

Nov 5, 2020
Hello,

I have a Proxmox machine (v5.4, I think) installed on ZFS, including the root filesystem. According to the Proxmox wiki, it should therefore have been installed with systemd-boot instead of grub2:

Proxmox VE currently uses one of two bootloaders depending on the disk setup selected in the installer. For EFI Systems installed with ZFS as the root filesystem systemd-boot is used. All other deployments use the standard grub bootloader.

The thing is that the machine has grub2 (I don't know why; I inherited it like that), and after I changed dnodesize to auto, grub could no longer read the disks on the next reboot.

Code:
error: no such device: 40d7d14f38cc...
error: unknown filesystem
Entering rescue mode...
grub rescue>

As you can see here it's a known issue.

How can I remove grub2, then install and configure systemd-boot to boot from the already existing rpool?

Thanks for your time, best regards,

H25E
 
Grub gets installed as fallback.

systemd-boot is used in efi mode, grub for legacy boot.

It's a bios setting, select uefi/efi boot mode instead of legacy.
 
I didn't know that grub was installed as a fallback, but in the BIOS both Storage Boot Option Control and Other PCI devices are set to UEFI.

What else can I do? It seems to me that systemd-boot isn't installed. How can I check whether it is?
 
That's odd.

Proxmox doesn't keep /boot/efi mounted after boot.

But "efibootmgr -v" should list the efi entries if they exist.

Should print the disk gpt partition id like:
Code:
Boot0000* Linux Boot Manager    HD(2,GPT,XXXXXX-XXXX-XXXX-XXXX-XXXXXXXX,0x800,0x100000)/File(\EFI\systemd\systemd-bootx64.efi)

You can rewrite the efi bootloader with "pve-efiboot-tool refresh"
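A minimal sketch of that check, assuming a shell on the affected host or on a rescue system. The function names are made up for illustration; the only facts relied on are that /sys/firmware/efi exists when Linux was booted via UEFI and that efibootmgr can only list entries in that case.

```shell
#!/bin/sh
# Report how the running system was booted.
# The optional argument is the sysfs root; it defaults to /sys and exists
# only so the function can be tried against a scratch directory.
boot_mode() {
    sysfs_root="${1:-/sys}"
    if [ -d "$sysfs_root/firmware/efi" ]; then
        echo "uefi"
    else
        echo "legacy"
    fi
}

# List the firmware boot entries only when that can work at all.
show_boot_entries() {
    if [ "$(boot_mode)" = "uefi" ] && command -v efibootmgr >/dev/null 2>&1; then
        efibootmgr -v
    else
        echo "not booted via UEFI (or efibootmgr is missing): no EFI entries to list"
    fi
}
```

Calling show_boot_entries on a system that was started through GRUB in legacy/CSM mode should end in the second branch, which already answers the question of which boot mode is in use.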
 
Yes, it's strange.

If I disable CSM in the BIOS, the system doesn't boot. Instead, it opens the BIOS setup again. If I choose to exit without saving, it restarts and opens the BIOS setup again. And that's it.

With CSM enabled and Storage Boot Option Control = UEFI in the BIOS settings, I get the GRUB error. So I can't boot into the system to run efibootmgr.
 
I ran into the dnodesize=auto problem some time ago and know how it feels. At the time, there was no systemd-boot support at all...

Does your UEFI support selecting the boot device when starting? Do you see two options for each disk (one UEFI and one Legacy or Linux Boot Manager or something)?
If there are UEFI entries, try them and run pve-efiboot-tool refresh after booting from UEFI. I don't think it can set the boot entry when booting with GRUB.
Does the drive have a GPT or only an MBR? To be able to boot using systemd-boot you need GPT and an ESP partition.
 
Thanks for the empathy, hehe...

In the UEFI settings, CSM Support = ENABLED is selected, as it has been all along (CSM support means BIOS/legacy support mode). With that setting the system booted fine before the dnode problem, and now it ends in the GRUB error.

Now, if I change to CSM Support = DISABLED, the system reboots continuously and enters the UEFI setup automatically without booting into anything. What does this mean?

So I suspect that this isn't a UEFI installation (it wasn't installed by me). But if I remember correctly, there were EFI partitions... I also believe that the disks are GPT, but I don't know how to check that right now from the UEFI setup.

Anyway, I have installed the same Proxmox version onto a pendrive, and I'm going to boot from it and rewrite the rpool partition. I think it's risky, but I don't know what else to do.

What would you do?
 
H25E said:
Now, if I change to CSM Support = DISABLED, the system reboots continuously and enters the UEFI setup automatically without booting into anything. What does this mean?

It would appear that booting from UEFI is not set up correctly on your disks.
Does your UEFI have a hotkey to boot from another device (CD or pendrive or such)? Does it give you any options?

H25E said:
So I suspect that this isn't a UEFI installation (it wasn't installed by me). But if I remember correctly, there were EFI partitions... I also believe that the disks are GPT, but I don't know how to check that right now from the UEFI setup.

You could try booting a Linux Live CD/USB to check. The Ubuntu Desktop 20.04 installation ISO can boot you into a desktop and has ZFS support.

H25E said:
Anyway, I have installed the same Proxmox version onto a pendrive, and I'm going to boot from it and rewrite the rpool partition. I think it's risky, but I don't know what else to do.
What would you do?

I installed a fresh Proxmox on a pendrive and used that to boot. During startup Proxmox would get confused about the two rpools. I think I manually chose the rpool on the hard disks and later renamed the rpool on the pendrive (which could still boot). From then on I kept the pendrive in sync with my rpool (which was less than 15GB) and made the system boot from it. I think I only needed to sync the /boot directory, as that was all that GRUB needed to read before switching to the hard disk rpool.

In theory you could boot from the pendrive, update that rpool with the data from your hard disk rpool and then overwrite the hard disk partition with the pendrive. I decided that, once Proxmox supported booting off ZFS with UEFI, it was easier to reinstall (to use systemd-boot) and recover the VMs (from a different ZFS pool or from backup).

Given that you seem to have inherited an older Proxmox (without documentation), I would suggest booting from the pendrive once to see if it gets the original installation working (by choosing the hard disk rpool by number). That should give you an opportunity to back up and document everything safely.
And then do a new Proxmox 6.3 installation (on other hardware or disks, if possible, to prevent accidental erasure) and restore the VMs there.
 
I have booted Proxmox from the USB flash drive and confirmed that it isn't a UEFI installation. sda1 is a 1MB "bios boot" partition (literally the fdisk type) instead of a 500MB UEFI one.

So, no option to bypass grub limitation booting from systemd-boot...

I think I'm going to copy all the rpool content to a second bigger pool that the server has, and then bring it back.

I know it would be cleaner to start over with other server disks and Proxmox 6.3, but this is not a good moment to do it.
 
H25E said:
I have booted Proxmox from the USB flash drive and confirmed that it isn't a UEFI installation. sda1 is a 1MB "bios boot" partition (literally the fdisk type) instead of a 500MB UEFI one.

If you install a new Proxmox with UEFI and ZFS, it will also create a 1MB BIOS boot partition, followed by a second, 512MB FAT32 ESP partition (sda2). I think older versions of fdisk do not detect GPT; using gdisk instead of fdisk will show you whether GPT is used.
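As an alternative to reading the gdisk table by eye, the partition roles can be derived from the GPT type GUIDs. This is only a sketch: part_role and list_roles are hypothetical helper names, /dev/sda is just the disk mentioned in this thread, and only the two type GUIDs relevant here (EFI System Partition and BIOS boot) are recognised.

```shell
#!/bin/sh
# Translate a GPT partition type GUID into a short label.
part_role() {
    case "$(printf '%s' "$1" | tr 'A-Z' 'a-z')" in
        c12a7328-f81f-11d2-ba4b-00a0c93ec93b) echo "ESP" ;;
        21686148-6449-6e6f-744e-656564454649) echo "BIOS boot" ;;
        *) echo "other" ;;
    esac
}

# Print every partition of a disk together with its role.
# Rows without a type GUID (the disk itself) are skipped.
list_roles() {
    lsblk -rno NAME,PARTTYPE "$1" | while read -r name guid; do
        if [ -n "$guid" ]; then
            echo "$name: $(part_role "$guid")"
        fi
    done
}
```

Running list_roles /dev/sda on a disk that can boot with systemd-boot should show one line labelled ESP; if only a BIOS boot line appears, the disk was prepared for legacy GRUB only.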
If you have spare disk space, you could create a ZFS pool (with some name), copy/rsync rpool, remove the old rpool and rename the new pool to rpool. You do need to boot a LiveCD with ZFS support in order to remove or rename rpool.
 
I have done the same with gdisk and the outputs are the same. There is the bios partition but not the ESP.

I have enough spare disk space. But instead of rsync I have done the following (after changing dnodesize to legacy of course):

Code:
zfs snapshot rpool/ROOT/pve-1@grub-error
zfs send -Rv rpool/ROOT/pve-1@grub-error | zfs receive -F rpool/ROOT/pve-backup
zfs unmount rpool/ROOT/pve-1
zfs unmount rpool/ROOT/pve-backup
zfs destroy rpool/ROOT/pve-1@grub-error
zfs destroy rpool/ROOT/pve-backup@grub-error
zfs destroy rpool/ROOT/pve-1
zfs rename rpool/ROOT/pve-backup rpool/ROOT/pve-1
zfs set mountpoint=/ rpool/ROOT/pve-1

And nothing changed... Maybe the send/receive method didn't really rewrite the files, and copied them with the big dnodes?

EDIT: Did it with rsync, with the same results!!! ... disappointing.

The only other things that take up space on the pool are the few KB of the empty rpool, rpool/ROOT and rpool/data datasets, plus some zvols. They were also set to dnodesize=auto, but they are not real files! They can't be the source of the problem, can they?

Right now I have 0 idea of what to do next.
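To see which datasets still carry a non-legacy dnodesize setting, the property can be listed recursively and filtered. A small sketch, with non_legacy_dnodes as a made-up helper name; note that the property only governs newly written objects, so a clean list here does not prove that no large dnodes remain on disk.

```shell
#!/bin/sh
# Read "name<TAB>value" lines, as printed by
#   zfs get -r -H -o name,value dnodesize rpool
# from stdin and print the datasets whose dnodesize is not "legacy".
non_legacy_dnodes() {
    awk -F '\t' '$2 != "legacy" { print $1 }'
}

# Intended use on a system where the pool is imported:
#   zfs get -r -H -o name,value dnodesize rpool | non_legacy_dnodes
```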
 
ZFS send-receive duplicates the rpool including the dnodes that GRUB cannot handle (sorry for not warning you about this). I believe any other filesystem and zvol can cause dnodes of a different size to occur in the metadata of the rpool, and cause GRUB to fail to read. The irony is that there has been a fix for years but it is waiting on review and testing.

I think you need to create a fresh new pool on a new vdev and create a new ROOT, ROOT/pve-1 and data filesystem with the same settings as the old ones (except dnodesize). Copy the data from / to newpool/ROOT/pve-1 using cp -aR or rsync and then rename the old pool (to keep access to the zvols there) and rename the new pool to rpool. Boot Proxmox, add the old pool (with the new name) as a Storage, and then move the virtual disks over to the new rpool using the web GUI.
I hope this will help fix it, but I cannot guarantee that I did not forget any details.
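The steps above could be sketched roughly as follows. The function only prints the commands so they can be reviewed first; it does not run anything. The device path, the pool names newpool and oldpool, and the /mnt altroot are placeholders, the remaining dataset properties would still have to be matched to the old pool by hand, and reinstalling or reconfiguring the bootloader is not covered.

```shell
#!/bin/sh
# Print (not run) a command plan for rebuilding the root pool without
# large dnodes. Arguments: spare device, temporary pool name (optional).
print_rebuild_plan() {
    dev="$1"
    new="${2:-newpool}"
    cat <<EOF
zpool create -O dnodesize=legacy -O mountpoint=none -R /mnt/$new $new $dev
zfs create $new/ROOT
zfs create -o mountpoint=/ $new/ROOT/pve-1
zfs create $new/data
rsync -aHAXx / /mnt/$new/
zpool export $new
# then, from a live system with ZFS support:
zpool import -N rpool oldpool
zpool export oldpool
zpool import -N $new rpool
zpool export rpool
EOF
}
```

For example, print_rebuild_plan /dev/sdX2 prints the plan for a hypothetical spare partition /dev/sdX2. The renames are listed separately because the pool that holds the running root filesystem cannot be exported.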
 
You don't have to say sorry. Thanks for your help and time.

I can't add more disks to the system. (A modest server on consumer hardware with all sata ports used). But I have a bigger and secondary pool on the system, I can use it to copy everything from rpool to secondary-pool. Then completely delete all contents of rpool, (or maybe delete the pool itself and create it new) and finally bring back all the contents from secondary-pool to the new rpool and adjust names and mountpoints.

The problem is, how do I copy the zvols from rpool to secondary-pool? The only methods I have found are send/receive and dd, and both would propagate the dnode problem. Wouldn't importing them from the web GUI propagate the dnode properties too? Why would it be different? Could I do a send/receive to secondary-pool and then import them with the GUI, so that the dnode size is changed to legacy?

On the other hand, where is the zvol metadata stored? Maybe we could rewrite only that, like rpool/ROOT/pve-1?
 
There is also something strange. From the grub rescue console I can read neither the mirrored partition nor the BIOS partition.

The rpool is made up of two disks, which grub rescue> ls prints like this:
(hd0) (hd0,gpt9) (hd0,gpt2) (hd0,gpt1) (hd1) (hd1,gpt9) (hd1,gpt2) (hd1,gpt1) ...

gpt9 is the Solaris reserved partition, gpt2 the data partition and gpt1 the BIOS partition. Shouldn't gpt1 be readable by grub? Or can it also be affected by the dnodesize problem?
 
The first partition (1MB bios boot) contains the binary executable code of GRUB and is not readable as a filesystem, this is normal.

You need to rewrite all ZFS disk blocks without dnodesize=auto, which I believe is impossible. Therefore you need to create a fresh new rpool and copy everything there.
Maybe you can use the web GUI to move the virtual disks (zvols) stored on rpool to another pool on the machine (by booting from the pendrive as described earlier)? If the rpool is all the storage you have on the machine, you cannot easily do that.

How about removing half of the raid1/mirror (let's say hd1-gpt2), creating a new ZFS pool (on hd1-gpt2, let's say newpool) without dnodesize=auto, creating new ROOT, ROOT/pve-1 and data datasets (on newpool), copying everything from rpool/ROOT/pve-1 (to newpool/ROOT/pve-1), renaming rpool to oldpool and renaming newpool to rpool? In principle GRUB should be able to boot the new rpool, and you can add oldpool as a Storage in the web GUI and move all virtual disks. If everything works, you can erase the oldpool and add the vdev (hd0-gpt2) to the new rpool as a mirror (and not as an extension! check the command carefully). Maybe it is worth the risk to run without raid1 temporarily (run a zpool scrub rpool first!).

I would boot the old Proxmox using the pendrive, make backups of all VMs and containers (to a directory on rpool), run a scrub and remove all VMs and containers. Then remove one of the disks from the machine, wipe the other one and install Proxmox 6.3 on it with UEFI and systemd-boot. Then add the other disk and restore the VMs and containers from the backups (from the old rpool, which you need to import under another name). And then wipe the disk containing the old stuff and use it as a mirror again.
 
But as I said in my message 13 (which maybe you haven't read because I double posted, sorry):

The problem is, how do I copy the zvols from rpool to secondary-pool? The only methods I have found are send/receive and dd, and both would propagate the dnode problem. Wouldn't importing them from the web GUI propagate the dnode properties too? Why would it be different? Could I do a send/receive to secondary-pool and then import them with the GUI, so that the dnode size is changed to legacy?

I mean, zvols can't be rewritten with cp or rsync; they are moved as a block. If importing the zvols again into the new pool (with send/receive, dd or the GUI) propagates the old properties, the new pool will become unreadable again, won't it?

Thanks for your patience.
 
Good point! I think the data gets rewritten when you use Move Disk from the web GUI, but I am not sure. Making a backup and then restoring the VM on the new pool will work for sure.
 
It's crazy.

I have deleted all datasets except the base rpool and still the same grub error message....

I'm going to do a fresh UEFI installation. Any advice on how to carry the old configuration over to the new Proxmox more easily? I guess I can't directly overwrite the /etc/pve folder with the old one.
 
The feature is still turned on (and cannot be disabled probably) and therefore GRUB cannot read/detect the rpool. As stated before, you need to create a fresh new pool.
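Whether the feature is still in use can be read from the pool itself. The sketch below assumes the GRUB version discussed in this thread, which cannot read a pool while large_dnode is active; the helper name is made up. According to the zpool-features documentation, the feature only returns from active to enabled once every filesystem that ever contained a large dnode has been destroyed, which is not possible for the pool's root dataset.

```shell
#!/bin/sh
# Interpret the state of feature@large_dnode for a pool booted by GRUB.
# Possible states reported by zpool are: disabled, enabled, active.
large_dnode_state_for_grub() {
    case "$1" in
        active) echo "blocked" ;;
        disabled|enabled) echo "ok" ;;
        *) echo "unknown" ;;
    esac
}

# Intended use on a system where the pool is imported:
#   large_dnode_state_for_grub "$(zpool get -H -o value feature@large_dnode rpool)"
```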

My advice is to not use the whole disk for rpool but only 14GB or so (you can enter it in the installer). And to create an additional partition for a separate pool for the VMs.
I also advise to remove one of the hard disks and keep them for reference later and do a fresh install on the other disk drive.
 
I have done it this weekend and for the moment it's working well.

  1. Backed up all the ZFS datasets on rpool to a secondary pool, including ROOT/pve-1 (snapshot + send | receive works well for filesystem datasets and zvols)
  2. Reinstalled Proxmox with the latest available version (with a smaller rpool, and creating a second zpool with the remaining space)
  3. Brought back the backed-up datasets (except ROOT/pve-1)
  4. Took the VM/LXC configuration files from the backed-up ROOT/pve-1, edited them to match the new Proxmox storage names where needed, and copied them to the new /etc/pve/qemu-server|lxc/
All this because I was mistaken: it was a BIOS/legacy installation, and systemd-boot was unavailable for me.
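For step 4, the storage name in a guest configuration file can be swapped with a small helper like the one below. It is only a sketch: rename_storage is a made-up name, local-zfs and tank-vm are example storage IDs, and it assumes GNU sed and disk lines of the form "scsi0: local-zfs:vm-100-disk-0,size=32G". It is best run on a copy of the file before anything goes into /etc/pve.

```shell
#!/bin/sh
# Replace the storage ID at the start of each volume reference.
# Usage: rename_storage OLD NEW FILE   (FILE is edited in place)
rename_storage() {
    # Matches "key: OLD:volume..." and rewrites only the storage part.
    sed -i "s/^\([a-z0-9]*: \)$1:/\1$2:/" "$3"
}

# Example on a copy of a VM configuration (VMID 100 is hypothetical):
#   cp /backup/etc/pve/qemu-server/100.conf /tmp/100.conf
#   rename_storage local-zfs tank-vm /tmp/100.conf
```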

Thank you so much @avw and @H4R0 for your help.