grub2-2.02-pve5 breaks on GPT/EFI disks (like ZFS roots)?

spudger

Hi.

We run our Proxmox hosts in ZFS RAID10 (on SSDs, not that that's especially relevant).

Typically, I install to a ZFS stripe from the Proxmox installer. This is nice and simple. Afterwards, however, only one disk has an EFI BIOS partition (and thus is the only bootable disk - SPOF anyone?), and the partitions are laid out differently (the ZFS member/overflow partitions have different start/end blocks).

This seems like an obvious improvement/feature request for the installer, but it's relatively simple to address manually.

I lay out a (currently unused) disk the way I want (e.g. with 3 partitions, first an EFI Boot partition #1, the ZFS member #2 and ZFS overflow #9). And then:
  • sgdisk clone the disk layout to any others, and regen the UUIDs
  • "zpool replace" the original disk(s) with the newly-frobbed ones (actually, the ZFS member partition, rather than the raw disk) to get them into the pool, wait for resilver to complete
  • "zpool labelclear" the (now) unused disks,
  • sgdisk clone/regen again (target the newly unused disk),
  • followed by "zpool attach" them - actually the zfs member partition, rather than the raw disk - back into the pool as mirrors.
And then finish up with a grub-install to all the disks, so any/all of them are bootable - ideally, test boot from each before we put VMs on it. Because it's tough to predict which disk may fail.
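
The steps above can be sketched roughly as follows. This is a dry-run outline under stated assumptions, not the exact commands used on these hosts: every function only prints what it would run, the pool name and template disk are placeholder values, and the helper names (plan_clone, plan_replace, plan_clear, plan_attach, plan_boot) are made up for illustration.

```shell
#!/bin/sh
# Dry-run sketch: every function only prints commands; nothing touches a disk.
# POOL and TEMPLATE are placeholder values; adjust them before using any output.
POOL=${POOL:-rpool}
TEMPLATE=${TEMPLATE:-/dev/sde}   # disk already laid out with partitions 1, 2 and 9

# Copy the template's GPT onto a target disk ($1), then give it fresh GUIDs.
plan_clone() {
    echo "sgdisk --replicate=$1 $TEMPLATE"
    echo "sgdisk --randomize-guids $1"
}

# Swap an old pool member ($1) for partition 2 of a freshly cloned disk ($2).
plan_replace() {
    plan_clone "$2"
    echo "zpool replace $POOL $1 ${2}2"
}

# Once the resilver has finished, clear the ZFS label on the freed device ($1).
plan_clear() {
    echo "zpool labelclear -f $1"
}

# Re-clone the freed disk ($2) and attach its partition 2 as a mirror of an
# existing member ($1).
plan_attach() {
    plan_clone "$2"
    echo "zpool attach $POOL $1 ${2}2"
}

# Make a disk ($1) bootable for BIOS boot.
plan_boot() {
    echo "grub-install --target=i386-pc $1"
}
```

For example, `plan_replace sda /dev/sdb` prints the two sgdisk commands followed by the zpool replace line, which can be reviewed before anything is run by hand.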

This has worked well for 18 months or so.

Today I updated a canary from the pve-no-subscription repo; because there were grub changes, I went to do a grub-install to all the disks.

I note that grub from the 2.02-pve5 packages gives the following error:

Code:
root@k003:/home/andy# grub-install --target=i386-pc /dev/sda

Installing for i386-pc platform.
grub-install: warning: Attempting to install GRUB to a disk with multiple partition labels.  This is not supported yet..
grub-install: error: filesystem `zfs' doesn't support blocklists.

Targeting the EFI BIOS boot partition directly gives:

Code:
root@k003:/home/andy# grub-install --target=i386-pc /dev/sda1
Installing for i386-pc platform.
grub-install: error: unable to identify a filesystem in hostdisk//dev/sda; safety check can't be performed.

Reverting to grub-2.02-pve4 restores the expected behaviour/makes grub consistent with how it behaves elsewhere:

Code:
root@k003:/home/andy# grub-install --target=i386-pc /dev/sda
Installing for i386-pc platform.
Installation finished. No error reported.
root@k003:/home/andy# grub-install --target=i386-pc /dev/sdb
Installing for i386-pc platform.
Installation finished. No error reported.
root@k003:/home/andy# grub-install --target=i386-pc /dev/sdc
Installing for i386-pc platform.
Installation finished. No error reported.
root@k003:/home/andy# grub-install --target=i386-pc /dev/sdd
Installing for i386-pc platform.
Installation finished. No error reported.


Would you like a bugzilla filed for this?
 
ZFS+UEFI is not supported by the (current) PVE installer (because of the SPOF issue you mention). which iso are you using to install those systems? Grub should support ZFS+EFI without problems, but you need to manually take care of updating the EFI partitions on all but one vdev.

could you post the output of "sgdisk -p YOURDISK" (once for each disk) and "sgdisk -iN YOURDISK" for each partition index as 'N' (e.g., "sgdisk -i1 /dev/sda")
 
ZFS+UEFI is not supported by the (current) PVE installer (because of the SPOF issue you mention). which iso are you using to install those systems?

Don't even remember. PXE-booting or mounting live media in the IPMI; probably an old 4.0 ISO from ~Oct 2015. Then dist-upgrading the system.

Grub should support ZFS+EFI without problems, but you need to manually take care of updating the EFI partitions on all but one vdev.

After updating, grub-2.02-pve5 breaks as reported.

All other versions I've upgraded to/through since Oct 2015 or so don't do this - and downgrading to grub*-pve4 fixes it :)

could you post the output of "sgdisk -p YOURDISK" (once for each disk) and "sgdisk -iN YOURDISK" for each partition index as 'N' (e.g., "sgdisk -i1 /dev/sda")

Sure.

Code:
root@k003:~# for unit in a b c d ; do  sgdisk -p /dev/sd${unit} ; done; for unit in a b c d ; do for part in 1 2  9; do printf "\ndisk: sd%s, part #: %s\n" $unit $part; sgdisk -i${part} /dev/sd${unit}; done; done
Disk /dev/sda: 3907029168 sectors, 1.8 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): 2958DAFB-6BDD-4B5E-8DA5-A12BF76739D2
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 3907029134
Partitions will be aligned on 8-sector boundaries
Total free space is 143 sectors (71.5 KiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1              34            2047   1007.0 KiB  EF02  BIOS boot partition
   2            2048      3907012607   1.8 TiB     BF01  zfs
   9      3907012608      3907028991   8.0 MiB     BF07 
Disk /dev/sdb: 3907029168 sectors, 1.8 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): FB5D4686-4122-4DC0-91D8-CFDFBB35CA05
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 3907029134
Partitions will be aligned on 8-sector boundaries
Total free space is 143 sectors (71.5 KiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1              34            2047   1007.0 KiB  EF02  BIOS boot partition
   2            2048      3907012607   1.8 TiB     BF01  zfs
   9      3907012608      3907028991   8.0 MiB     BF07 
Disk /dev/sdc: 3907029168 sectors, 1.8 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): 576B9F8C-6E4C-40E5-9FE0-A726C4F89E56
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 3907029134
Partitions will be aligned on 8-sector boundaries
Total free space is 143 sectors (71.5 KiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1              34            2047   1007.0 KiB  EF02  BIOS boot partition
   2            2048      3907012607   1.8 TiB     BF01  zfs
   9      3907012608      3907028991   8.0 MiB     BF07 
Disk /dev/sdd: 3907029168 sectors, 1.8 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): A3507072-0248-4859-9DD3-6489021515D7
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 3907029134
Partitions will be aligned on 8-sector boundaries
Total free space is 143 sectors (71.5 KiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1              34            2047   1007.0 KiB  EF02  BIOS boot partition
   2            2048      3907012607   1.8 TiB     BF01  zfs
   9      3907012608      3907028991   8.0 MiB     BF07 

disk: sda, part #: 1
Partition GUID code: 21686148-6449-6E6F-744E-656564454649 (BIOS boot partition)
Partition unique GUID: AFD6ED55-F055-4786-9A99-94CCF6E40707
First sector: 34 (at 17.0 KiB)
Last sector: 2047 (at 1023.5 KiB)
Partition size: 2014 sectors (1007.0 KiB)
Attribute flags: 0000000000000004
Partition name: 'BIOS boot partition'

disk: sda, part #: 2
Partition GUID code: 6A898CC3-1DD2-11B2-99A6-080020736631 (Solaris /usr & Mac ZFS)
Partition unique GUID: 32DA2CD5-CE30-4E7A-948D-CCDBDBAE70B4
First sector: 2048 (at 1024.0 KiB)
Last sector: 3907012607 (at 1.8 TiB)
Partition size: 3907010560 sectors (1.8 TiB)
Attribute flags: 0000000000000000
Partition name: 'zfs'

disk: sda, part #: 9
Partition GUID code: 6A945A3B-1DD2-11B2-99A6-080020736631 (Solaris Reserved 1)
Partition unique GUID: BF2D5863-B29E-4F35-B8BA-ED6B6A9A9CE2
First sector: 3907012608 (at 1.8 TiB)
Last sector: 3907028991 (at 1.8 TiB)
Partition size: 16384 sectors (8.0 MiB)
Attribute flags: 0000000000000000
Partition name: ''

disk: sdb, part #: 1
Partition GUID code: 21686148-6449-6E6F-744E-656564454649 (BIOS boot partition)
Partition unique GUID: 4A2FFB15-558A-4B5A-A9E9-EA017B927A6B
First sector: 34 (at 17.0 KiB)
Last sector: 2047 (at 1023.5 KiB)
Partition size: 2014 sectors (1007.0 KiB)
Attribute flags: 0000000000000004
Partition name: 'BIOS boot partition'

disk: sdb, part #: 2
Partition GUID code: 6A898CC3-1DD2-11B2-99A6-080020736631 (Solaris /usr & Mac ZFS)
Partition unique GUID: 24D2D847-E5C7-4F06-9FA9-F8D08E2BD1A3
First sector: 2048 (at 1024.0 KiB)
Last sector: 3907012607 (at 1.8 TiB)
Partition size: 3907010560 sectors (1.8 TiB)
Attribute flags: 0000000000000000
Partition name: 'zfs'

disk: sdb, part #: 9
Partition GUID code: 6A945A3B-1DD2-11B2-99A6-080020736631 (Solaris Reserved 1)
Partition unique GUID: 740D9B97-98A8-4C91-BA7D-71B3211ED691
First sector: 3907012608 (at 1.8 TiB)
Last sector: 3907028991 (at 1.8 TiB)
Partition size: 16384 sectors (8.0 MiB)
Attribute flags: 0000000000000000
Partition name: ''

disk: sdc, part #: 1
Partition GUID code: 21686148-6449-6E6F-744E-656564454649 (BIOS boot partition)
Partition unique GUID: 1E32C31D-C440-4328-A8D5-62677D72FB90
First sector: 34 (at 17.0 KiB)
Last sector: 2047 (at 1023.5 KiB)
Partition size: 2014 sectors (1007.0 KiB)
Attribute flags: 0000000000000004
Partition name: 'BIOS boot partition'

disk: sdc, part #: 2
Partition GUID code: 6A898CC3-1DD2-11B2-99A6-080020736631 (Solaris /usr & Mac ZFS)
Partition unique GUID: E59F564C-E22C-4342-BED7-45D93F6C25C1
First sector: 2048 (at 1024.0 KiB)
Last sector: 3907012607 (at 1.8 TiB)
Partition size: 3907010560 sectors (1.8 TiB)
Attribute flags: 0000000000000000
Partition name: 'zfs'

disk: sdc, part #: 9
Partition GUID code: 6A945A3B-1DD2-11B2-99A6-080020736631 (Solaris Reserved 1)
Partition unique GUID: AE0CECD3-6283-42B1-978B-774984D87938
First sector: 3907012608 (at 1.8 TiB)
Last sector: 3907028991 (at 1.8 TiB)
Partition size: 16384 sectors (8.0 MiB)
Attribute flags: 0000000000000000
Partition name: ''

disk: sdd, part #: 1
Partition GUID code: 21686148-6449-6E6F-744E-656564454649 (BIOS boot partition)
Partition unique GUID: 9F066261-EE43-4D36-AB7B-6697EFA1F420
First sector: 34 (at 17.0 KiB)
Last sector: 2047 (at 1023.5 KiB)
Partition size: 2014 sectors (1007.0 KiB)
Attribute flags: 0000000000000004
Partition name: 'BIOS boot partition'

disk: sdd, part #: 2
Partition GUID code: 6A898CC3-1DD2-11B2-99A6-080020736631 (Solaris /usr & Mac ZFS)
Partition unique GUID: 79F6129E-0F69-4D3B-A12C-F94B48BAF55B
First sector: 2048 (at 1024.0 KiB)
Last sector: 3907012607 (at 1.8 TiB)
Partition size: 3907010560 sectors (1.8 TiB)
Attribute flags: 0000000000000000
Partition name: 'zfs'

disk: sdd, part #: 9
Partition GUID code: 6A945A3B-1DD2-11B2-99A6-080020736631 (Solaris Reserved 1)
Partition unique GUID: C5F1ABE4-E971-47D7-B0E2-EC8DA08C5C2C
First sector: 3907012608 (at 1.8 TiB)
Last sector: 3907028991 (at 1.8 TiB)
Partition size: 16384 sectors (8.0 MiB)
Attribute flags: 0000000000000000
Partition name: ''
 
missed this in the first post - you are not using UEFI boot, but just GPT formatted disks. I'll look into the Grub source tomorrow and see where these error messages come from; the sgdisk output looks okay at first glance.
 
could you post the output of "grub-install --verbose --target=i386-pc /dev/sda" (or one of the other disks) with the working and non-working grub versions?
 
Well, huh. I ran grub from pve4 with --verbose first (as it was already installed) and tee'd to a log. Attached as "grub-pve4.txt".

Then I upgraded grub* to pve5 (via apt-get dist-upgrade) and re-ran the command. To my surprise, it worked. Attached as "grub-pve5.txt".

A little more investigation shows that 50% of the drives (sdb, sdc) work with grub-install, and the other 50% (sda, sdd) fail.

wtf....

Code:
root@k003:/home/andy# for unit in a b c d ; do printf "\ndrive: sd%s\n" $unit; grub-install --target=i386-pc /dev/sd${unit}; done

drive: sda
Installing for i386-pc platform.
grub-install: warning: Attempting to install GRUB to a disk with multiple partition labels.  This is not supported yet..
grub-install: error: filesystem `zfs' doesn't support blocklists.

drive: sdb
Installing for i386-pc platform.
Installation finished. No error reported.

drive: sdc
Installing for i386-pc platform.
Installation finished. No error reported.

drive: sdd
Installing for i386-pc platform.
grub-install: warning: Attempting to install GRUB to a disk with multiple partition labels.  This is not supported yet..
grub-install: error: filesystem `zfs' doesn't support blocklists.


I attach verbose output obtained from one of the failed grub pve5 runs as "grub-failed-pve5.txt".
 

Attachments

  • grub-pve4.txt (132.6 KB)
  • grub-pve5.txt (154.8 KB)
  • grub-failed-pve5.txt (175.9 KB)
the actual error / cause is the first message: "grub-install: warning: Attempting to install GRUB to a disk with multiple partition labels. This is not supported yet.."

grub thinks your disk has multiple partition labels, cannot embed itself and tries to install to a FS using the deprecated blocklists mode, which does not work on ZFS.

could you post the output of "lsblk -o NAME,KNAME,MAJ:MIN,FSTYPE,LABEL,UUID,PARTTYPE,PARTLABEL,PARTUUID,PARTFLAGS,MOUNTPOINT /dev/sd?" (feel free to filter out any VM disks that might be included, not interested in those for now)
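
When looping over many disks, the warning/error pair described above can be told apart from other outcomes mechanically. A minimal sketch, assuming only the message texts quoted in this thread; the function name classify_grub_install and its result labels are made up for illustration.

```shell
#!/bin/sh
# Classify grub-install output read from stdin.
# Prints one of: embed-failed, error, ok, unknown.
classify_grub_install() {
    awk '
        /multiple partition labels/ { embed = 1 }   # embedding refused
        /error:/                    { err = 1 }     # any other reported error
        /Installation finished/     { ok = 1 }      # normal completion
        END {
            if (embed)    print "embed-failed"
            else if (err) print "error"
            else if (ok)  print "ok"
            else          print "unknown"
        }'
}
```

Usage would look like `grub-install --target=i386-pc /dev/sda 2>&1 | classify_grub_install`, run once per disk.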
 
Hi, I posted to https://forum.proxmox.com/threads/call-for-testing-updated-grub2-packages.30361/#post-155422 about this. Thought I'd plant more seeds here about the issue.

I think there's a problem with the update. I gathered that the pve5 update dropped patches for zfs in favour of upstream support. I haven't gone through the upstream changes. But it's not a trivial jump, the update is from ZoL version to debian sid maintained packages.

Please check out my other post. As I said reverting to pve4 solves the problem. Particularly interesting is that the pve5 grub ultimately boots after about 40 minutes without printing anything about grub such as "GRUB booting..." or whatever.
 
Hi, I posted to https://forum.proxmox.com/threads/call-for-testing-updated-grub2-packages.30361/#post-155422 about this. Thought I'd plant more seeds here about the issue.

I think there's a problem with the update. I gathered that the pve5 update dropped patches for zfs in favour of upstream support. I haven't gone through the upstream changes. But it's not a trivial jump, the update is from ZoL version to debian sid maintained packages.

the ZFS support actually got better by dropping the old unmaintained patch set from ZoL - e.g., Grub now handles missing vdevs a lot better and will import the pool if possible instead of dropping you to a rescue shell where you can't do anything.

Please check out my other post. As I said reverting to pve4 solves the problem. Particularly interesting is that the pve5 grub ultimately boots after about 40 minutes without printing anything about grub such as "GRUB booting..." or whatever.

are you sure you didn't just miss the messages (they are often only displayed for a very short time, and I assume you have not been staring at the monitor for 40 minutes ;)

how many disks are in your zpool?

could you try setting the following in your /boot/grub/grub.cfg and see if you get any debug output when booting? (ideally, both with the "fast" old one and the "slow" updated grub, but you need to edit the grub.cfg after installing any grub packages, because they will regenerate the configuration)
Code:
set pager=1
set debug=all

edit: the content of /boot/grub/grub.cfg and output of "update-grub" might be interesting as well, just to check if there is anything strange going on there..
 
Here's my pool config:
Code:
    NAME                                                 STATE     READ WRITE CKSUM
    rpool                                                ONLINE       0     0     0
      mirror-0                                           ONLINE       0     0     0
        sda2                                             ONLINE       0     0     0
        sdd2                                             ONLINE       0     0     0
      mirror-1                                           ONLINE       0     0     0
        sdc                                              ONLINE       0     0     0
        sdb                                              ONLINE       0     0     0
    logs
      mirror-2                                           ONLINE       0     0     0
        sde1                                             ONLINE       0     0     0
        sdf1                                             ONLINE       0     0     0
    cache
      ata-KINGSTON_SKC400S37128G_50026B726602565E-part2  ONLINE       0     0     0
      ata-KINGSTON_SKC400S37128G_50026B7266025626-part2  ONLINE       0     0     0
    spares
      scsi-35000c50025f95c87                             AVAIL

So no, I did not stare for 40 minutes! ;) I can say that for the 2 minutes or so that I did, nothing at all changed from where the BIOS left off. So not even a screen blanking.

I did turn on pager and full debug. I spent what seemed like 10 minutes on the working pve4 version before I could get the GRUB menu! So I did not repeat the experiment on the pve5 package for lack of time. Maybe I could repeat the test with a bit less verbosity. I will look into this soon.

Here's my grub.cfg http://pastebin.ca/3748585
And the output of update-grub:
Code:
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.4.35-1-pve
Found initrd image: /boot/initrd.img-4.4.35-1-pve
/usr/sbin/grub-probe: error: unknown filesystem.
Found linux image: /boot/vmlinuz-4.4.21-1-pve
Found initrd image: /boot/initrd.img-4.4.21-1-pve
/usr/sbin/grub-probe: error: unknown filesystem.
Found memtest86+ image: /ROOT/pve-1@/boot/memtest86+.bin
Found memtest86+ multiboot image: /ROOT/pve-1@/boot/memtest86+_multiboot.bin
done

The "unknown filesystem" errors occur when grub-probe is run with --target=fs on /dev/sda2 and /dev/sdd2.

Hope this helps a bit, will post again ASAP.
 
I tried to reproduce this setup, and it works just fine here (without any delay when booting!). Did you run the update-grub with the old grub version or the new one?

Code:
root@host:~# zpool status
  pool: rpool
 state: ONLINE
  scan: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        rpool                                           ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            sda2                                        ONLINE       0     0     0
            sdd2                                        ONLINE       0     0     0
          mirror-1                                      ONLINE       0     0     0
            sdc                                         ONLINE       0     0     0
            sdb                                         ONLINE       0     0     0
        logs
          mirror-2                                      ONLINE       0     0     0
            scsi-0QEMU_QEMU_HARDDISK_drive-scsi5-part1  ONLINE       0     0     0
            scsi-0QEMU_QEMU_HARDDISK_drive-scsi6-part1  ONLINE       0     0     0
        cache
          scsi-0QEMU_QEMU_HARDDISK_drive-scsi5-part2    ONLINE       0     0     0
          scsi-0QEMU_QEMU_HARDDISK_drive-scsi6-part2    ONLINE       0     0     0
        spares
          scsi-0QEMU_QEMU_HARDDISK_drive-scsi4          AVAIL

errors: No known data errors
root@host:~# update-grub
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.4.35-1-pve
Found initrd image: /boot/initrd.img-4.4.35-1-pve
Found memtest86+ image: /ROOT/pve-1@/boot/memtest86+.bin
Found memtest86+ multiboot image: /ROOT/pve-1@/boot/memtest86+_multiboot.bin
done
root@host:~# lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0    32G  0 disk
├─sda1   8:1    0  1007K  0 part
├─sda2   8:2    0    32G  0 part
└─sda9   8:9    0     8M  0 part
sdb      8:16   0    32G  0 disk
├─sdb1   8:17   0    32G  0 part
└─sdb9   8:25   0     8M  0 part
sdc      8:32   0    32G  0 disk
├─sdc1   8:33   0    32G  0 part
└─sdc9   8:41   0     8M  0 part
sdd      8:48   0    32G  0 disk
├─sdd1   8:49   0  1007K  0 part
├─sdd2   8:50   0    32G  0 part
└─sdd9   8:57   0     8M  0 part
sde      8:64   0    32G  0 disk
├─sde1   8:65   0    32G  0 part
└─sde9   8:73   0     8M  0 part
sdf      8:80   0    12G  0 disk
├─sdf1   8:81   0     5G  0 part
└─sdf2   8:82   0     7G  0 part
sdg      8:96   0    12G  0 disk
├─sdg1   8:97   0     5G  0 part
└─sdg2   8:98   0     7G  0 part
sr0     11:0    1 521.8M  0 rom
zd0    230:0    0   3.9G  0 disk [SWAP]
 
Hi,

So we're experiencing this issue right now. We did a dist-upgrade, and grub went from grub-pve4 to grub-pve5. It spends about 30-50 minutes booting, where most of the time is just a blank screen without any blinking cursor or similar.

We tried a downgrade to grub-pve4;

Code:
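# Hold the kernel packages while the grub packages are downgraded
apt-mark hold pve-kernel-4.4.21-1-pve pve-kernel-4.4.49-1-pve
# Install the 2.02-pve4 version of every grub package
apt-get install grub-common=2.02-pve4 grub-efi-amd64-bin=2.02-pve4 grub-efi-ia32-bin=2.02-pve4 grub-pc=2.02-pve4 grub-pc-bin=2.02-pve4 grub2-common=2.02-pve4
# Release the kernel hold again
apt-mark unhold pve-kernel-4.4.21-1-pve pve-kernel-4.4.49-1-pve
# Hold every package whose dpkg listing mentions grub, so it is not upgraded again
dpkg --list|grep grub|awk '{ print $2 }'| xargs apt-mark hold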
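
To confirm that the downgrade took effect for every grub package, the installed versions can be compared against the pinned one. A small sketch, assuming input in "package version" form; the function name grub_version_mismatches is made up for illustration.

```shell
#!/bin/sh
# Read "package version" pairs on stdin and print every grub package
# whose version differs from the expected one ($1).
grub_version_mismatches() {
    awk -v want="$1" '$1 ~ /^grub/ && $2 != want { print $1 " " $2 }'
}
```

One way to feed it is `dpkg-query -W -f='${Package} ${Version}\n' | grub_version_mismatches 2.02-pve4`; empty output means all grub packages are at the expected version.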

This works just fine, however, 'update-grub' spits out some error messages;

Code:
root@kakko:~# update-grub
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.4.49-1-pve
Found initrd image: /boot/initrd.img-4.4.49-1-pve
/usr/sbin/grub-probe: error: unknown filesystem.
Found linux image: /boot/vmlinuz-4.4.21-1-pve
Found initrd image: /boot/initrd.img-4.4.21-1-pve
/usr/sbin/grub-probe: error: unknown filesystem.
Found memtest86+ image: /ROOT/pve-1@/boot/memtest86+.bin
Found memtest86+ multiboot image: /ROOT/pve-1@/boot/memtest86+_multiboot.bin
done

Probing / and /boot works fine;

Code:
root@kakko:~# grub-probe /
zfs
root@kakko:~# grub-probe /boot
zfs

I didn't understand why 'grub-probe' failed, so I started looking into 'update-grub'. It is just a wrapper for 'grub-mkconfig' (which, in turn, is just a shell script). To determine the filesystem, it populates the following variables;

Code:
GRUB_DEVICE="`${grub_probe} --target=device / | head -n1`"
GRUB_DEVICE_UUID="`${grub_probe} --device ${GRUB_DEVICE} --target=fs_uuid 2> /dev/null`" || true
GRUB_DEVICE_BOOT="`${grub_probe} --target=device /boot | head -n1`"
GRUB_DEVICE_BOOT_UUID="`${grub_probe} --device ${GRUB_DEVICE_BOOT} --target=fs_uuid 2> /dev/null`" || true
GRUB_FS="`${grub_probe} --device ${GRUB_DEVICE} --target=fs 2> /dev/null || echo unknown`"

They are executed in that order, and manually trying them;

Code:
root@kakko:~# grub_probe="/usr/sbin/grub-probe"
root@kakko:~# GRUB_DEVICE="`${grub_probe} --target=device / | head -n1`"
root@kakko:~# echo "GRUB_DEVICE: $GRUB_DEVICE"
GRUB_DEVICE: /dev/sdb2
root@kakko:~# ${grub_probe} --device ${GRUB_DEVICE} --target=fs_uuid
/usr/sbin/grub-probe: error: unknown filesystem.
root@kakko:~# grub-probe --device /dev/sdb2
grub-probe: error: unknown filesystem.

So the fs_uuid probe of the root device is the step that fails. On a working installation (running grub-pve4);

Code:
root@hagewashi:~# grub_probe="/usr/sbin/grub-probe"
root@hagewashi:~# GRUB_DEVICE="`${grub_probe} --target=device / | head -n1`"
root@hagewashi:~# echo "GRUB_DEVICE: $GRUB_DEVICE"
GRUB_DEVICE: /dev/sdb2
root@hagewashi:~# ${grub_probe} --device ${GRUB_DEVICE} --target=fs_uuid
236edab57a1bd1f8
root@hagewashi:~# grub-probe --device /dev/sdb2
zfs

In its current state, I'm not tempted to reboot this machine, as I fear that it will end up in a grub console or in a state where it won't boot.

Any takers on how to solve this? Please let me know what debug-information is needed in order to troubleshoot this further.
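
To see whether the failure is limited to some pool members, the manual check above can be repeated per device. A minimal sketch, assuming grub-probe lives at /usr/sbin/grub-probe as in the transcripts; the function name probe_devices and the GRUB_PROBE override are made up for illustration.

```shell
#!/bin/sh
# Probe each device with grub-probe and report what it detects.
# GRUB_PROBE can be overridden, e.g. to point at a different grub build.
GRUB_PROBE=${GRUB_PROBE:-/usr/sbin/grub-probe}

probe_devices() {
    for dev in "$@"; do
        # grub-probe exits non-zero when it cannot identify the filesystem
        if fs=$("$GRUB_PROBE" --device "$dev" --target=fs 2>/dev/null); then
            echo "$dev: $fs"
        else
            echo "$dev: unknown filesystem"
        fi
    done
}
```

For the raidz2 pool here, that would be something like `probe_devices /dev/sd[b-g]2`.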
 
Some more info of the system.

Code:
root@kakko:~# zpool status
  pool: rpool
 state: ONLINE
  scan: resilvered 25.4G in 2h28m with 0 errors on Wed Apr  5 15:40:17 2017
config:

    NAME        STATE     READ WRITE CKSUM
    rpool       ONLINE       0     0     0
      raidz2-0  ONLINE       0     0     0
        sdb2    ONLINE       0     0     0
        sde2    ONLINE       0     0     0
        sdd2    ONLINE       0     0     0
        sdf2    ONLINE       0     0     0
        sdg2    ONLINE       0     0     0
        sdc2    ONLINE       0     0     0

errors: No known data errors


root@kakko:~# for disk in `ls -1 /dev/sd[b-z]`; do grub-install $disk ; done
Installing for i386-pc platform.
Installation finished. No error reported.
Installing for i386-pc platform.
Installation finished. No error reported.
Installing for i386-pc platform.
Installation finished. No error reported.
Installing for i386-pc platform.
Installation finished. No error reported.
Installing for i386-pc platform.
Installation finished. No error reported.
Installing for i386-pc platform.
Installation finished. No error reported.



root@kakko:~# for disk in `ls -1 /dev/sd[b-z]`; do parted $disk print ; done
Model: ATA WDC WD2500HHTZ-0 (scsi)
Disk /dev/sdb: 250GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start   End     Size    File system  Name  Flags
 1      17.4kB  1049kB  1031kB                     bios_grub
 2      1049kB  250GB   250GB   zfs          zfs
 9      250GB   250GB   8389kB

Model: ATA WDC WD2500HHTZ-0 (scsi)
Disk /dev/sdc: 250GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start   End     Size    File system  Name  Flags
 1      17.4kB  1049kB  1031kB                     bios_grub
 2      1049kB  250GB   250GB   zfs          zfs
 9      250GB   250GB   8389kB

Model: ATA WDC WD2500HHTZ-0 (scsi)
Disk /dev/sdd: 250GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start   End     Size    File system  Name  Flags
 1      17.4kB  1049kB  1031kB                     bios_grub
 2      1049kB  250GB   250GB   zfs          zfs
 9      250GB   250GB   8389kB

Model: ATA WDC WD2500HHTZ-0 (scsi)
Disk /dev/sde: 250GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start   End     Size    File system  Name  Flags
 1      17.4kB  1049kB  1031kB                     bios_grub
 2      1049kB  250GB   250GB   zfs          zfs
 9      250GB   250GB   8389kB

Model: ATA WDC WD2500HHTZ-0 (scsi)
Disk /dev/sdf: 250GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start   End     Size    File system  Name  Flags
 1      17.4kB  1049kB  1031kB                     bios_grub
 2      1049kB  250GB   250GB   zfs          zfs
 9      250GB   250GB   8389kB

Model: ATA WDC WD2500HHTZ-0 (scsi)
Disk /dev/sdg: 250GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start   End     Size    File system  Name  Flags
 1      17.4kB  1049kB  1031kB                     bios_grub
 2      1049kB  250GB   250GB   zfs          zfs
 9      250GB   250GB   8389kB





root@kakko:~# for disk in `ls -1 /dev/sd[b-z]`; do sgdisk -p $disk ; done
Disk /dev/sdb: 488397168 sectors, 232.9 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 03BB7931-F754-43CC-8244-4100F90593A9
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 488397134
Partitions will be aligned on 2-sector boundaries
Total free space is 0 sectors (0 bytes)

Number  Start (sector)    End (sector)  Size       Code  Name
   1              34            2047   1007.0 KiB  EF02  
   2            2048       488380749   232.9 GiB   BF01  zfs
   9       488380750       488397134   8.0 MiB     BF07  
Disk /dev/sdc: 488397168 sectors, 232.9 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 1128F15C-0769-4FBB-96CC-7731741B855B
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 488397134
Partitions will be aligned on 2-sector boundaries
Total free space is 0 sectors (0 bytes)

Number  Start (sector)    End (sector)  Size       Code  Name
   1              34            2047   1007.0 KiB  EF02  
   2            2048       488380749   232.9 GiB   BF01  zfs
   9       488380750       488397134   8.0 MiB     BF07  
Disk /dev/sdd: 488397168 sectors, 232.9 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 7E69E085-5FE7-405E-86EC-4BA5E288A74A
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 488397134
Partitions will be aligned on 2-sector boundaries
Total free space is 0 sectors (0 bytes)

Number  Start (sector)    End (sector)  Size       Code  Name
   1              34            2047   1007.0 KiB  EF02  
   2            2048       488380749   232.9 GiB   BF01  zfs
   9       488380750       488397134   8.0 MiB     BF07  
Disk /dev/sde: 488397168 sectors, 232.9 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): E3C67DAC-40F6-492A-A2B8-19064038C5B5
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 488397134
Partitions will be aligned on 2-sector boundaries
Total free space is 0 sectors (0 bytes)

Number  Start (sector)    End (sector)  Size       Code  Name
   1              34            2047   1007.0 KiB  EF02  
   2            2048       488380749   232.9 GiB   BF01  zfs
   9       488380750       488397134   8.0 MiB     BF07  
Disk /dev/sdf: 488397168 sectors, 232.9 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): B522B266-68D5-4512-9AA2-FAF1326A3D01
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 488397134
Partitions will be aligned on 2-sector boundaries
Total free space is 0 sectors (0 bytes)

Number  Start (sector)    End (sector)  Size       Code  Name
   1              34            2047   1007.0 KiB  EF02  
   2            2048       488380749   232.9 GiB   BF01  zfs
   9       488380750       488397134   8.0 MiB     BF07  
Disk /dev/sdg: 488397168 sectors, 232.9 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 5A5FA03A-F20A-4D3C-A4F9-581014BE9D22
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 488397134
Partitions will be aligned on 2-sector boundaries
Total free space is 0 sectors (0 bytes)

Number  Start (sector)    End (sector)  Size       Code  Name
   1              34            2047   1007.0 KiB  EF02  
   2            2048       488380749   232.9 GiB   BF01  zfs
   9       488380750       488397134   8.0 MiB     BF07  
root@kakko:~#
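
For listings like the one above, a small shell sketch can confirm that every disk carries a BIOS boot partition (type code EF02). The function name `count_ef02` is hypothetical. It only parses text in the `sgdisk -p` format shown above and assumes the type code is the sixth whitespace-separated column.

```shell
# Hypothetical helper: reads `sgdisk -p` output on stdin and prints,
# for each "Disk /dev/...:" header, how many EF02 partitions follow it.
count_ef02() {
    awk '
        /^Disk \/dev\// {
            if (disk != "") print disk, n
            disk = $2; sub(/:$/, "", disk); n = 0
            next
        }
        $1 ~ /^[0-9]+$/ && $6 == "EF02" { n++ }
        END { if (disk != "") print disk, n }
    '
}

# Example usage (assumes the disks are /dev/sdb .. /dev/sdg):
# for d in /dev/sd[b-g]; do sgdisk -p "$d"; done | count_ef02
```

A disk reported with a count of 0 would have no BIOS boot partition for grub-install to use.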
 
could you give the output of "blkid /dev/sdb2" and "udevadm info /dev/sdb2" ? thanks
 
Code:
root@kakko:~# blkid /dev/sdb2
/dev/sdb2: LABEL="rpool" UUID="6641772703085228546" UUID_SUB="1864975891195802099" TYPE="zfs_member" PARTLABEL="zfs" PARTUUID="d4759e21-48a4-4e50-9bf4-b8def569c184"



root@kakko:~# udevadm info /dev/sdb2
P: /devices/pci0000:00/0000:00:03.0/0000:01:00.0/host4/port-4:0/end_device-4:0/target4:0:0/4:0:0:0/block/sdb/sdb2
N: sdb2
S: disk/by-id/ata-WDC_WD2500HHTZ-04N21V0_WD-WX11E83UZ597-part2
S: disk/by-id/wwn-0x50014ee6596d7dfa-part2
S: disk/by-partlabel/zfs
S: disk/by-partuuid/d4759e21-48a4-4e50-9bf4-b8def569c184
S: disk/by-path/pci-0000:01:00.0-sas-0x1221000000000000-lun-0-part2
E: DEVLINKS=/dev/disk/by-id/ata-WDC_WD2500HHTZ-04N21V0_WD-WX11E83UZ597-part2 /dev/disk/by-id/wwn-0x50014ee6596d7dfa-part2 /dev/disk/by-partlabel/zfs /dev/disk/by-partuuid/d4759e21-48a4-4e50-9bf4-b8def569c184 /dev/disk/by-path/pci-0000:01:00.0-sas-0x1221000000000000-lun-0-part2
E: DEVNAME=/dev/sdb2
E: DEVPATH=/devices/pci0000:00/0000:00:03.0/0000:01:00.0/host4/port-4:0/end_device-4:0/target4:0:0/4:0:0:0/block/sdb/sdb2
E: DEVTYPE=partition
E: ID_ATA=1
E: ID_ATA_DOWNLOAD_MICROCODE=1
E: ID_ATA_FEATURE_SET_APM=1
E: ID_ATA_FEATURE_SET_APM_CURRENT_VALUE=254
E: ID_ATA_FEATURE_SET_APM_ENABLED=1
E: ID_ATA_FEATURE_SET_HPA=1
E: ID_ATA_FEATURE_SET_HPA_ENABLED=1
E: ID_ATA_FEATURE_SET_PM=1
E: ID_ATA_FEATURE_SET_PM_ENABLED=1
E: ID_ATA_FEATURE_SET_PUIS=1
E: ID_ATA_FEATURE_SET_PUIS_ENABLED=0
E: ID_ATA_FEATURE_SET_SECURITY=1
E: ID_ATA_FEATURE_SET_SECURITY_ENABLED=0
E: ID_ATA_FEATURE_SET_SECURITY_ENHANCED_ERASE_UNIT_MIN=24
E: ID_ATA_FEATURE_SET_SECURITY_ERASE_UNIT_MIN=24
E: ID_ATA_FEATURE_SET_SMART=1
E: ID_ATA_FEATURE_SET_SMART_ENABLED=1
E: ID_ATA_ROTATION_RATE_RPM=10000
E: ID_ATA_SATA=1
E: ID_ATA_SATA_SIGNAL_RATE_GEN1=1
E: ID_ATA_SATA_SIGNAL_RATE_GEN2=1
E: ID_ATA_WRITE_CACHE=1
E: ID_ATA_WRITE_CACHE_ENABLED=1
E: ID_BUS=ata
E: ID_FS_LABEL=rpool
E: ID_FS_LABEL_ENC=rpool
E: ID_FS_TYPE=zfs_member
E: ID_FS_USAGE=raid
E: ID_FS_UUID=6641772703085228546
E: ID_FS_UUID_ENC=6641772703085228546
E: ID_FS_UUID_SUB=1864975891195802099
E: ID_FS_UUID_SUB_ENC=1864975891195802099
E: ID_FS_VERSION=5000
E: ID_MODEL=WDC_WD2500HHTZ-04N21V0
E: ID_MODEL_ENC=WDC\x20WD2500HHTZ-04N21V0\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20
E: ID_PART_ENTRY_DISK=8:16
E: ID_PART_ENTRY_NAME=zfs
E: ID_PART_ENTRY_NUMBER=2
E: ID_PART_ENTRY_OFFSET=2048
E: ID_PART_ENTRY_SCHEME=gpt
E: ID_PART_ENTRY_SIZE=488378702
E: ID_PART_ENTRY_TYPE=6a898cc3-1dd2-11b2-99a6-080020736631
E: ID_PART_ENTRY_UUID=d4759e21-48a4-4e50-9bf4-b8def569c184
E: ID_PART_TABLE_TYPE=gpt
E: ID_PART_TABLE_UUID=03bb7931-f754-43cc-8244-4100f90593a9
E: ID_PATH=pci-0000:01:00.0-sas-0x1221000000000000-lun-0
E: ID_PATH_TAG=pci-0000_01_00_0-sas-0x1221000000000000-lun-0
E: ID_REVISION=04.06A00
E: ID_SERIAL=WDC_WD2500HHTZ-04N21V0_WD-WX11E83UZ597
E: ID_SERIAL_SHORT=WD-WX11E83UZ597
E: ID_TYPE=disk
E: ID_WWN=0x50014ee6596d7dfa
E: ID_WWN_WITH_EXTENSION=0x50014ee6596d7dfa
E: MAJOR=8
E: MINOR=18
E: SUBSYSTEM=block
E: TAGS=:systemd:
E: USEC_INITIALIZED=607019
 
Couldn't paste the output of 'grub-probe -vv --device /dev/sdb2' here on the forums (too big/long text), so I made it available here: http://files.jocke.no/b/20170406-kakko_grub-probe.debug.txt

I looked at the lines after 'grub-core/kern/fs.c:56: Detecting zfs...', and thought maybe all the "read or write outside of disk" messages were relevant. But they are present in the output from the working host as well, so then I don't know...
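
One way to narrow this down might be to cut both debug logs down to the part that starts at the ZFS detection and diff the results. A rough sketch, assuming the grub-probe output from each host has already been saved to a file; the function name and the file names are hypothetical:

```shell
# Hypothetical helper: print only the part of a grub-probe debug log
# from the first "Detecting zfs..." line to the end of the file.
zfs_section() {
    sed -n '/Detecting zfs\.\.\./,$p' "$1"
}

# Example usage (file names are placeholders):
# zfs_section working-host.txt > working.zfs
# zfs_section kakko.txt        > kakko.zfs
# diff working.zfs kakko.zfs
```

Messages that appear on both hosts, such as the "read or write outside of disk" lines, would then show up as common context rather than as differences.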
 
your udevadm output looks correct.

could you clarify with which version you got the linked traces? because according to the content, I would assume 2.02-pve4 (or older), but the final lines say "unknown filesystem", which you said occurs with the new version 2.02-pve5..
 
The linked traces were taken with 2.02-pve4 (which we downgraded to). This is the version where "unknown filesystem" happens. We downgraded because others had success with going back to 2.02-pve4 (in order to resolve the long boot time), but since we encountered the "unknown filesystem" error when doing a grub-update, we haven't rebooted (to see whether or not 2.02-pve4 solves the long boot time).

We can try to go back to 2.02-pve5 again, but I suspect that won't solve much, unless there is a way for us to help with debugging why 2.02-pve5 seems to break things...
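
To make sure which version a given trace belongs to, the installed grub packages can be checked before each run. A minimal sketch; the function name `version_mismatch` and the `grub*` package glob are assumptions, and the helper only filters "package version" lines fed to it on stdin:

```shell
# Hypothetical helper: reads "package version" pairs on stdin and prints
# every installed package whose version differs from the expected one.
# Lines without a version (packages known to dpkg but not installed)
# are skipped.
version_mismatch() {
    awk -v want="$1" 'NF == 2 && $2 != want { print $1, $2 }'
}

# Example usage (the package glob is an assumption):
# dpkg-query -W -f='${Package} ${Version}\n' 'grub*' | version_mismatch 2.02-pve5
```

No output would mean all listed grub packages are at the expected version.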
 
thanks for the clarification. it's very hard to get to the bottom of such issues without access to a system where the issue occurs..

  • when you boot with 2.02-pve5, does the hang occur before or after the grub menu is displayed?
  • does it eventually boot (and just take "forever"), or did you revert by booting from a live-CD or similar?
  • how was the system originally installed? if with the PVE installer, in which version?
 
