[Proxmox 3.4] Wrong partitioning by installer / Grub boot ZFS problem

onlime

Renowned Member
Aug 9, 2013
Hi there
We have now installed Proxmox VE 3.4 with ZFS RAID-1 on more than 6 servers with 2 disks each (attached to LSI 9207-8i HBAs), without any issues.
On several other servers with 4 disks and ZFS RAID-10 we are experiencing the same issue as described under http://pve.proxmox.com/wiki/ZFS#Grub_boot_ZFS_problem :


  • Symptoms: stuck at boot with a blinking prompt
  • Reason: with a ZFS RAID setup it can happen that your mainboard does not initialize all of your disks correctly, and Grub will wait for all RAID disk members - and fail. It can happen with more than 2 disks in a ZFS RAID configuration - we saw this on some boards with ZFS RAID-0/RAID-10

I have managed to boot the servers by changing the boot drive in the BIOS (after first trying to fix Grub2 via the Proxmox VE Installer's rescue mode...). With some luck, you get the correct drive and are able to boot.
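Rather than relying on luck, a quick check from a rescue or live shell shows which disks actually carry the Grub/EFI partitions (just a sketch; the /dev/sd[a-d] naming matches the layout shown below):

Code:
$ for d in /dev/sd[a-d]; do
    echo "== $d =="
    parted -s $d print | grep -E 'bios_grub|esp' || echo "   no Grub/EFI partitions"
  done

Only the disks that report bios_grub are candidates for the BIOS boot drive on a legacy (non-UEFI) boot.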
Here's the problem: the Proxmox VE 3.4 Installer does not seem to partition all disks correctly on a 4-disk (and possibly larger) ZFS RAID-10 setup:

Code:
$ parted /dev/sda print
Model: ATA INTEL SSDSC2BA80 (scsi)
Disk /dev/sda: 800GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 
Number  Start   End     Size    File system  Name                  Flags
 1      1049kB  2097kB  1049kB               Grub-Boot-Partition   bios_grub
 2      2097kB  136MB   134MB   fat32        EFI-System-Partition  boot, esp
 3      136MB   800GB   800GB   zfs          PVE-ZFS-Partition

$ parted /dev/sdb print
Model: ATA INTEL SSDSC2BA40 (scsi)
Disk /dev/sdb: 400GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 
Number  Start   End    Size    File system  Name  Flags
 1      1049kB  400GB  400GB   zfs          zfs
 9      400GB   400GB  8389kB

$ parted /dev/sdc print
Model: ATA INTEL SSDSC2BA40 (scsi)
Disk /dev/sdc: 400GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 
Number  Start   End    Size    File system  Name  Flags
 1      1049kB  400GB  400GB   zfs          zfs
 9      400GB   400GB  8389kB

$ parted /dev/sdd print
Model: ATA INTEL SSDSC2BA80 (scsi)
Disk /dev/sdd: 800GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 
Number  Start   End     Size    File system  Name                  Flags
 1      1049kB  2097kB  1049kB               Grub-Boot-Partition   bios_grub
 2      2097kB  136MB   134MB   fat32        EFI-System-Partition  boot, esp
 3      136MB   800GB   800GB   zfs          PVE-ZFS-Partition

In this example, /dev/sdb and /dev/sdc are missing the Grub-Boot and EFI partitions. Writing Grub to those disks will not succeed:

Code:
$ grub-install /dev/sda
Installing for i386-pc platform.
Installation finished. No error reported.

$ grub-install /dev/sdb
Installing for i386-pc platform.
grub-install: warning: this GPT partition label contains no BIOS Boot Partition; embedding won't be possible.
grub-install: error: filesystem 'zfs' doesn't support blocklists.

$ grub-install /dev/sdc
Installing for i386-pc platform.
grub-install: warning: this GPT partition label contains no BIOS Boot Partition; embedding won't be possible.
grub-install: error: filesystem 'zfs' doesn't support blocklists.

$ grub-install /dev/sdd
Installing for i386-pc platform.
Installation finished. No error reported.

This is a serious issue. You could run a server in production in this condition, but not having Grub installed on all drives is quite risky!
Please fix or explain why this happened. Thanks.
Best regards,
Philip
 
We only create boot partitions if it makes sense. For example, on RAID0 we only create it on the first disk.
 
We only create boot partitions if it makes sense.

Ok, so this is on purpose. I totally understand the reason, and writing Grub to a second disk in a stripe actually is unnecessary. Still, I would argue it's better to write Grub to all disks - it makes the boot process less error-prone.
It looks like there are already a couple of people struggling with boot problems and I am pretty sure some of them just stumbled upon the same thing. 1MB for a boot partition doesn't harm anybody, and probably everybody could even live with another wasted 134MB for the EFI partition.
Boot problems are among the most annoying things for any administrator, and with the added complexity of Grub2 and ZFS, recovering an unbootable system can be quite a challenge. If we could lower this risk even a tiny little bit, I would go for this solution.
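In the meantime, a possible manual workaround - untested on my side, so please treat it as a rough sketch - would be to repurpose the small reserved partition 9 that ZoL leaves at the end of the data-only disks as a BIOS boot partition and embed Grub there. As far as I know that 8 MB partition is unused by ZFS, but double-check on your own pool before trying this:

Code:
$ sgdisk -t 9:EF02 /dev/sdb    # retype the 8 MB reserved partition as "BIOS boot partition"
$ sgdisk -t 9:EF02 /dev/sdc
$ grub-install /dev/sdb        # embedding should now succeed instead of the blocklist error
$ grub-install /dev/sdc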

Besides, Dietmar: great job on the Proxmox VE 3.4 installer and the integrated ZoL! It has worked flawlessly on all systems so far. We are very happy that Proxmox now has ZFS on board.

Are the problematic drives directly connected to the motherboard's SATA ports?

As noted above, we are running our SSDs attached to an HBA, namely the following two models: LSI/Avago 9207-8i and (the older) 9211-8i. We have the IT (initiator target) firmware installed on all of them.
In BIOS, the drives will show up e.g. like this:


  • PCI SCSI: #0600 ID0A LUN0 SEAGATE
  • PCI SCSI: #0600 ID09 LUN0 SEAGATE

At this point, there is no way to tell which drive is which, as ID09/ID0A/ID0B/ID0C/... is pretty cryptic and ID09 may not be placed before ID0A in your enclosure slots.
If you select the wrong drive (and the whole boot problem even happened to me without changing the order in the BIOS!), there is just a 50% chance of getting the correct one in a 4-disk RAID-10 setup.
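What helps a bit in practice (again just a sketch, run from a booted or rescue system): the symlinks under /dev/disk/by-id usually encode model and serial number, so you can at least match the sticker on each physical drive to the sdX device that carries the boot partitions:

Code:
$ ls -l /dev/disk/by-id/ | grep -v part    # entries look like ata-<model>_<serial> -> ../../sdX
$ ls -l /dev/disk/by-path/                 # may also hint at the controller port/slot, depending on the driver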
I would definitely prefer having Grub on all drives installed.

Best regards,
Philip
 
Ok, I wasn't sure - the way you wrote your original post, I read it as: the servers with the HBA were working normally, and then you had 4 additional servers that possibly didn't have that adapter, or had the RAID array connected straight to the motherboard. I had a similar problem that I resolved, but it was specific to drives directly connected to the motherboard rather than through a separate controller.
 
Ok, so this is on purpose. I totally understand the reason, and writing Grub to a second disk in a stripe actually is unnecessary. Still, I would argue it's better to write Grub to all disks - it makes the boot process less error-prone.

Why? It does not help if you add a boot partition to the other drives, because the system cannot boot anyway (in the above case).
 
Why? It does not help if you add a boot partition to the other drives, because the system cannot boot anyway (in the above case).

I am talking about booting a sane system without any failed disks but with changed (reordered) boot drives in BIOS.
If a disk has failed, I get your point. But if all disks are running, Grub should be able to boot from any of them. I thought Grub goes by disk UUID, and BIOS boot ordering should not have any influence on that.
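For what it's worth, you can check what the generated config actually searches for (a sketch; the path assumes the stock grub-mkconfig layout, and the fs_uuid probe may or may not work for ZFS on this Grub version):

Code:
$ grep -n 'search ' /boot/grub/grub.cfg | head -n 3
$ grub-probe --target=fs_uuid /boot/grub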

Best regards,
Philip
 
