Root ZFS on LUKS | Last questions (hopefully :) | Delaying the zfs import during boot

Anotheruser

Member
Sep 21, 2022
After a couple more days of troubleshooting I think I am finally quite close to getting this fully working. As soon as that is the case, I am planning to write a full guide for anyone who wants to do a similar setup in the future, since I couldn't find any real guides for Proxmox/Debian-based ZFS-on-LUKS root setups. ZFS native encryption is not feasible for a lot of people, since it breaks the ability to migrate VMs and also exposes all the VM IDs.

Setup:
  • Two disks (sda with luks1 and sdb with luks2)
  • ZFS RAID1 on two LUKS-mapped devices
  • UEFI / systemd-boot
  • Fresh Proxmox, originally installed via the official ISO

What I did:
  1. Normal Proxmox ZFS RAID1 UEFI install via the official ISO
  2. Installed cryptsetup, added dm-crypt to /etc/initramfs-tools/modules and updated the initramfs
  3. Live-booted Debian and deleted sda3 on the first disk with fdisk, so only the EFI and boot partitions were left
  4. Recreated sdX3 as a LUKS volume, opened it and used zpool replace to resilver the ZFS pool onto the LUKS-mapped device
  5. Added "luks1 UUID=<uuid of sda3, found via blkid> none luks,discard" to /etc/crypttab
  6. Changed /etc/kernel/cmdline to "cryptdevice=/dev/sda3:luks1 root=ZFS=rpool/ROOT/pve-1 resume=/dev/mapper/luks1 rw boot=zfs" (a command sketch for steps 4-6 follows right after this list)
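
For reference, here is roughly what steps 4 to 6 look like as shell commands. Treat it as a sketch: the device names, the pool/dataset name and the <placeholders> come from my setup above and need to be adjusted (proxmox-boot-tool was called pve-efiboot-tool on older releases):

Code:
# Step 4: recreate the freed partition as a LUKS volume, open it and resilver onto it
cryptsetup luksFormat /dev/sda3
cryptsetup open /dev/sda3 luks1
# <old-vdev> is whatever name/GUID "zpool status rpool" shows for the removed partition
zpool replace rpool <old-vdev> /dev/mapper/luks1

# Step 5: crypttab entry, using the UUID of the raw partition
blkid /dev/sda3
echo 'luks1 UUID=<uuid-of-sda3> none luks,discard' >> /etc/crypttab

# Step 6: kernel command line for systemd-boot, then refresh boot entries and initramfs
echo 'cryptdevice=/dev/sda3:luks1 root=ZFS=rpool/ROOT/pve-1 resume=/dev/mapper/luks1 rw boot=zfs' > /etc/kernel/cmdline
proxmox-boot-tool refresh
update-initramfs -u -k all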
So far this is mostly working (in this state only one disk is running on LUKS; the second one hasn't been touched yet). The pool is in a healthy state, can be written to, and I am prompted to enter the passphrase for luks1 during boot.

Problem:
As soon as I reboot the server, the ZFS pool switches to DEGRADED with the error "One or more devices could not be used because the label is missing or invalid. Sufficient replicas exist for the pool to continue functioning in a degraded state." The LUKS device is only listed as a long number (presumably its GUID) and UNAVAIL, but if I inspect the LUKS-mapped device, it is there and fully working.
A single zpool clear rpool fixes this, and the pool is shown as healthy and fully working again until the next reboot.
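
For completeness, the check and the temporary workaround look roughly like this (rpool is the default Proxmox pool name):

Code:
# pool state after a reboot: the LUKS vdev is listed as UNAVAIL
zpool status rpool
# the mapped device itself is present and open
ls -l /dev/mapper/luks1
cryptsetup status luks1
# workaround that holds until the next reboot
zpool clear rpool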

Problem - My Theory:
ZFS apparently tries to import the pool while the mapped device is not ready yet.
To inspect that, I ran systemd-analyze critical-chain --fuzz=1m zfs.target, which produced the following output (see Screenshot1).
From my understanding that means zfs-import.target is run before systemd-cryptsetup, correct?
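
One way to double-check that ordering besides the screenshot is to ask systemd directly. This assumes the stock OpenZFS unit names zfs-import-cache.service / zfs-import-scan.service are in use:

Code:
# what are the import units ordered after? cryptsetup.target should ideally show up here
systemctl show -p After zfs-import-cache.service
systemctl show -p After zfs-import-scan.service
# the full ordering chain up to zfs.target
systemd-analyze critical-chain --fuzz=1m zfs.target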

I already tried following this guide https://bbs.archlinux.org/viewtopic.php?id=184043 to delay the import by adding After=cryptsetup.target and After=systemd-cryptsetup@luks1.service to a bunch of the /etc/systemd/system/zfs* related services, but that didn't change anything.
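
In case someone wants to reproduce this: the usual way to express that ordering without editing the packaged unit files is a drop-in via systemctl edit. A minimal sketch, assuming the import is handled by zfs-import-cache.service (it may be zfs-import-scan.service on some setups), with Requires= being optional:

Code:
# creates /etc/systemd/system/zfs-import-cache.service.d/override.conf
systemctl edit zfs-import-cache.service

# drop-in contents:
[Unit]
After=cryptsetup.target systemd-cryptsetup@luks1.service
Requires=systemd-cryptsetup@luks1.service

# then reload and reboot to test
systemctl daemon-reload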

I also tested just removing the second disk, so that only the LUKS one is left, but this just resulted in being stuck indefinitely at the "EFI stub: Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path" screen, so migrating the second disk to LUKS at this point is not possible and would only result in the same EFI error screen.

Any help and tips are really appreciated. I have tried basically everything I know / could find, and I think I am pretty close to finally getting it working :)

PS: for the devs, I know this has probably already been brought up, but implementing native LUKS encryption into the installer would be huge.
 

Attachments

  • Screenshot1.png (output of systemd-analyze critical-chain, 57.1 KB)
  • Screenshot2.png (8 KB)
Did it work for you?
I have had it working with GRUB for years, but a new install with UEFI boot and the initramfs option in crypttab still tries to mount ZFS before decrypting the drive.

I currently have a mirror (one encrypted and one unencrypted drive). It only boots when the unencrypted device is present.
During boot I am asked for the password, but after booting the encrypted drive is shown as missing, even though it is decrypted. A "zpool online" for the device reintegrates it into the pool, but booting without the unencrypted drive still does not work.

How can I delay the systemd zfs-mount/zfs-import until after decryption (as it worked automatically in GRUB with the initramfs option)?
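
One thing that may be worth checking (just a suggestion; the exact paths inside the initramfs differ between cryptsetup versions) is whether the crypttab entry and the cryptsetup tooling actually ended up inside the initramfs that systemd-boot loads, since that is what lets the disk be unlocked before the pool import:

Code:
# the crypttab line needs the initramfs option so cryptsetup-initramfs picks it up, e.g.:
# luks1 UUID=<uuid> none luks,discard,initramfs
update-initramfs -u -k all
# verify the cryptsetup bits are inside the generated initramfs
lsinitramfs /boot/initrd.img-$(uname -r) | grep -i crypt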
 
