[SOLVED] 'LoaderSystemToken' EFI variable: Invalid argument - HP t730 thin client, ZFS & UEFI...

virtManager
Hi all,

I'm very frustrated because I bought myself an HP t730 thin client, and I think I've got the latest BIOS version: "L43, 1.15" (but it's still old; I don't think they're updating it anymore). I've tried to install Proxmox using both the latest ISO and the one before. I want to install Proxmox VE on a ZFS partition of the internal 32 GB SSD, but it fails every single time with the messages "Installation failed!" and "Proxmox VE could not be installed". A popup window says: "bootloader setup errors: - unable to init ESP and install proxmox-boot loader on '/dev/sda2'". I also tried installing to a USB flash drive - same message there, except it referenced /dev/sdb2... I've tried installing around 10-12 times now... I don't understand the error message. The log file ends with:

Code:
# mount -n --bind /dev /rpool/ROOT/pve-1/dev
# chroot /rpool/ROOT/pve-1 /usr/sbin/update-initramfs -c -k 5.13.19-2-pve
update-initramfs: Generating /boot/initrd.img-5.13.19-2-pve
Running hook script 'zz-proxmox-boot'..
Re-executing '/etc/kernel/postinst.d/zz-proxmox-boot' in new private mount namespace..
No /etc/kernel/proxmox-boot-uuids found, skipping ESP sync.
# chroot /rpool/ROOT/pve-1 proxmox-boot-tool init /dev/sda2
Re-executing '/sbin/proxmox-boot-tool' in new private mount namespace..
UUID="DFF0-DE8F" SIZE="536870912" FSTYPE="vfat" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sda" MOUNTPOINT=""
Mounting '/dev/sda2' on '/var/tmp/espmounts/DFF0-DE8F'.
Installing systemd-boot..
Created "/var/tmp/espmounts/DFF0-DE8F/EFI/systemd".
Created "/var/tmp/espmounts/DFF0-DE8F/EFI/BOOT".
Created "/var/tmp/espmounts/DFF0-DE8F/loader".
Created "/var/tmp/espmounts/DFF0-DE8F/loader/entries".
Created "/var/tmp/espmounts/DFF0-DE8F/EFI/Linux".
Copied "/usr/lib/systemd/boot/efi/systemd-bootx64.efi" to "/var/tmp/espmounts/DFF0-DE8F/EFI/systemd/systemd-bootx64.efi".
Copied "/usr/lib/systemd/boot/efi/systemd-bootx64.efi" to "/var/tmp/espmounts/DFF0-DE8F/EFI/BOOT/BOOTX64.EFI".
Random seed file /var/tmp/espmounts/DFF0-DE8F/loader/random-seed successfully written (512 bytes).
Failed to write 'LoaderSystemToken' EFI variable: Invalid argument

I always get "Failed to write 'LoaderSystemToken' EFI variable: Invalid argument" and it never reboots correctly. I'm attaching the full install-log. Question: Does anyone have a clue about what could be the problem - or the solution to this? If so, I'm eager to hear from you, thanks!
 

Attachments

  • install.log (840.8 KB)
Failed to write 'LoaderSystemToken' EFI variable: Invalid argument
This usually indicates that the system could not write an EFI variable (i.e. set a setting in the firmware/BIOS of the system). In my experience this has two potential causes:
* a setting in the BIOS/firmware prevents 'Access to EFI variables from OS' (sadly I cannot tell you what this might be called in your specific case, because vendors always come up with their own names)
* the BIOS is simply buggy

Anyway, I would check the BIOS for any related settings and experiment with them.
* I do recommend upgrading the BIOS if at all possible

Finally, you can try switching to legacy boot mode (a.k.a. BIOS boot, CSM, compatibility mode, ...) and installing PVE that way (with GRUB instead of systemd-boot).
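As a quick sanity check (just a suggestion, not something the installer does for you) you can verify from a root shell - e.g. the installer's debug console or the installed system - whether EFI variables are exposed and writable at all; note that efibootmgr may not be available in every environment:

Code:
# booted in UEFI mode, and efivarfs mounted?
ls /sys/firmware/efi/efivars | head
mount | grep efivarfs

# listing/changing boot entries goes through the same EFI variable interface,
# so errors here usually point at the firmware as well
efibootmgr -v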


I hope this helps!
 
This usually indicates that the system could not write an EFI variable (i.e. set a setting in the firmware/BIOS of the system). In my experience this has two potential causes:
* a setting in the BIOS/firmware prevents 'Access to EFI variables from OS' (sadly I cannot tell you what this might be called in your specific case, because vendors always come up with their own names)
* the BIOS is simply buggy
Yes, I think the BIOS is buggy... I've disabled secure boot - FURTHERMORE: I can install with ext4 in UEFI mode and it works... So the problem is just ZFS with RAID0... Damn, I hope ext4 is good enough to survive occasional power outages in the long run, because those will happen and I don't plan on buying a small UPS (it's just a home server)...
Anyway, I would check the BIOS for any related settings and experiment with them.
* I do recommend upgrading the BIOS if at all possible
I bought the machine from eBay and spent around 2 days installing Windows and using the Windows programs to check for updates. Finally I came to the conclusion that "L43, 1.15" must be the latest BIOS version, and I already had that. Really annoying with ZFS - but obviously I'm happy that at least ext4 works (although in that case I don't think it matters much whether I booted in UEFI or legacy mode)...
Finally, you can try switching to legacy boot mode (a.k.a. BIOS boot, CSM, compatibility mode, ...) and installing PVE that way (with GRUB instead of systemd-boot).
I did that - it's not working. I assume "with GRUB instead of systemd-boot" is something the installer would do automatically, right? In this case I boot off the USB stick (actually via Ventoy, which can hold many ISOs/images) with the BIOS in "legacy" instead of "UEFI" mode. Everything goes fine - it even claims the installation succeeded, with no problems or errors shown. But when I then reboot, the disk doesn't appear in the "F9" boot menu.
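(As an aside, a quick way to tell which mode a running system - or the installer's debug shell - actually booted in is to check for the EFI directory in sysfs; just a small sanity check:)

Code:
# present only when booted via UEFI; absent after a legacy/BIOS (CSM) boot
[ -d /sys/firmware/efi ] && echo "booted in UEFI mode" || echo "booted in legacy/BIOS mode"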

So, the conclusion is: there's a problem with using ZFS on this machine (or at least I cannot figure out what I'm doing wrong)... Thanks a lot for the suggestions! If anyone has more ideas or suggestions, I'm still happy to test them - if nothing turns up, I'll go with ext4... Hopefully ext4 will turn out to be okay, although it's not as fun/interesting as ZFS...
 
Can you try installing PVE 6.4?

I'm actually struggling with a similar problem on an HP t520 thin client. There are no BIOS updates available to fix UEFI issues, so I'm trying to find the root cause. I found by chance that installation with 6.4 works without a problem, and I can also upgrade to 7.1 from there.

roland

>"L43, 1.15" must be the latest BIOS-version and I already had that.
does hp list bios for your model in downloads or how did you update the bios? i searched for a while but found no update method/part and all references to 1.12 bios (which seems the most recent one) are outdated.
 
By the way, ZFS also won't prevent the system disk from corrupting on power outages. Especially not if you are using consumer SSDs without built-in power-loss protection (basically all enterprise-grade SSDs should have it, so the data cached inside the SSD isn't lost on a power outage, because the drive has backup power for a few seconds to dump its internal RAM cache to NAND). But even with enterprise-grade SSDs you will still lose the write-cached data in your system's RAM. If you know that there will be power outages and you don't want things to corrupt, buying a UPS really is the only option if you want to allow async writes. You can get new, well-working ones for as little as 50€, so they are really worth it.

Just search this forum. There are a lot of threads where people lost a complete mirrored ZFS pool on a power outage because they were using consumer SSDs and no UPS, so both disks corrupted at the same time.
 
Anyhow, booting is done from partition 1 and partition 2, or from the MBR, so I don't see why the installation fails with this error message, or why we don't even get a boot menu entry, when ext4 works.

It must be something related to the ZFS setup process.

I took a closer look from inside the chroot at setup time, and it seems lsblk fails to provide partition information like PARTTYPE, which makes proxmox-boot-tool fail with "/dev/sda2 has wrong partition type (!= c12a7328-f81f-11d2-ba4b-00a0c93ec93b)".

Which is weird, because cfdisk shows the partition type for sda2 correctly.

I will look deeper when I have some more time.

@Stoiko Ivanov
 
Can you try installing PVE 6.4?
Yes, I also had a really bad feeling about this, so it's a REALLY good thing you mention it... I tried PVE 7.1, then 7.0, with many different combinations of settings. For a short while I actually wanted to try 6.4 myself, but then I thought 7.0 must be far enough back, and since nobody mentioned any issue I shouldn't go further... But now that you asked... IT WORKS!!! :) It automatically rebooted, then I logged in as root and ran ls /sys/firmware/efi - and there's some stuff like "efivars", as I expected. Next check:

Code:
# zfs list
NAME               USED  AVAIL     REFER  MOUNTPOINT
rpool             1.35G   227G      104K  /rpool
rpool/ROOT        1.35G   227G       96K  /rpool/ROOT
rpool/ROOT/pve-1  1.35G   227G     1.35G  /
rpool/data          96K   227G       96K  /rpool/data

Looking good! Since I didn't have anything to back up, I just continued, doing something like this:

Code:
Comment out the line responsible for the error ("Updating from such a repository can't be done securely") when trying to update:
# vi /etc/apt/sources.list.d/pve-enterprise.list
# apt update && apt -y full-upgrade

Add the no-subscription repository:
# vi /etc/apt/sources.list
deb http://download.proxmox.com/debian/pve buster pve-no-subscription

Update and reboot:
# apt update && apt -y full-upgrade
# reboot

After reboot:
# pve6to7
# pve6to7 --full
# apt update
# apt dist-upgrade

Update all Debian repository entries to Bullseye.
# sed -i 's/buster\/updates/bullseye-security/g;s/buster/bullseye/g' /etc/apt/sources.list
# echo "#deb https://enterprise.proxmox.com/debian/pve bullseye pve-enterprise" >> /etc/apt/sources.list.d/pve-enterprise.list

More update:
# apt update
# apt dist-upgrade
(What was really annoying here was that the host didn't have a terminfo entry for my terminal, so it kept saying something like "WARNING: terminal is not fully functional", backspace didn't work visually, etc.)
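A simple way to avoid that - assuming the problem really is just the missing terminfo entry for my terminal - is to force a terminal type every Debian host ships before running the upgrade:

Code:
# use a terminfo entry that exists on the target host (avoids the
# "terminal is not fully functional" / whiptail errors)
export TERM=xterm
apt dist-upgrade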

Without that, the dist-upgrade ended up like this:

Code:
Unknown terminal: alacritty
Check the TERM environment variable.
Also make sure that the terminal is defined in the terminfo database.
Alternatively, set the TERMCAP environment variable to the desired
termcap entry.
debconf: whiptail output the above errors, giving up!
dpkg: error processing package libc6:amd64 (--configure):
 installed libc6:amd64 package post-installation script subprocess returned error exit status 255
Errors were encountered while processing:
 libc6:amd64
E: Sub-process /usr/bin/dpkg returned an error code (1)

At this point I knew I had to exit my SSH session, so I switched to "terminator", SSH'd back in and thought I would continue with apt dist-upgrade, but I got:

Code:
ssh root@192.168.1.110                                                                                                                            255
kex_exchange_identification: read: Connection reset by peer
Connection reset by 192.168.1.110 port 22

So I took over the keyboard, and with HDMI output to a TV I continued with apt --fix-broken install, followed by apt dist-upgrade and finally apt autoremove. Then I rebooted. IT WORKS :)
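For reference, the recovery sequence from the local console boils down to roughly this (the same commands as described above):

Code:
# finish configuring the packages that failed earlier (libc6 etc.), then complete the upgrade
apt --fix-broken install
apt dist-upgrade
apt autoremove
reboot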

Here's the proof:

Code:
root@mox:~# pveversion
pve-manager/7.1-10/6ddebafe (running kernel: 5.13.19-4-pve)

root@mox:~# ls /sys/firmware/efi/
config_table  efivars  esrt  fw_platform_size  fw_vendor  runtime  runtime-map    systab    vars

root@mox:~# zpool status -v
  pool: rpool
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
    The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
    the pool may no longer be accessible by software that does not support
    the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:00:04 with 0 errors on Fri Feb 18 21:13:09 2022
config:

    NAME                                                   STATE     READ WRITE CKSUM
    rpool                                                  ONLINE       0     0     0
      ata-SAMSUNG_MZNLN256HCHP-000H1_S205NXAH341372-part3  ONLINE       0     0     0

errors: No known data errors

root@mox:~# zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
rpool   236G  2.18G   234G        -         -     0%     0%  1.00x    ONLINE  -

root@mox:~# fdisk -l
Disk /dev/sda: 238.47 GiB, 256060514304 bytes, 500118192 sectors
Disk model: SAMSUNG MZNLN256
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: A13B999B-D4EC-4F3C-AAF0-1E77506C10AD

Device       Start       End   Sectors  Size Type
/dev/sda1       34      2047      2014 1007K BIOS boot
/dev/sda2     2048   1050623   1048576  512M EFI System
/dev/sda3  1050624 500118158 499067535  238G Solaris /usr & Apple ZFS
...
... (skipping usb output)

CONCLUSION: I don't think it's my BIOS - I think there's a bug in the ZFS installation path of the Proxmox 7.0 and 7.1 installers!

Question:
Should I report this (presumable) bug somewhere so it can be fixed in the future? I don't like having to use this workaround if I can avoid it, and I'm sure other people feel the same.

I'm actually struggling with a similar problem on an HP t520 thin client. There are no BIOS updates available to fix UEFI issues, so I'm trying to find the root cause. I found by chance that installation with 6.4 works without a problem, and I can also upgrade to 7.1 from there.

I agree - installing with 6.4 works, thanks a lot!
>"L43, 1.15" must be the latest BIOS-version and I already had that.
does hp list bios for your model in downloads or how did you update the bios? i searched for a while but found no update method/part and all references to 1.12 bios (which seems the most recent one) are outdated.
I didn't update the BIOS myself, I bought it from ebay. But I did spend around 2 days trying to figure out if I really had the latest BIOS version or not (this involved installing Windows 10, running some Windows HP Upgrade applications, which didn't tell me about any updates). I was unsure about whether or not the Windows HP update tool told me if I was on the latest BIOS-version, finally I found out there's another tool I could use and this I think showed all kind of stuff for different HP machines. In my case, for my machine, it said the same L43 v.1.15-stuff as I could see in my BIOS. Based on that - although I've never updated the machine myself - I came to the conclusion that I think I'm already on the latest BIOS-version. Would however be nice if the software tool told me that directly like "you're on the latest BIOS-version, no updates available", instead of forcing me to spend 2 days.

Please see the attached screenshot that convinced me I'm on the latest BIOS version - I hope it helps - and thanks a lot for the help, I truly appreciate it!

(Screenshot: HP_DEVICE_MANAGER_Windows_app_telling_about_BIOS_versions_lowres.jpg)
 
When I execute proxmox-boot-tool from inside the chrooted installation (with bind mounts) after that "Failed to write 'LoaderSystemToken' EFI variable: Invalid argument" error, I observe something weird.

proxmox-boot-tool reports that the partition has the wrong type and exits.

In proxmox-boot-tool this information is determined via lsblk - and apparently lsblk doesn't provide the information proxmox-boot-tool relies on.

I have no clue why; sfdisk (and also cfdisk) show UUID and PARTTYPE just fine:

Code:
#lsblk --bytes --pairs -o 'UUID,SIZE,FSTYPE,PARTTYPE,PKNAME,MOUNTPOINT' /dev/sda

UUID="" SIZE="32017047552" FSTYPE="" PARTTYPE="" PKNAME="" MOUNTPOINT=""
UUID="" SIZE="1031168" FSTYPE="" PARTTYPE="" PKNAME="sda" MOUNTPOINT=""
UUID="" SIZE="536870912" FSTYPE="" PARTTYPE="" PKNAME="sda" MOUNTPOINT=""
UUID="" SIZE="31479111168" FSTYPE="" PARTTYPE="" PKNAME="sda" MOUNTPOINT=""


#sfdisk -d /dev/sda

label: gpt
label-id: C2B34533-FB74-45EB-8CC3-660EEB1A2B14
device: /dev/sda
unit: sectors
first-lba: 34
last-lba: 62533262

/dev/sda1 : start=          34, size=        2014, type=21686148-6449-6E6F-744E-656564454649, uuid=173067CC-5CD7-464C-BD06-4729301C0F48
/dev/sda2 : start=        2048, size=     1048576, type=C12A7328-F81F-11D2-BA4B-00A0C93EC93B, uuid=75D0470F-B0EC-4E53-A8BC-C32E8ED2ABB3
/dev/sda3 : start=     1050624, size=    61482639, type=6A898CC3-1DD2-11B2-99A6-080020736631, uuid=71D4EE21-9D96-4E41-891F-C125BE5BC221
 
@virtManager, thanks for sharing your findings. Our posts must have crossed each other :D

>CONCLUSION: I don't think it's my BIOS - I think there's a bug in the ZFS installation path of the Proxmox 7.0 and 7.1 installers!

Yes, I think this is correct; I have come to the same conclusion.

Let's wait and see what @Stoiko Ivanov thinks about this.
 
Just search this forum. There are a lot of threads where people lost a complete mirrored ZFS pool on a power outage because they were using consumer SSDs and no UPS, so both disks corrupted at the same time.
I must admit that when choosing between ZFS and ext4, I find it really difficult to figure out which is better, because there are many posts arguing both ways. So - I think you're right - neither ZFS nor ext4 is 100% resilient against power failures. But still: isn't ZFS better? At least that was my impression. When I google it (and I'll continue to), I get the feeling this is like a religious war between people on either side... Anyway, I'll be sure to make regular whole-disk backup images so I can restore everything if things go south... I'll search this forum to read more of these stories; I don't think I've read enough about this yet. Thanks for the info!
 
>I get the feeling this is like a religious war between people on either side...

I think comparing ZFS to ext4 is comparing apples to oranges. Which is better simply "depends". If you think ext4 is better than ZFS, then you favour ext4 either because you don't know about the advantages of ZFS or because you simply don't need them...

Having a religious war between ext4 and ZFS is nonsense (the same applies to religious wars in general; you can calm such a war down very quickly with this nice comic https://imgur.com/a/9nFbL4b - but be careful ;)
 
When I execute proxmox-boot-tool from inside the chrooted installation (with bind mounts) after that "Failed to write 'LoaderSystemToken' EFI variable: Invalid argument" error, I observe something weird.

proxmox-boot-tool reports that the partition has the wrong type and exits.
This is on your system, right? I don't think I can see that in my install.log file. Can you post your install.log so I can compare mine and yours? But otherwise: yes, I'm looking forward very much to hearing what @Stoiko Ivanov has to say about this (and whether I/we should file a bug report somewhere so it can be fixed), thanks!
 
I must admit that when choosing between ZFS and ext4, I find it really difficult to figure out which is better, because there are many posts arguing both ways. So - I think you're right - neither ZFS nor ext4 is 100% resilient against power failures. But still: isn't ZFS better? At least that was my impression. When I google it (and I'll continue to), I get the feeling this is like a religious war between people on either side... Anyway, I'll be sure to make regular whole-disk backup images so I can restore everything if things go south... I'll search this forum to read more of these stories; I don't think I've read enough about this yet. Thanks for the info!
ZFS should be more resilient, but I just wanted to explain that ZFS won't prevent you from losing data on a power outage. Software like ZFS can't fix that; for that you need the right hardware. At least a UPS. And if you want to be even more on the safe side, a redundant PSU, because a UPS won't help if your non-redundant PSU dies. And if you want to be even safer, only use HDDs on RAID controllers with a BBU, or enterprise SSDs with built-in power-loss protection. Then you will still lose all write-cached data in the system's RAM, but at least write-cached data inside the RAID controller/SSD won't be lost.
 
ZFS should be more resilient, but I just wanted to explain that ZFS won't prevent you from losing data on a power outage. Software like ZFS can't fix that; for that you need the right hardware. At least a UPS. And if you want to be even more on the safe side, a redundant PSU, because a UPS won't help if your non-redundant PSU dies. And if you want to be even safer, only use HDDs on RAID controllers with a BBU, or enterprise SSDs with built-in power-loss protection. Then you will still lose all write-cached data in the system's RAM, but at least write-cached data inside the RAID controller/SSD won't be lost.
I completely understand that - but thanks for confirming my thinking... There's just so much stuff out there on the internet, but I've found a really good thread here: https://askubuntu.com/questions/442...by-ubuntu-is-the-most-resistant-to-file-corru - in which one answer says:
Full journalled file systems (as well as log-structured, and copy-on-write file systems) can actually ensure the integrity of individual files, successfully rolling them back to their state prior to any failed partial write. ext4 can do this in data=journal mode but that comes with a huge performance penalty. btrfs and ZFS could do it with less performance penalty.
And there we have it. My understanding (I'm a complete amateur, so bear with me) is that both ext4 and ZFS are excellent, but ext4 needs data=journal mode, which comes with a huge performance penalty, whereas ZFS achieves the same with less of a penalty. Performance does matter to me. Based on this quote alone I feel ZFS is better (and that was also my general impression before reading this, based on reading about ZFS over the past year or so).

So I'll at least experiment with ZFS now that I've solved the main problem, then evaluate how things look after 6, 12 or 24 months, fingers crossed that I don't need to buy a UPS (I'll have image backups, so I can survive some disk failures)... I know in theory one should use ECC RAM etc., but I don't want to spend that money on this small home setup; I can survive if my disk crashes...

Anyway - let's stop talking about ZFS/ext4 now and see what @Stoiko Ivanov writes; after that I'll mark the thread as [SOLVED]. Thanks a lot everyone, I just hope Proxmox will investigate and fix the ZFS installer in the near future so we can avoid having to install 6.4 and upgrade in order for this to work, thanks!
 
Full journalled file systems (as well as log-structured, and copy-on-write file systems) can actually ensure the integrity of individual files, successfully rolling them back to their state prior to any failed partial write.
There are a lot of threads here where the complete pool died. If your complete pool is dead because of a power outage, it won't help that ZFS is CoW and could roll back to a working state; for that, the pool needs to be at least somewhat working. Lose the wrong parts of the disk and a ZFS pool can't be recovered.
 
Sorry, but the cross-discussion about ZFS in this bug report makes the whole thing completely unreadable and a pain in the ass for the Proxmox team to resolve. Please stop and/or open a new thread.

back on track:

>Can you post your install.log so I can compare mine and yours?

I have the same error and it looks very similar.

>I always get "Failed to write 'LoaderSystemToken' EFI variable: Invalid argument"

I think the lsblk issue from above doesn't apply, as you can see from the install log that the error does not come from proxmox-boot-tool.

I do indeed guess that there is something wrong with writing to efivars on HP thin clients, as this pull request implies:

https://github.com/NixOS/nixpkgs/pull/140278
systemd-boot fails to install on some machines (e.g. a HP t620 thin client) due to being unable to write the LoaderSystemToken EFI variable

I think we would need to fall back to the old pre-7.0 installation method (either by detecting this automatically, or by adding an option or a description of how to manually fix/override it). Or maybe use the systemd option.
 
>I always get "Failed to write 'LoaderSystemToken' EFI variable: Invalid argument"

I think the lsblk issue from above doesn't apply, as you can see from the install log that the error does not come from proxmox-boot-tool.
I'm not so sure about that; how can you see that the error doesn't come from proxmox-boot-tool? Correct me if I'm wrong, but I think proxmox-boot-tool exits with an error code of 1 or something (at least not 0), and I think that is not permitted. So the installer falls through to the next line, which is # umount /rpool/ROOT/pve-1/dev etc...
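(If someone wants to verify this, a quick check from the installer's debug shell could look like the following - just a sketch, assuming the same pool and device names as in my log:)

Code:
# with the bind mounts from the install log in place:
chroot /rpool/ROOT/pve-1 proxmox-boot-tool init /dev/sda2
echo $?   # anything other than 0 means the tool failed, and the installer gives up at this point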
I do indeed guess that there is something wrong with writing to efivars on HP thin clients, as this pull request implies:

https://github.com/NixOS/nixpkgs/pull/140278
systemd-boot fails to install on some machines (e.g. a HP t620 thin client) due to being unable to write the LoaderSystemToken EFI variable

I think we would need to fall back to the old pre-7.0 installation method (either by detecting this automatically, or by adding an option or a description of how to manually fix/override it). Or maybe use the systemd option.
Yes, thanks a lot for posting that. I actually did see it earlier, I just didn't understand it, so I skipped it. But now I've taken a closer look at the underlying discussion https://github.com/systemd/systemd/commit/351de38e4b4e6ca324346e6dbcefd224bbb3b190 and I think I understand it a bit better now (still looking forward very much to hearing from @Stoiko Ivanov):
Apparently some firmwares don't allow us to write this token, and refuse
it with EINVAL. We should normally consider that a fatal error, but not
really in the case of "bootctl random-seed" when called from the
systemd-boot-system-token.service since it's called as "best effort"
service after boot on various systems, and hence we shouldn't fail
loudly.
So what I did now was to mount both the latest Proxmox VE 7.1-2 ISO and the 6.4-1 ISO. In both cases I then ran unsquashfs pve-installer.squashfs in order to better understand why it works in 6.4 and not in 7.1-2. I compared both versions of the proxmox-boot-tool script, and I don't really understand it, because the bootctl --path "$esp_mp" install line is the same in both. In the future I think there should be a switch/case selection where --graceful is added to the bootctl command line, something like NixOS does here: https://github.com/NixOS/nixpkgs/pull/140278/files/7bd84b66850100c3ffdec3ee9ef7c27a6a57ce41 - or maybe just always use "--graceful", which based on my limited knowledge sounds easy (not sure whether that's acceptable to the Proxmox dev team, though). Anyway, I'm confused as to why it works in 6.4 and not in 7.1-2, so I tried digging a bit. Maybe the problem is in the _status_detail() function of the script...

In 6.4 we have:
Code:
    if [ -f "${mountpoint}/$PMX_LOADER_CONF" ]; then
        result="uefi"
        if [ ! -d "${mountpoint}/$PMX_ESP_DIR" ]; then
            warn "${path}/$PMX_ESP_DIR does not exist"
        fi
    fi

In 7.1-2 we have something a bit more advanced (and it makes sense to me, if the problem is that one of these new lines fails):
Code:
    if [ -f "${mountpoint}/$PMX_LOADER_CONF" ]; then
        if [ ! -d "${mountpoint}/$PMX_ESP_DIR" ]; then
            warn "${path}/$PMX_ESP_DIR does not exist"
        fi
        versions_uefi=$(ls -1 ${mountpoint}/$PMX_ESP_DIR | awk '{printf (NR>1?", ":"") $0}')
        result="uefi (versions: ${versions_uefi})"
    fi
I noticed it's important that no command anywhere fails, or the installation script will abort and skip the remaining steps it needs. An easy solution might be to use the proxmox-boot-tool from 6.4, although I haven't yet tried the experiment of:
  1. replacing this file in 7.1-2 (with the old one from 6.4)
  2. re-squashing and replacing pve-installer.squashfs
  3. making a new ISO with this modification
  4. putting it on USB and trying to install 7.1-2 (but with the proxmox-boot-tool from 6.4)
Maybe that would be an interesting test to try out? Is there a way I can "pause" the installer (maybe just a "sleep 999999" in the install script?) or "hand over control" so I can manually execute the line chroot /rpool/ROOT/pve-1 proxmox-boot-tool init /dev/sda2? That would let me check the return code more easily and also try the --graceful option - wouldn't that be valuable? I'm a bit busy at the moment, but I think I'll try this test in a few days (I'll take a full backup image first) and report back (if it makes sense)!
 
It's possible to work around the issue by re-applying proxmox-boot-tool after installation, but it's a little complex to describe, and I first need to find out why lsblk inside the chroot fails on my system (i.e. I cannot use proxmox-boot-tool in the chroot after installation, see above). Using proxmox-boot-tool in the chroot after installation would make life much easier; otherwise you need to copy things around... (kernel/initrd to /boot, /etc/kernel/cmdline from the chroot...)

In proxmox-boot-tool OUTSIDE the chroot I changed

156 bootctl --path "$esp_mp" install

into

156 bootctl --no-variables --path "$esp_mp" install

That way we don't need --graceful yet, and the ESP is set up correctly, but without a dedicated boot entry in the EFI variables.


On my t520 I can select "UEFI: SATA SSD" afterwards; I think the BIOS is detecting that there is an EFI boot entry on /dev/sda2.

Anyway, the problem is that bootctl bails out during setup and leaves the EFI boot in an unworkable state. That error should either be caught/handled or avoided.
 
Sorry for not coming back earlier to this by-now very long thread - I hope I haven't missed anything; if I have, please notify me.

Thanks for the effort you put into researching this!

Yes, I think the BIOS is buggy... I've disabled secure boot - FURTHERMORE: I can install with ext4 in UEFI mode and it works... So the problem is just ZFS with RAID0...


That should be the result of using GRUB with ext4 (in BIOS/UEFI) vs. systemd-boot for ZFS - as you correctly pointed out, it's probably related to:
https://github.com/systemd/systemd/commit/351de38e4b4e6ca324346e6dbcefd224bbb3b190

From what I can currently tell, adding '--graceful' unconditionally to the bootctl install invocation in proxmox-boot-tool should not cause any harm (I would need to check/test that a bit more carefully, though).

Anyway, I'm confused as to why it works in 6.4 and not in 7.1-2, so I tried digging a bit. Maybe the problem is in the _status_detail() function of the script...
That should not be called during the install?

In any case the systemd version changed between 6.4 (buster) and 7 (bullseye) - and I do expect it to contain changes in systemd-boot as well
from: https://github.com/systemd/systemd/blob/main/NEWS
I assume that the whole LoaderSystemToken stuff was added in v243 (buster was v241, bullseye is v247)

For completeness' sake, this is tracked in https://bugzilla.proxmox.com/show_bug.cgi?id=3729 (thanks @RolandK - I assume you left the last comment there).

When I execute proxmox-boot-tool from inside the chrooted installation (with bind mounts) after that "Failed to write 'LoaderSystemToken' EFI variable: Invalid argument" error, I observe something weird.

proxmox-boot-tool reports that the partition has the wrong type and exits.

In proxmox-boot-tool this information is determined via lsblk - and apparently lsblk doesn't provide the information proxmox-boot-tool relies on.

I have no clue why; sfdisk (and also cfdisk) show UUID and PARTTYPE just fine.
How exactly did you set up the chroot and enter it? (Which bind mounts?)

IIRC you might need to bind-mount the host's/installer's '/run' inside as well - see https://git.proxmox.com/?p=pve-inst...cf5dd748ac59477f49793b3b0e40bea;hb=HEAD#l1549
or even better the wiki-page:
https://pve.proxmox.com/wiki/ZFS:_S...iring_a_System_Stuck_in_the_GRUB_Rescue_Shell

I recently checked this and it worked

It would be great if you could try running the installer in debug mode until you get to that failure, and in the shell afterwards:
* chroot as described in the wiki page
* quickly edit the bootctl install invocation in `/usr/sbin/proxmox-boot-tool` and add --graceful to it
* run init for the partitions (to see if it indeed fixes your issue) - a rough sketch of these steps follows below
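Such a debug-mode session could look roughly like this (only a sketch - the pool and partition names are assumed to match this thread, and the chroot setup loosely follows the wiki page linked above):

Code:
# in the installer's debug shell, after the failed ZFS install:
zpool import -f -R /mnt rpool
mount -o rbind /proc /mnt/proc
mount -o rbind /sys  /mnt/sys
mount -o rbind /dev  /mnt/dev
mount -o rbind /run  /mnt/run
chroot /mnt /bin/bash

# inside the chroot: add --graceful to the bootctl install call,
# then re-initialize the ESP
sed -i 's/bootctl --path "$esp_mp" install/bootctl --graceful --path "$esp_mp" install/' /usr/sbin/proxmox-boot-tool
proxmox-boot-tool init /dev/sda2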

I hope this helps!
 
>How exactly did you set up the chroot and enter it? (Which bind mounts?)

Ah, I was missing /run, and that seems to make the difference! Thanks. I didn't know about rbind - very useful...

@Stoiko Ivanov, fantastic, thanks for the notes/pointers - with that, working around the issue is very easy!
 
