Mortar (Secureboot/LUKS Framework) broken with PVE kernel >= 5.19

noahbliss

Member
Mar 6, 2020
Hey guys, I am the dev behind Mortar, a framework for linking Secure Boot with LUKS to create fully encrypted, auto-unlocking hypervisors. Proxmox support was the original target for this framework, but it has since grown. My users and I have discovered that things simply don't work with PVE kernels newer than 5.19. The issue seems to be specific to the PVE kernels; unmodified Debian kernels have no issue. I have tried looking in the bugzilla/git for relevant changes but have not found any. If we could get some help here, I would be grateful, as some of your paying customers use this in production.

Here is our code: https://github.com/noahbliss/mortar

I am happy to answer any questions about the mechanisms used to see if we can troubleshoot. Thanks!
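For anyone curious about the general mechanism before digging into the repo: the rough idea is to seal a LUKS key against the TPM's PCR state so the disk only auto-unlocks when the measured (Secure Boot) chain is intact. A minimal sketch using clevis - illustrative only, the device path and PCR policy here are examples, not necessarily exactly what mortar configures:

Code:
# bind an extra LUKS keyslot to the TPM, sealed against PCR 7 (Secure Boot state)
clevis luks bind -d /dev/sda3 tpm2 '{"pcr_bank":"sha256","pcr_ids":"7"}'
# let the initramfs unseal it at boot
apt install clevis-luks clevis-initramfs
update-initramfs -u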
 
Hey, I ran into this error yesterday while updating from PVE 7 to PVE 8. First there were problems because of missing systemd packages, but after a bit of fixing it seemed to work. After the reboot, however, I only got the "secure boot failed" error message during the boot process.
I needed to disable Secure Boot to get Proxmox to work again :(
Now I always need to enter my password instead of using the TPM.
It would be great if this could be fixed.
 
Can a Proxmox dev please look at this problem?
The Mortar framework works with the default Debian kernel.
 
I'll take a quick look to see if anything obvious is amiss, but I likely won't have the time for in-depth debugging of issues with such third-party integrations. If you have a concrete cutoff point (kernel-version-wise) where it stopped working, that might also help narrow it down.
 
I'll take a quick look to see if anything obvious is amiss, but I likely won't have the time for in-depth debugging of issues with such third-party integrations. If you have a concrete cutoff point (kernel-version-wise) where it stopped working, that might also help narrow it down.
5.15.74-1-pve works just fine; I am able to boot that kernel release without any problems with Secure Boot enabled. 5.15.83-1-pve will hang and require a power cycle to get back to the boot prompt to select another kernel.
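(Side note for anyone bisecting this themselves: assuming a proxmox-boot-tool recent enough to support pinning, you can keep booting the last known-good kernel while testing, e.g.:)

Code:
proxmox-boot-tool kernel list               # show installed/bootable kernels
proxmox-boot-tool kernel pin 5.15.74-1-pve  # keep booting the known-good one
proxmox-boot-tool kernel unpin              # revert once the issue is fixed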
 
just did a quick test run:
- created a VM with UEFI/OVMF + TPM + Secure Boot on
- followed the instructions (but with Bookworm/PVE 8.x)
-- the step 'We will now reboot to see if the system still behaves.' is broken if you already have Secure Boot enforced (since the mortar EFI binary is not trusted by the system, for obvious reasons ;))
-- enrolling from within the system doesn't work, but it does via the OVMF menu
- all boots via mortar (even with stock Debian) show the following when booting, before entering the passphrase:

[screenshot: tpm2_flushcontext error shown during boot]
-- that might be an artifact of the software TPM though? or mortar not supporting Bookworm's TPM tools properly? the pcrlist binary also doesn't exist (anymore?), there is a tpm2_pcrread though that seems to serve the same purpose (see the snippet after this list)...
- booting Debian works (with the above caveat)
- setting up the PVE repos and updating from there (grub+shim), rebooting still works
- installing the PVE kernel (6.5.11-6-pve), rebooting still works
- installing all of PVE, rebooting still works
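as an aside regarding the pcrlist remark above: tpm2_pcrlist was renamed to tpm2_pcrread in newer tpm2-tools, so reading the PCR state now looks like this (the PCR selection here is just an example):

Code:
# read the SHA-256 bank of the PCRs commonly tied to the boot chain
tpm2_pcrread sha256:0,2,4,7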

maybe you ran into the limitation noted in our Secure Boot Setup article:
Up to kernel version 6.2.16-7, the Proxmox VE kernel was not Secure Boot friendly out of the box, because it did not sign kernel modules at build time; to get it to boot, one had to manually sign all the modules with a DB key after every kernel upgrade.
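for context, that manual signing amounted to roughly the following after every kernel upgrade (a sketch only; the key/cert paths are assumptions, and sign-file requires the matching kernel headers to be installed):

Code:
KVER=5.15.83-1-pve
KEY=/path/to/db.key    # private half of a key enrolled in db (assumed path)
CERT=/path/to/db.der   # matching certificate in DER form (assumed path)
SIGNFILE=/usr/src/linux-headers-${KVER}/scripts/sign-file
find /lib/modules/${KVER} -name '*.ko' -print0 |
    while IFS= read -r -d '' mod; do
        "${SIGNFILE}" sha256 "${KEY}" "${CERT}" "${mod}"
    done
update-initramfs -u -k ${KVER}   # regenerate the initrd with the signed modules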
 
Hey @fabian, thanks for the traction. EDIT: I was able to reproduce your tpm2_flushcontext issue. It's an actual bug in mortar, stemming from clevis updates it seems. This should be fixed now. To get the fix you can run:

Code:
git pull                                   # fetch the fixed mortar scripts
./3-tpm2clevis-prepluksandinstallhooks.sh  # re-run the TPM2/clevis LUKS prep and reinstall the hooks
update-initramfs -u                        # rebuild the initramfs with the updated hooks
mortar-compilesigninstall                  # compile, sign, and install the mortar EFI image
#reboot

Can you confirm in the short term, though, that your test system has the mortar keys enrolled and exclusively enforced, that Secure Boot is enabled, and that kernel 6.5.11-6-pve allows the system to boot (even if it does not automatically unlock)? I would also be curious (if you can test) whether PVE kernel version 6.1, for example, also boots in your environment.
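(For a quick check of that state from a running system, something like this should do; efi-readvar is part of efitools and may need installing first:)

Code:
mokutil --sb-state   # reports whether Secure Boot is currently enabled
efi-readvar -v db    # list the certificates actually enrolled in db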

I suspect you may be correct about the limitation you referenced, but I am a bit puzzled as to why versions up to 5.15.74-1-pve did work OK. Did someone perhaps turn kernel lockdown mode back on or something? The behavior I experienced would cause the system to freeze entirely with a black screen - behavior typical of strict signing enforcement being enabled and failing.

Separately: I was able to get the OS to self-enroll keys; perhaps we picked different machine options. I am running (on Proxmox) with the options below (see also the CLI sketch after the list):
bios: OVMF
machine: q35
tpm: 2.0
"enroll keys": unchecked

Defaults for the rest/as typical for a Debian VM.
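For reference, setting up an equivalent VM from the PVE CLI might look like this (a sketch; the VM ID 100 and the local-lvm storage name are assumptions):

Code:
qm set 100 --bios ovmf --machine q35
qm set 100 --efidisk0 local-lvm:1,efitype=4m,pre-enrolled-keys=0
qm set 100 --tpmstate0 local-lvm:1,version=v2.0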
 
Huh! Sure enough, I think you're right, @fabian. I just tested with 6.5.11-6-pve and it works! It would still be interesting to know what changed between 5.15.74-1-pve and 5.15.83-1-pve to cause this issue in the first place, but if the main takeaway is "it's fixed and will stay fixed", I think the little mortar community will be very happy.

Thanks a ton!
 
Hey @fabian, thanks for the traction. EDIT: I was able to reproduce your tpm2_flushcontext issue. It's an actual bug in mortar, stemming from clevis updates it seems. This should be fixed now. To get the fix you can run:

Code:
git pull                                   # fetch the fixed mortar scripts
./3-tpm2clevis-prepluksandinstallhooks.sh  # re-run the TPM2/clevis LUKS prep and reinstall the hooks
update-initramfs -u                        # rebuild the initramfs with the updated hooks
mortar-compilesigninstall                  # compile, sign, and install the mortar EFI image
#reboot
yes, it now works *with* automated unlock :) (although it started to fail again - with a different error - when I did the tests with the different kernel versions below, though that might have been me doing something wrong/missing a step)
Can you confirm in the short term, though, that your test system has the mortar keys enrolled and exclusively enforced, that Secure Boot is enabled, and that kernel 6.5.11-6-pve allows the system to boot (even if it does not automatically unlock)? I would also be curious (if you can test) whether PVE kernel version 6.1, for example, also boots in your environment.

I suspect you may be correct about the limitation you referenced, but I am a bit puzzled as to why versions up to 5.15.74-1-pve did work OK. Did someone perhaps turn kernel lockdown mode back on or something? The behavior I experienced would cause the system to freeze entirely with a black screen - behavior typical of strict signing enforcement being enabled and failing.
6.2.16-19: works
6.1.15-1: doesn't (expected - this kernel doesn't work in any "Secure Boot enabled" scenario unless you also manually sign all the modules and regenerate the initrd)
5.15.74-1: works (with manual unlock?)
5.15.83-1: works (with manual unlock?)
5.15.131-1: works (with manual unlock?)

I didn't test any other versions ;) rerunning the ./3... script and updating the initramfs explicitly then failed header validation, but auto-unlocked after the 10s countdown ;)

the bullseye/7.x kernels were tested on a bookworm/8.x system though, so maybe there is a behaviour difference somewhere between the two that would make the 5.15 kernels fail on bullseye.
Separately: I was able to get the OS to self-enroll keys; perhaps we picked different machine options. I am running (on Proxmox):
bios: OVMF
machine: q35
tpm: 2.0
"enroll keys": unchecked
yes - "enroll keys" just means "configure the system with the Microsoft keys", like a "normal" off-the-shelf system would likely be. Without that, no keys are loaded at all, and you can enroll your own from within the OS, since the VM is not yet restricted.
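Enrolling your own keys from within the OS while the firmware is still in setup mode can be done with efitools, for example; a minimal sketch, assuming you have generated signed .auth variable updates for your keys (not necessarily the exact steps mortar performs):

Code:
efi-readvar                      # should show no enrolled keys while in setup mode
efi-updatevar -f db.auth db      # enroll your signing certificate(s)
efi-updatevar -f KEK.auth KEK    # enroll the key exchange key
efi-updatevar -f PK.auth PK      # writing the platform key last leaves setup mode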
 
Hello, yesterday I ran into a very strange problem. Possibly these were two completely unrelated events, but you should still know about them.
The problem is: a dist-upgrade from Proxmox 7 to Proxmox 8 completely broke the server's BIOS. I think it could have been the Mortar hooks writing some incorrect data into the NVRAM.
The second issue is that Proxmox's Linux kernel version 6.5 does not boot on the Dell R240, but this might be unrelated to the main problem.

Here is the full story:
I upgraded Proxmox 7 to Proxmox 8, and after a reboot the server did not boot up because its BIOS got broken, reporting some problem with DXE.

[screenshot: BIOS DXE error]

The automated BIOS recovery did not succeed.

[screenshot: failed automated BIOS recovery]


After the datacenter engineers updated the BIOS to the latest version (and cleared the NVRAM), I rebooted the server again, but the OS did not boot: kernel version 6.5.13-1-pve hangs on loading the initial RAM disk.
This is what it looked like when I was trying to boot the server via GRUB:

[screenshot: kernel 6.5 hanging after "Loading initial ramdisk" in GRUB]

However, the older kernels booted fine, for example 5.15.83-1-pve or 5.15.143-1-pve.

After I determined that the old kernels boot fine, I started setting up Mortar from scratch: disabled Secure Boot in the BIOS settings, cleared the default Microsoft/Dell/whatever keys, cleared the TPM, set up my own Secure Boot keys with Mortar again, and tried to boot Proxmox 8 with the latest kernel again, but the boot hung again on loading the initial RAM disk.

This is what it looked like when I was trying to boot a signed EFI image:

[screenshot: signed EFI image hanging on boot]

After messing with Secure Boot and the TPM for the whole evening (clearing the TPM keys, (re)setting Secure Boot and (re)signing kernels), I finally gave up on Mortar, as I wasn't able to boot even the older kernels, and set up Secure Boot following the Proxmox manual: https://pve.proxmox.com/wiki/Secure_Boot_Setup#Setup_instructions_for_db_key_variant

I installed and signed the Linux kernel version 6.5.13-1-pve, but the server would not boot again; the boot process hung on loading the initial RAM disk. So it seems that Proxmox's build of Linux 6.5.13-1-pve is incompatible with the Dell R240 server.
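(For reference, the db-key variant from the wiki essentially boils down to signing the kernel image with your own db key after each kernel upgrade; a minimal sketch, with the key/cert paths being assumptions:)

Code:
KVER=6.5.13-1-pve
sbsign --key /path/to/db.key --cert /path/to/db.crt \
    --output /boot/vmlinuz-${KVER}.signed /boot/vmlinuz-${KVER}
mv /boot/vmlinuz-${KVER}.signed /boot/vmlinuz-${KVER}
proxmox-boot-tool refresh   # regenerate the boot entries afterwards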

I then installed kernel version 6.2.16-20-pve, signed it, and the server booted fine.

Now I get this every boot:

[screenshot: NVRAM-related warning during boot]

Do not mind the "header validation succeeded" and "TPM validation failed" messages - I removed the Mortar scripts from "/etc/kernel/postinst.d" but forgot to remove them from "/etc/initramfs-tools/scripts/local-top".
The most important message here is the one about NVRAM; after I saw it, I immediately recalled the initial BIOS problem after I dist-upgraded Proxmox 7 to 8.
Did the Mortar hooks break something in the NVRAM during the dist-upgrade, so that the BIOS got completely broken?
My server has TPM version 1.2; possibly this was the reason.
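(For completeness, cleaning up the leftover hook mentioned above would look roughly like this; the exact file name under local-top is an assumption:)

Code:
rm /etc/initramfs-tools/scripts/local-top/mortar*   # assumed file name pattern
update-initramfs -u -k all                          # rebuild all initramfs images without the hook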

I cannot mess with this server further, as it is a production server rented from a remote datacenter; however, I have another Dell R240 locally and could play with it once I have some spare time. Please share your thoughts and suggestions on what I should check on the local server.
Unfortunately, the local server has a TPM 2.0 module; I will try to buy and install a TPM 1.2 module for an experiment.

If anyone else has a Dell R240 server, please check whether you can successfully boot Linux kernel version 6.5.13-1-pve.
 
If anyone else has a Dell R240 server, please check whether you can successfully boot Linux kernel version 6.5.13-1-pve.
I have tested on another R240 and can confirm that Proxmox 8.1-2 hangs on loading the initrd.

[screenshot: Proxmox 8.1-2 hanging on loading the initrd on an R240]

I don't, but maybe this will help.
Hi guys, I found that if you disable x2APIC in the BIOS under processor configuration, it works fine with the new kernel. It seems that the problem is related to how the new kernel handles multithreading, but because the R240 is a single-CPU server, this option is not needed, for me at least.
And I also confirm that disabling x2APIC mode in the BIOS fixes that issue, thanks!

[screenshot: Proxmox 8 booting successfully with x2APIC disabled]
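(A quick way to verify from a running system which interrupt-controller mode the kernel ended up using; output varies by hardware and kernel version:)

Code:
dmesg | grep -i x2apic   # with x2APIC disabled in the BIOS, expect no "x2apic enabled" line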
 
Unfortunately, the local server has a TPM 2.0 module; I will try to buy and install a TPM 1.2 module for an experiment.
I found a TPM 1.2 module in another server where I don't need a TPM, so I moved that module into the testing R240; however, the server does not accept the foreign module.

[screenshot: "TPM cannot be used on this platform" error]

Googling for "TPM cannot be used on this platform" revealed that Dell forbids manual replacement of TPM modules:
Do not attempt to remove the Trusted Platform Module (TPM) from the system board. Once the TPM is installed, it is cryptographically bound to that specific system board. Any attempt to remove an installed TPM breaks the cryptographic binding, and it cannot be re-installed or installed on another system board.
If a server comes from the factory with a TPM preinstalled, then it is impossible to either upgrade from TPM 1.2 to TPM 2.0 or downgrade from TPM 2.0 to TPM 1.2:
https://www.dell.com/community/en/c...sed-on-this-platform/647f7cb3f4ccf8a8deb0db58 - the official forum suggests replacing the whole motherboard, LOL.
Thank you, Dell!

However, I think that installing a brand-new TPM 1.2 module might be possible, even if the motherboard had a 2.0 module installed before. I will check with the local Dell representative and will probably buy one module to test.

Meanwhile, I am looking forward to your thoughts about the BIOS being broken by a system upgrade.
 
Thank you, Dell!
Yes, I understand your sarcasm; I've been there before in similar situations. However, in this case I actually side with the vendor. Look at your current use case: you're using the TPM to digitally unlock the disks/system - I like the (rare?) fact that the TPM can't just be swapped out for a different one.
 
Yes, I understand your sarcasm; I've been there before in similar situations. However, in this case I actually side with the vendor. Look at your current use case: you're using the TPM to digitally unlock the disks/system - I like the (rare?) fact that the TPM can't just be swapped out for a different one.
My threat model is a part-time cleaner stealing hot-swap drives, rather than some sophisticated entity swapping TPM cards.
 
