EFI and TPM removed from VM config when stopped, not when shut down

Jun 8, 2016
340
64
48
45
Johannesburg, South Africa
We have had good success with the Secure Boot capable EFI disks and TPM v2.0 emulation, tested on the latest no-subscription packages with Ceph Pacific 16.2.6. Live migration works with Windows 11 and full disk encryption (BitLocker), and everything works perfectly as long as one uses the start, shutdown, and migrate actions. Issuing a stop instruction, however, results in the EFI and TPM references (efidisk0 and tpmstate0) being removed from the VM configuration file.

Nice work, looking forward to this landing in the enterprise repo soon!

Code:
[admin@kvm1d ~]# cat /etc/pve/nodes/kvm1d/qemu-server/122.conf > /root/122.conf.backup; cat /etc/pve/nodes/kvm1d/qemu-server/122.conf
agent: 1
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 1
cpu: Westmere,flags=+pcid
efidisk0: rbd_hdd:vm-122-disk-1,efitype=4m,pre-enrolled-keys=1,size=1M
ide2: none,media=cdrom
localtime: 1
machine: pc-q35-6.0
memory: 4096
name: lair-temp
net0: virtio=00:16:3e:00:01:12,bridge=vmbr0,tag=1
numa: 1
ostype: win10
protection: 1
scsi0: rbd_hdd:vm-122-disk-0,cache=writeback,discard=on,size=80G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=f45692f6-0e09-48d2-ae74-7ce85f3f3267
sockets: 2
tpmstate0: rbd_hdd:vm-122-disk-2,size=4M,version=v2.0

[admin@kvm1d ~]# rbd showmapped | grep -e namespace -e 122
id  pool     namespace  image          snap  device
10  rbd_hdd             vm-122-disk-1  -     /dev/rbd10

[admin@kvm1d ~]# rbd ls rbd_hdd -l | grep -e NAME -e 122
NAME                                   SIZE     PARENT                            FMT  PROT  LOCK
vm-122-disk-0                           80 GiB                                      2
vm-122-disk-1                            1 MiB                                      2        excl
vm-122-disk-2                            4 MiB                                      2

[admin@kvm1d ~]# qm start 122; sleep 45; qm stop 122; sleep 20;
Requesting HA start for VM 122
Requesting HA stop for VM 122

[admin@kvm1d ~]# rbd showmapped | grep -e namespace -e 122
id  pool     namespace  image          snap  device
10  rbd_hdd             vm-122-disk-1  -     /dev/rbd10

[admin@kvm1d ~]# rbd ls rbd_hdd -l | grep -e NAME -e 122
NAME                                   SIZE     PARENT                            FMT  PROT  LOCK
vm-122-disk-0                           80 GiB                                      2
vm-122-disk-1                            1 MiB                                      2        excl
vm-122-disk-2                            4 MiB                                      2

[admin@kvm1d ~]# diff -uNr /root/122.conf.backup /etc/pve/nodes/kvm1d/qemu-server/122.conf
--- /root/122.conf.backup       2021-10-12 21:52:49.922585883 +0200
+++ /etc/pve/nodes/kvm1d/qemu-server/122.conf   2021-10-12 21:55:49.000000000 +0200
@@ -3,7 +3,6 @@
 boot: order=scsi0;ide2;net0
 cores: 1
 cpu: Westmere,flags=+pcid
-efidisk0: rbd_hdd:vm-122-disk-1,efitype=4m,pre-enrolled-keys=1,size=1M
 ide2: none,media=cdrom
 localtime: 1
 machine: pc-q35-6.0
@@ -17,4 +16,3 @@
 scsihw: virtio-scsi-pci
 smbios1: uuid=f45692f6-0e09-48d2-ae74-7ce85f3f3267
 sockets: 2
-tpmstate0: rbd_hdd:vm-122-disk-2,size=4M,version=v2.0
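
The same comparison can be scripted so that only the names of the dropped options are reported. This is a minimal sketch, assuming a backup copy of the config was taken before the stop as in the listing above; `removed_keys` is a hypothetical helper, not a Proxmox tool.

```shell
#!/bin/sh
# removed_keys BACKUP LIVE
# Print every option name (the text before the first ':') that is present
# in BACKUP but has no line of the same name in LIVE.
# Hypothetical helper for illustration only.
removed_keys() {
    backup=$1
    live=$2
    # sed -n prints only lines of the form "name: value", so blank lines
    # and section headers such as "[snapshot]" are ignored.
    for key in $(sed -n 's/^\([A-Za-z0-9_-][A-Za-z0-9_-]*\):.*/\1/p' "$backup"); do
        grep -q "^${key}:" "$live" || printf '%s\n' "$key"
    done
}
```

Called as `removed_keys /root/122.conf.backup /etc/pve/nodes/kvm1d/qemu-server/122.conf`, it should list efidisk0 and tpmstate0 for the diff shown above, and print nothing when the two files carry the same options.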


Works perfectly when you use 'shutdown' instead of 'stop':

Code:
[admin@kvm1d ~]# cat /root/122.conf.backup > /etc/pve/nodes/kvm1d/qemu-server/122.conf
[admin@kvm1d ~]# qm start 122; sleep 45; qm shutdown 122; sleep 20;
[admin@kvm1d ~]# kill 1069131
[admin@kvm1d ~]# rbd showmapped | grep -e namespace -e 122;
id  pool     namespace  image          snap  device
10  rbd_hdd             vm-122-disk-1  -     /dev/rbd10
11  rbd_hdd             vm-122-disk-0  -     /dev/rbd11
[admin@kvm1d ~]# rbd ls rbd_hdd -l | grep -e NAME -e 122;
NAME                                   SIZE     PARENT                            FMT  PROT  LOCK
vm-122-disk-0                           80 GiB                                      2        excl
vm-122-disk-1                            1 MiB                                      2        excl
vm-122-disk-2                            4 MiB                                      2
[admin@kvm1d ~]# diff -uNr /root/122.conf.backup /etc/pve/nodes/kvm1d/qemu-server/122.conf
<blank>


Windows 11 with Secure Boot enabled (screenshot: 1634073264634.png)

Destroyed the test Windows 11 system where BitLocker was working. Unfortunately I didn't take a snippet from it, but it worked flawlessly.
 

Stefan_R

Proxmox Staff Member
Jun 4, 2019
1,301
275
88
Vienna
I cannot reproduce the issue you are running into here. Usually we never remove anything from the config file, unless it's an unknown option... could it potentially be that you migrated to a slightly outdated node and then did the "stop" action there? Would be weird, since the TPM worked, but that's the only thing I can think of right now. Is this always reproducible? Any more info on your setup/anything in syslog?
 
I can confirm that this is reproducible at will on a cluster of PVE 7 nodes that are subscribed to the enterprise repositories. We temporarily added the no-subscription repository to prepare for vTPM and EFI state disks becoming available on our main production clusters, which exclusively use the enterprise repositories.

The problem doesn't occur if we shut down the guest and the guest responds to the request. If we select 'shutdown' while the OS isn't booted, and the guest agent consequently isn't running, the two lines are also removed; I presume a stop instruction is issued after a timeout. We can, however, immediately recreate the scenario whenever we issue a stop after starting a VM with blank disks, as shown in the example above.


I checked for updates in the early hours of this morning; herewith the version information:
Code:
[admin@kvm1d ~]# pveversion -v
proxmox-ve: 7.0-2 (running kernel: 5.11.22-3-pve)
pve-manager: 7.0-13 (running version: 7.0-13/7aa7e488)
pve-kernel-helper: 7.1-2
pve-kernel-5.11: 7.0-8
pve-kernel-5.11.22-5-pve: 5.11.22-10
pve-kernel-5.11.22-3-pve: 5.11.22-7
ceph: 16.2.6-pve2
ceph-fuse: 16.2.6-pve2
corosync: 3.1.5-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: 0.8.36+pve1
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve1
libproxmox-acme-perl: 1.3.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.0-5
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-10
libpve-guest-common-perl: 4.0-2
libpve-http-server-perl: 4.0-3
libpve-storage-perl: 7.0-12
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.9-4
lxcfs: 4.0.8-pve2
novnc-pve: 1.2.0-3
openvswitch-switch: 2.15.0+ds1-2
proxmox-backup-client: 2.0.11-1
proxmox-backup-file-restore: 2.0.11-1
proxmox-mini-journalreader: 1.2-1
proxmox-widget-toolkit: 3.3-6
pve-cluster: 7.0-3
pve-container: 4.0-10
pve-docs: 7.0-5
pve-edk2-firmware: 3.20210831-1
pve-firewall: 4.2-4
pve-firmware: 3.3-2
pve-ha-manager: 3.3-1
pve-i18n: 2.5-1
pve-qemu-kvm: 6.0.0-4
pve-xtermjs: 4.12.0-1
qemu-server: 7.0-16
smartmontools: 7.2-pve2
spiceterm: 3.2-2
vncterm: 1.7-1
zfsutils-linux: 2.0.5-pve1
 
Have the VMs been migrated to another host and back, or have all the nodes been restarted? I had the same problem when I updated the packages but the VM was still running under the prior QEMU version. Restarting or shutting down the VM within the same node didn't help.
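
One way to check for a VM that is still running under an older QEMU than the one installed is to compare the two version strings. This is a rough sketch only: it assumes the verbose `qm status` output carries a `running-qemu:` line, it relies on the `pve-qemu-kvm:` line that `pveversion -v` prints (as in the listing above), and the function names are made up for illustration. It compares the upstream version only, so a change in the package revision alone would not be noticed.

```shell
#!/bin/sh
# running_qemu: read `qm status <vmid> --verbose` output on stdin and print
# the value of the "running-qemu" field (field name is an assumption).
running_qemu() {
    sed -n 's/^running-qemu:[[:space:]]*//p'
}

# installed_qemu: read `pveversion -v` output on stdin and print the upstream
# part of the pve-qemu-kvm version, e.g. "6.0.0" from "6.0.0-4".
installed_qemu() {
    sed -n 's/^pve-qemu-kvm:[[:space:]]*\([^-]*\).*/\1/p'
}

# qemu_is_stale RUNNING INSTALLED: succeed (exit 0) when the versions differ.
qemu_is_stale() {
    [ "$1" != "$2" ]
}

# Example call on a node (not executed here):
#   qemu_is_stale "$(qm status 122 --verbose | running_qemu)" \
#                 "$(pveversion -v | installed_qemu)" \
#       && echo "VM 122 runs an older QEMU than the installed package"
```

Both parsers read from stdin, so they can be tried against saved command output before being pointed at a live node.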
 
That was indeed the problem. Restarting the node resolved the issue, so I presume there is a service that should be restarted as part of the package upgrade process... I had thought that all PVE 7.0-13 cluster nodes had fenced and reset after the network interfaces suddenly changed MTU (logs below); it turns out I need to change legacy systems by replacing 'WATCHDOG_MODULE=ipmi_watchdog' with 'WATCHDOG_MODULE=iTCO_wdt' in /etc/default/pve-ha-manager.
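
The watchdog change described here could be scripted along these lines. It is only a sketch: `set_watchdog_module` is a hypothetical helper, it takes the file path as an argument so it can be tried on a copy first, and it assumes the setting sits on a line of its own exactly as quoted.

```shell
#!/bin/sh
# set_watchdog_module FILE
# Replace the line "WATCHDOG_MODULE=ipmi_watchdog" with
# "WATCHDOG_MODULE=iTCO_wdt" in FILE, keeping the original as FILE.bak.
# Hypothetical helper for illustration only.
set_watchdog_module() {
    file=$1
    # Keep a backup first; abort if it cannot be written.
    cp -p "$file" "$file.bak" || return 1
    # Rewrite from the backup into the original file (avoids sed -i,
    # whose syntax differs between sed implementations).
    sed 's/^WATCHDOG_MODULE=ipmi_watchdog$/WATCHDOG_MODULE=iTCO_wdt/' \
        "$file.bak" > "$file"
}
```

On a node this would be called as `set_watchdog_module /etc/default/pve-ha-manager`. The new module presumably only takes effect once the watchdog service is restarted or the node is rebooted, so plan that for a maintenance window.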

We're running OvS with jumbo frames. PVE occasionally resets the MTU size, which then causes Corosync to stop communicating, leading to the cluster fencing itself:

Code:
Oct 15 12:26:20 kvm1a pvedaemon[2132829]: <davidh@pam> starting task UPID:kvm1a:0020D364:015185F5:6169574C:qmsnapshot:124:davidh@pam:
Oct 15 12:26:20 kvm1a pvedaemon[2151268]: <davidh@pam> snapshot VM 124: mbr2gpt
Oct 15 12:26:22 kvm1a pvedaemon[2132829]: <davidh@pam> end task UPID:kvm1a:0020D364:015185F5:6169574C:qmsnapshot:124:davidh@pam: OK
Oct 15 12:26:40 kvm1a pvedaemon[2132829]: <davidh@pam> update VM 124: -ide2 shared:iso/win10re-1511-x64-syrex.iso,media=cdrom,size=543390K
Oct 15 12:26:46 kvm1a pvedaemon[2132829]: <davidh@pam> update VM 124: -boot order=ide2;scsi0;net0
Oct 15 12:26:48 kvm1a pvedaemon[2144873]: <davidh@pam> starting task UPID:kvm1a:0020D43C:015190A6:61695768:hastart:124:davidh@pam:
Oct 15 12:26:50 kvm1a pvedaemon[2144873]: <davidh@pam> end task UPID:kvm1a:0020D43C:015190A6:61695768:hastart:124:davidh@pam: OK
Oct 15 12:27:00 kvm1a systemd[1]: Starting Proxmox VE replication runner...
Oct 15 12:27:02 kvm1a systemd[1]: pvesr.service: Succeeded.
Oct 15 12:27:02 kvm1a systemd[1]: Finished Proxmox VE replication runner.
Oct 15 12:27:02 kvm1a systemd[1]: pvesr.service: Consumed 1.302s CPU time.
Oct 15 12:27:03 kvm1a pve-ha-lrm[2151648]: starting service vm:124
Oct 15 12:27:03 kvm1a pve-ha-lrm[2151652]: start VM 124: UPID:kvm1a:0020D4E4:015196CF:61695777:qmstart:124:root@pam:
Oct 15 12:27:03 kvm1a pve-ha-lrm[2151648]: <root@pam> starting task UPID:kvm1a:0020D4E4:015196CF:61695777:qmstart:124:root@pam:
Oct 15 12:27:04 kvm1a kernel: [221237.021281]  rbd7: p1 p2
Oct 15 12:27:04 kvm1a kernel: [221237.059391] rbd: rbd7: capacity 85899345920 features 0x1d
Oct 15 12:27:04 kvm1a systemd[1]: Started 124.scope.
Oct 15 12:27:04 kvm1a systemd-udevd[2151699]: Using default interface naming scheme 'v247'.
Oct 15 12:27:04 kvm1a kernel: [221237.366302] device tap124i0 entered promiscuous mode
Oct 15 12:27:04 kvm1a kernel: [221237.367103] vlan100: dropped over-mtu packet: 2175 > 1500
Oct 15 12:27:04 kvm1a systemd-udevd[2151699]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Oct 15 12:27:04 kvm1a kernel: [221237.367365] vlan100: dropped over-mtu packet: 2175 > 1500
Oct 15 12:27:04 kvm1a kernel: [221237.386087] vlan100: dropped over-mtu packet: 4490 > 1500
Oct 15 12:27:04 kvm1a kernel: [221237.419133] vlan100: dropped over-mtu packet: 2175 > 1500
Oct 15 12:27:04 kvm1a kernel: [221237.419144] vlan100: dropped over-mtu packet: 2175 > 1500
Oct 15 12:27:04 kvm1a kernel: [221237.419329] vlan100: dropped over-mtu packet: 2175 > 1500
Oct 15 12:27:04 kvm1a kernel: [221237.419348] vlan100: dropped over-mtu packet: 2175 > 1500
Oct 15 12:27:04 kvm1a kernel: [221237.419368] vlan100: dropped over-mtu packet: 2175 > 1500
Oct 15 12:27:04 kvm1a kernel: [221237.419475] vlan100: dropped over-mtu packet: 2175 > 1500
Oct 15 12:27:04 kvm1a kernel: [221237.419619] vlan100: dropped over-mtu packet: 2175 > 1500
Oct 15 12:27:06 kvm1a ovs-vsctl: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl del-port tap124i0
Oct 15 12:27:06 kvm1a ovs-vsctl: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl del-port fwln124i0
Oct 15 12:27:06 kvm1a ovs-vsctl: ovs|00002|db_ctl_base|ERR|no port named fwln124i0
Oct 15 12:27:06 kvm1a ovs-vsctl: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl -- add-port vmbr0 tap124i0 tag=1 vlan_mode=dot1q-tunnel other-config:qinq-ethtype=802.1q
Oct 15 12:27:06 kvm1a pve-ha-lrm[2151648]: <root@pam> end task UPID:kvm1a:0020D4E4:015196CF:61695777:qmstart:124:root@pam: OK
Oct 15 12:27:06 kvm1a pve-ha-lrm[2151648]: service status vm:124 started
Oct 15 12:27:06 kvm1a corosync[1998]:   [TOTEM ] Retransmit List: 1e73d5
Oct 15 12:27:06 kvm1a corosync[1998]:   [TOTEM ] Retransmit List: 1e73d5
Oct 15 12:27:06 kvm1a corosync[1998]:   [TOTEM ] Retransmit List: 1e73d5
Oct 15 12:27:06 kvm1a corosync[1998]:   [TOTEM ] Retransmit List: 1e73d5
Oct 15 12:27:06 kvm1a corosync[1998]:   [TOTEM ] Retransmit List: 1e73d5
Oct 15 12:27:06 kvm1a corosync[1998]:   [TOTEM ] Retransmit List: 1e73d5
Oct 15 12:27:06 kvm1a corosync[1998]:   [TOTEM ] Retransmit List: 1e73d5
Oct 15 12:27:06 kvm1a corosync[1998]:   [TOTEM ] Retransmit List: 1e73d5
Oct 15 12:27:06 kvm1a corosync[1998]:   [TOTEM ] Retransmit List: 1e73d5
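
An MTU reset like the one in the log (vlan100 dropping packets larger than 1500) could be caught early by comparing an interface's current MTU with the expected value. A minimal sketch, assuming the standard Linux sysfs layout; the helper names are hypothetical, and the optional third argument exists only so the check can be tried against a scratch directory.

```shell
#!/bin/sh
# iface_mtu IFACE [SYSFS_ROOT]
# Print the current MTU of IFACE as reported under /sys/class/net.
iface_mtu() {
    cat "${2:-/sys/class/net}/$1/mtu"
}

# mtu_ok IFACE EXPECTED [SYSFS_ROOT]
# Exit 0 if IFACE has the EXPECTED MTU, 1 if it differs,
# 2 if the MTU could not be read.
mtu_ok() {
    current=$(iface_mtu "$1" "$3") || return 2
    [ "$current" -eq "$2" ]
}

# Example call (9000 is an assumed jumbo-frame size, not taken from the logs):
#   mtu_ok vlan100 9000 || echo "vlan100 MTU is not 9000"
```

Run periodically, a check like this would flag the reset before Corosync starts retransmitting, although it does nothing to explain why the MTU changed.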
 
