Stuck at EFI boot during reboot of VM

modem7

Hey guys,

Intermittently, my Linux VMs get stuck here during a reboot:

1678314348458.png

pveversion -v
Code:
proxmox-ve: 7.3-1 (running kernel: 5.15.85-1-pve)
pve-manager: 7.3-6 (running version: 7.3-6/723bb6ec)
pve-kernel-helper: 7.3-5
pve-kernel-5.15: 7.3-2
pve-kernel-5.15.85-1-pve: 5.15.85-1
pve-kernel-5.15.83-1-pve: 5.15.83-1
ceph-fuse: 15.2.13-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.3-1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-2
libpve-guest-common-perl: 4.2-3
libpve-http-server-perl: 4.1-5
libpve-storage-perl: 7.3-2
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.3.3-1
proxmox-backup-file-restore: 2.3.3-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.1-1
proxmox-widget-toolkit: 3.5.5
pve-cluster: 7.3-2
pve-container: 4.4-2
pve-docs: 7.3-1
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-7
pve-firmware: 3.6-3
pve-ha-manager: 3.5.1
pve-i18n: 2.8-3
pve-qemu-kvm: 7.2.0-5
pve-xtermjs: 4.16.0-1
qemu-server: 7.3-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.9-pve1

qm config:
Code:
agent: enabled=1,fstrim_cloned_disks=1
balloon: 768
bios: ovmf
boot: c
bootdisk: scsi0
cipassword: **********
ciuser: modem7
cores: 4
cpu: host
description: When modifying this template, make sure you run this at the end%0A%0Aapt-get clean /\%0A&& apt -y autoremove --purge /\%0A&& apt -y clean /\%0A&& apt -y autoclean /\%0A&& cloud-init clean /\%0A&& >/etc/machine-id /\%0A&& sync /\%0A&& history -c /\%0A&& history -w /\%0A&& shutdown now%0A%0ADetails%3A%0ADisabled UEFI SecureBoot%0ASnap removed%0AFSTrim + timer enabled%0Aswapfile dynamic with swapfile package%0ATuned w/ virtual-guest profile%0A%0AInstalled packages%3A%0Aacl%0Aaptitude%0Acloud-guest-utils%0Acloud-init%0Acloud-utils%0Acurl%0Adnsutils%0Agit%0Ahtop%0Alibsasl2-modules%0Amlocate%0Aneedrestart%0Anet-tools%0Aqemu-guest-agent%0Aresolvconf%0Asudo%0Aswapspace%0Atldr%0Atuned%0Atuned-utils%0Atuned-utils-systemtap%0Aunattended-upgrades%0Aunzip
efidisk0: Proxmox:1113/vm-1113-disk-0.qcow2,efitype=4m,pre-enrolled-keys=1,size=528K
ipconfig0: ip=192.168.50.254/24,gw=192.168.50.1
machine: q35
memory: 8192
meta: creation-qemu=7.1.0,ctime=1676917435
name: HDA-Prep
net0: virtio=3E:DD:BD:29:8D:D9,bridge=vmbr1,queues=4,tag=50
numa: 0
ostype: l26
rng0: source=/dev/urandom
scsi0: Proxmox:1113/vm-1113-disk-1.qcow2,cache=writethrough,discard=on,iothread=1,size=20G,ssd=1
scsi1: Proxmox:1113/vm-1113-cloudinit.qcow2,media=cdrom,size=4M
scsi2: Proxmox:1113/vm-1113-disk-2.qcow2,discard=on,iothread=1,size=30G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=712512e1-5f81-4ccb-b07f-724a8d3b4d47
sockets: 1
sshkeys: ssh-ed25519%20AAAAC3NzaC1lZDI1NTE5AAAAIOFLnUCnFyoONBwVMs1Gj4EqERx%2BPc81dyhF6IuF26WM%20proxvms%0A
tags: templates
vmgenid: a1a8c410-c9e3-4fff-8ca3-1c8e95fa8119
watchdog: model=i6300esb,action=reset
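
For anyone comparing configs between affected VMs, here is a minimal Python sketch that splits `qm config` output into fields and decodes the URL-encoded description. The helper names `parse_qm_config` and `parse_options` are made up for illustration, and the sample is a shortened excerpt of the config above, not a complete parser for every option type.

```python
from urllib.parse import unquote


def parse_qm_config(text):
    """Split `qm config` output into a {key: value} dict (hypothetical helper)."""
    config = {}
    for line in text.splitlines():
        # Split on the first ": " only, so volume names like "Proxmox:1113/..." survive.
        key, sep, value = line.partition(": ")
        if sep:
            config[key.strip()] = value.strip()
    return config


def parse_options(value):
    """Split a drive-style value into (volume, {option: value})."""
    first, *rest = value.split(",")
    options = dict(item.split("=", 1) for item in rest if "=" in item)
    return first, options


# Shortened excerpt of the config posted above.
sample = """bios: ovmf
efidisk0: Proxmox:1113/vm-1113-disk-0.qcow2,efitype=4m,pre-enrolled-keys=1,size=528K
description: Details%3A%0ADisabled UEFI SecureBoot%0ASnap removed"""

config = parse_qm_config(sample)
volume, efi_options = parse_options(config["efidisk0"])

print(config["bios"], volume, efi_options)
# The description is stored URL-encoded; unquote() restores the line breaks.
print(unquote(config["description"]))
```

Running the same parsing over the output of several VMs makes it easier to spot which firmware-related settings (`bios`, `machine`, `efidisk0`) the stuck VMs have in common.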

A couple of the VMs have a watchdog, but the majority do not.
Otherwise, they're all roughly similar.

Resetting the VM once or twice usually resolves it until the next time.
 
Nope, even with the above, I just had two VMs do exactly the same thing.

Anyone have any ideas on this?
 
I am seeing the same thing on a new server with a Raptor Lake CPU. I installed pve-kernel-6.2, but no luck. I do have a SATA controller passed through to the VM; not sure if that causes an issue. For me, the issue started only after moving to a 6.0+ kernel.
 
Hi,
can you try upgrading to the latest version (in particular pve-edk2-firmware=3.20230228-1) and see if the issue persists?
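
A quick way to check whether a host already has that firmware version is to parse the `pveversion -v` output. The following is only a sketch: the function names are invented here, and `version_key` compares the numeric parts of the version string, which is a simplification of Debian's real version ordering but is enough for these two date-style versions.

```python
import re


def package_versions(pveversion_output):
    """Map package name -> version string from `pveversion -v` text (hypothetical helper)."""
    versions = {}
    for line in pveversion_output.splitlines():
        name, sep, rest = line.partition(": ")
        if sep and rest.strip():
            # Keep only the version, dropping notes like "(running kernel: ...)".
            versions[name.strip()] = rest.split()[0]
    return versions


def version_key(version):
    """Simplified ordering: compare the numeric components only (an assumption)."""
    return tuple(int(n) for n in re.findall(r"\d+", version))


# Excerpt of the output posted in this thread.
sample = """proxmox-ve: 7.3-1 (running kernel: 5.15.85-1-pve)
pve-edk2-firmware: 3.20220526-1
pve-qemu-kvm: 7.2.0-5"""

installed = package_versions(sample)["pve-edk2-firmware"]
needs_upgrade = version_key(installed) < version_key("3.20230228-1")
print(installed, "needs upgrade:", needs_upgrade)
```

With the versions from the first post this reports that the installed firmware package is older than the suggested one. Note that a running VM keeps using the firmware it was started with, so the VM likely needs a full stop and start after the upgrade before the result means anything.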

I am seeing the same thing on a new server with a Raptor Lake CPU. I installed pve-kernel-6.2, but no luck. I do have a SATA controller passed through to the VM; not sure if that causes an issue. For me, the issue started only after moving to a 6.0+ kernel.
You could try to boot an older kernel to verify that it's actually a kernel issue in your case. What is the output of pveversion -v?
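
To see which older kernels are available to boot, the same `pveversion -v` output can be used. This is a rough sketch with an invented helper name, and the sample below is taken from the first post in this thread because no output from the Raptor Lake host has been posted yet.

```python
import re


def kernel_info(pveversion_output):
    """Return (running_kernel, installed_kernels) from `pveversion -v` text (hypothetical helper)."""
    running = None
    installed = []
    for line in pveversion_output.splitlines():
        match = re.search(r"running kernel: ([^)\s]+)", line)
        if match:
            running = match.group(1)
        # Only full kernel packages such as pve-kernel-5.15.85-1-pve, not the
        # pve-kernel-5.15 meta package or pve-kernel-helper.
        match = re.match(r"pve-kernel-(\d+\.\d+\.\d+-\d+-pve):", line)
        if match:
            installed.append(match.group(1))
    return running, installed


# Excerpt from the first post; a different host will list different kernels.
sample = """proxmox-ve: 7.3-1 (running kernel: 5.15.85-1-pve)
pve-kernel-helper: 7.3-5
pve-kernel-5.15: 7.3-2
pve-kernel-5.15.85-1-pve: 5.15.85-1
pve-kernel-5.15.83-1-pve: 5.15.83-1"""

running, installed = kernel_info(sample)
fallbacks = [kernel for kernel in installed if kernel != running]
print("running:", running, "other installed kernels:", fallbacks)
```

Any kernel in the fallback list can then be selected from the boot menu at startup. On recent Proxmox VE 7 hosts, `proxmox-boot-tool kernel pin` should also be able to select one persistently, but check the tool's help output on your own host first.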
 
