VM Reboot Issue - VM stuck on Proxmox start boot option screen

complexplaster27 · Sep 25, 2024

Hey All,

For a little while now I've had a problem where I reboot a virtual machine, it shuts down fully, comes back, and then gets stuck about 9/10ths of the way through the "Start boot option" screen. Often, if a Windows update is scheduled for 12 am, I'll get in at 8 am and find it still sitting there like this.

Usually a forceful stop or reset will get it working again, but I have no clue why this happens.

Another time, a colleague rebooted their alarm server, which I host on a Proxmox Ceph cluster, at 11 am. An hour later they told me the server had never come back up. When I logged into the cluster I saw this same screen; I force-reset the VM and it came back alive.

Here's the output of pveversion, followed by the Ceph status:

Code:

pveversion -v
proxmox-ve: 8.2.0 (running kernel: 6.8.12-2-pve)
pve-manager: 8.2.6 (running version: 8.2.6/414ce79a1d42d6bc)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.12-2
proxmox-kernel-6.8.12-2-pve-signed: 6.8.12-2
ceph: 18.2.2-pve1
ceph-fuse: 18.2.2-pve1
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx9
intel-microcode: 3.20231114.1~deb12u1
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.4
libpve-access-control: 8.1.4
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.7
libpve-cluster-perl: 8.0.7
libpve-common-perl: 8.2.2
libpve-guest-common-perl: 5.1.4
libpve-http-server-perl: 5.1.0
libpve-network-perl: 0.9.8
libpve-rs-perl: 0.8.10
libpve-storage-perl: 8.2.4
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.4.0-4
proxmox-backup-client: 3.2.7-1
proxmox-backup-file-restore: 3.2.7-1
proxmox-firewall: 0.5.0
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.7
proxmox-widget-toolkit: 4.2.3
pve-cluster: 8.0.7
pve-container: 5.2.0
pve-docs: 8.2.3
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.2
pve-firewall: 5.0.7
pve-firmware: 3.13-2
pve-ha-manager: 4.0.5
pve-i18n: 3.2.3
pve-qemu-kvm: 9.0.2-3
pve-xtermjs: 5.3.0-3
qemu-server: 8.2.4
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.6-pve1

Code:

ceph status
  cluster:
    id:     5fd31f6c-3f31-4fe2-bcc5-1f73aa608f8f
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum A,B,C (age 44h)
    mgr: A(active, since 44h), standbys: B, C
    osd: 9 osds: 9 up (since 44h), 9 in (since 6w)
 
  data:
    pools:   2 pools, 129 pgs
    objects: 335.86k objects, 1.3 TiB
    usage:   3.7 TiB used, 4.2 TiB / 7.9 TiB avail
    pgs:     129 active+clean
 
  io:
    client:   304 KiB/s rd, 432 KiB/s wr, 33 op/s rd, 48 op/s wr

And lastly the specific VM details:

Code:

cat /etc/pve/qemu-server/103.conf
agent: 1
bios: ovmf
boot: order=scsi0;ide0;net0;scsi1
cores: 4
cpu: host
efidisk0: cluster-storage:vm-103-disk-2,efitype=4m,pre-enrolled-keys=1,size=528K
machine: pc-q35-8.1
memory: 16384
meta: creation-qemu=8.1.2,ctime=1704766310
name: 103
net0: virtio=BC:24:11:8C:5A:75,bridge=vmbr0,firewall=1
numa: 1
onboot: 1
ostype: win10
scsi0: cluster-storage:vm-103-disk-0,cache=writeback,discard=on,iothread=1,serial='C-drive',size=60G,ssd=1
scsi1: cluster-storage:vm-103-disk-4,cache=writeback,discard=on,iothread=1,serial='D-drive',size=150G,ssd=1
scsi2: cluster-storage:vm-103-disk-3,cache=writeback,discard=on,iothread=1,serial='E-drive',size=150G,ssd=1
scsi3: cluster-storage:vm-103-disk-5,cache=writeback,discard=on,iothread=1,serial='F-drive',size=210G,ssd=1
scsi4: cluster-storage:vm-103-disk-6,cache=writeback,discard=on,iothread=1,serial='G-drive',size=140G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=66c36bf2-6ee8-43d5-b62c-a34c93befbc8
sockets: 2
vmgenid: 215fc570-a1b1-402f-8c93-09eb20805feb

I've removed any information that is considered sensitive from these. I welcome any suggestions and potential troubleshooting steps.

complexplaster27 · Sep 25, 2024

Another thing I can add: sometimes when I reset or stop the VM and start it back up, it will just sit on this screen.

After around 2 or so minutes it comes right and boots correctly.

Darkbotic · Sep 30, 2024

This is also happening to me on my Windows VMs.
It looks like it only happens on VMs that use EFI/OVMF boot.
Any ideas?

itNGO · Sep 30, 2024

Check whether it helps to use an older machine type...

complexplaster27 · Oct 1, 2024

Darkbotic said:
This is also happening to me on my Windows VMs.
It looks like it only happens on VMs that use EFI/OVMF boot.
Any ideas?

Yeah, I agree. None of my BIOS-boot machines have this issue; it only seems to affect EFI-based VMs.
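
To see which VMs on a node fall into the affected group, here is a minimal Python sketch that scans the qemu-server configs for `bios: ovmf`. It assumes the configs live in /etc/pve/qemu-server (as in the `cat` commands in this thread), that a missing `bios` key means the SeaBIOS default, and that a line starting with `[` opens a snapshot section whose settings should be ignored. The function names are made up for illustration.

```python
import glob
import os


def uses_ovmf(conf_text: str) -> bool:
    """Return True if the main section of a VM config sets `bios: ovmf`."""
    for line in conf_text.splitlines():
        line = line.strip()
        # Assumption: "[name]" starts a snapshot/pending section; stop there.
        if line.startswith("["):
            break
        # Skip description comments and anything that is not "key: value".
        if line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        if key.strip() == "bios":
            return value.strip() == "ovmf"
    # No bios key found: assumed to mean the SeaBIOS default.
    return False


def ovmf_vmids(conf_dir: str = "/etc/pve/qemu-server") -> list[str]:
    """List the VMIDs in conf_dir whose config uses OVMF."""
    vmids = []
    for path in sorted(glob.glob(os.path.join(conf_dir, "*.conf"))):
        with open(path, encoding="utf-8") as handle:
            if uses_ovmf(handle.read()):
                vmids.append(os.path.basename(path)[: -len(".conf")])
    return vmids
```

Calling `ovmf_vmids()` on a node only covers the VMs that currently live on that node, so on a cluster it would need to be run per node.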

itNGO said:
Check whether it helps to use an older machine type...

I have a fair number of virtual machines whose machine versions range from 7.0 to the latest, 9.0, depending on when the VM was installed. I haven't noticed any difference between them; all of them have gotten stuck at the boot screen at some point.

The thing is, I don't know how to replicate it. It just happens sometimes and I deal with it on a case-by-case basis. It would be nice to know how to replicate the issue.

Darkbotic · Oct 1, 2024

complexplaster27 said:
The thing is, I don't know how to replicate it. It just happens sometimes and I deal with it on a case-by-case basis. It would be nice to know how to replicate the issue.

Same here. This is frustrating because you don't know when it will happen, and it tends to happen when you least want it to, especially with Windows Server. I will try to find a way to monitor the VM and get a notification when it doesn't boot properly. I think ping doesn't work while it's stuck like that, so that might be a good way to monitor it and maybe even automate a restart when it happens.
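
A rough Python sketch of that monitoring idea, to be run on the Proxmox host: while `qm status` reports the VM as running, ping the guest, and after enough consecutive misses issue `qm reset`. The VMID, the guest address, the 60-second interval and the threshold of 10 misses are all placeholder assumptions, and it assumes the guest answers ICMP at all (Windows Firewall may block it unless allowed). The threshold should sit well above a normal boot, since a stuck VM sometimes recovers on its own after a couple of minutes.

```python
import subprocess
import time


def vm_running(vmid: int) -> bool:
    """True if `qm status <vmid>` reports the VM as running."""
    result = subprocess.run(
        ["qm", "status", str(vmid)], capture_output=True, text=True
    )
    return result.returncode == 0 and "running" in result.stdout


def host_pings(address: str) -> bool:
    """True if one ICMP echo to `address` is answered within 2 seconds."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", address], capture_output=True
    )
    return result.returncode == 0


def step(failures: int, alive: bool, threshold: int) -> tuple[int, bool]:
    """Return (new failure count, whether to reset the VM now)."""
    if alive:
        return 0, False
    failures += 1
    if failures >= threshold:
        return 0, True  # start counting again after a reset
    return failures, False


def watch(vmid: int, address: str, interval_s: int = 60, threshold: int = 10) -> None:
    """Poll forever; reset the VM after `threshold` consecutive missed pings."""
    failures = 0
    while True:
        # A stopped VM is not "stuck", so it is treated as healthy here.
        alive = host_pings(address) if vm_running(vmid) else True
        failures, reset = step(failures, alive, threshold)
        if reset:
            print(f"VM {vmid} unreachable for {threshold} checks, resetting")
            subprocess.run(["qm", "reset", str(vmid)], check=False)
        time.sleep(interval_s)
```

Something like `watch(103, "192.0.2.10")` would start the loop, with the address being a stand-in for the guest's real IP. The `print` could be swapped for whatever notification method is in use.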

Darkbotic said:
This is also happening to me on my Windows VMs.
It looks like it only happens on VMs that use EFI/OVMF boot.
Any ideas?

Just like @complexplaster27 said, my VMs have different versions of pc-q35 (7 to 9), and it happens randomly to all of them every once in a while.

complexplaster27 · Oct 17, 2024

Bumping this thread. The issue is still happening, even on freshly created VMs after a reboot. I would love some further guidance on how to troubleshoot this.

itNGO · Oct 17, 2024

Try to delete and recreate your EFI-Partition... this might help in case there is an issue with the "secure-boot" config in Windows....

Darkbotic · Oct 17, 2024

itNGO said:
Try to delete and recreate your EFI-Partition... this might help in case there is an issue with the "secure-boot" config in Windows....

That's not it. It's one of the things I have already tried, and it didn't work.

carles89 · Oct 18, 2024

Same here, fortunately it is happening on a virtualized test cluster.

I restored a Windows 2019 VM from a backup to local-zfs and it booted the first time. I then moved it to Ceph storage, and from the second boot onwards it got stuck at "Start boot option...", with the VM's CPU at 100%.

Just to make sure, I restored it again to local-zfs, booted it (first boot OK), stopped it, booted it again, and it got stuck at "Start boot option..." again.

That's a weird issue...

Code:

root@pve02:~# cat /etc/pve/qemu-server/104.conf 
#Windows 2019
agent: 1
bios: ovmf
boot: order=scsi0;ide0;net0
cores: 2
cpu: host
efidisk0: vmstorage-local:vm-104-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M
ide0: none,media=cdrom
machine: pc-q35-8.1
memory: 2048
meta: creation-qemu=8.1.5,ctime=1711640555
name: Windows2022-b
net0: virtio=BC:24:11:3A:69:A5,bridge=vmbr0,firewall=1
numa: 0
ostype: win11
scsi0: vmstorage-local:vm-104-disk-1,discard=on,iothread=1,size=32G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=851361f8-c636-4aac-ba08-07bb506809b3
sockets: 1
tpmstate0: vmstorage-local:vm-104-disk-2,size=4M,version=v2.0
vmgenid: cfb90c14-7df0-4c33-9f30-9fc8e6cd325c

Code:

proxmox-ve: 8.2.0 (running kernel: 6.8.4-2-pve)
pve-manager: 8.2.2 (running version: 8.2.2/9355359cd7afbae4)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.4-2
proxmox-kernel-6.8.4-2-pve-signed: 6.8.4-2
proxmox-kernel-6.5.13-5-pve-signed: 6.5.13-5
proxmox-kernel-6.5: 6.5.13-5
proxmox-kernel-6.5.13-3-pve-signed: 6.5.13-3
proxmox-kernel-6.5.11-8-pve-signed: 6.5.11-8
ceph: 18.2.2-pve1
ceph-fuse: 18.2.2-pve1
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx8
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.1.4
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.6
libpve-cluster-perl: 8.0.6
libpve-common-perl: 8.2.1
libpve-guest-common-perl: 5.1.1
libpve-http-server-perl: 5.1.0
libpve-network-perl: 0.9.8
libpve-rs-perl: 0.8.8
libpve-storage-perl: 8.2.1
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.4.0-3
openvswitch-switch: 3.1.0-2+deb12u1
proxmox-backup-client: 3.2.2-1
proxmox-backup-file-restore: 3.2.2-1
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.6
proxmox-widget-toolkit: 4.2.3
pve-cluster: 8.0.6
pve-container: 5.0.11
pve-docs: 8.2.2
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.0
pve-firewall: 5.0.6
pve-firmware: 3.11-1
pve-ha-manager: 4.0.4
pve-i18n: 3.2.2
pve-qemu-kvm: 8.1.5-6
pve-xtermjs: 5.3.0-3
qemu-server: 8.2.1
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.3-pve2

carles89 · Oct 18, 2024

As @itNGO suggested, I recreated the EFI disk and it worked. I also tried recreating the TPM state, but recreating the EFI disk alone was enough.
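
For anyone who wants to repeat this across several VMs, here is a small Python sketch that builds the two `qm set` commands from a VM's existing `efidisk0` line: one to detach the old EFI disk and one to allocate a fresh one on the same storage with the same `efitype` and `pre-enrolled-keys` options. This is an assumption about how to script the step, not necessarily how it was done above; it only builds the commands and does not run them. As far as I know, the old volume stays behind as an unused disk, and a fresh EFI disk starts with empty UEFI variables, so any custom boot entries are lost.

```python
import shlex


def efi_recreate_commands(vmid: int, efidisk0: str) -> list[list[str]]:
    """Build `qm set` commands that replace the EFI disk of a stopped VM.

    `efidisk0` is the value of the efidisk0 line from the VM config, e.g.
    "vmstorage-local:vm-104-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M".
    """
    volume, *options = efidisk0.split(",")
    storage = volume.split(":", 1)[0]
    # Carry over only the firmware-related options; size is set on allocation.
    keep = [
        opt for opt in options
        if opt.split("=", 1)[0] in ("efitype", "pre-enrolled-keys")
    ]
    new_spec = ",".join([f"{storage}:1", *keep])
    return [
        ["qm", "set", str(vmid), "--delete", "efidisk0"],
        ["qm", "set", str(vmid), "--efidisk0", new_spec],
    ]


if __name__ == "__main__":
    spec = "vmstorage-local:vm-104-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M"
    for command in efi_recreate_commands(104, spec):
        print(shlex.join(command))
```

Printing the commands first makes it easy to review them before running anything against a production VM.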

The only thing that bothers me is why the VM boots one time after restore from backup and then refuses to boot...

complexplaster27 · Oct 25, 2024

carles89 said:
As @itNGO suggested, I recreated the EFI disk and it worked. I also tried recreating the TPM state, but recreating the EFI disk alone was enough.

The only thing that bothers me is why the VM boots one time after restore from backup and then refuses to boot...

I wonder why this would work. Would that mean I need to re-create the EFI disk for all the virtual machines across 4 different clusters? It seems a bit strange to have to do that. @Max Carrara, could you advise whether this is the best way to work around this?

Darkbotic · Oct 25, 2024

I recreated the EFI disk on all my VMs and it looked like that fixed it, but about a week later it happened again.
The thing is that since it happens rarely, you think it's fixed... until it happens again.

tgx · Nov 1, 2024

Also seeing this issue. Following.
I will say this: I noticed it happen after I was fussing with the ostype. By default, when I restored this VM, the type was set to 'other' even though it is Windows 10. So I shut the VM down, switched it to win10 and tried to boot it. After that it was a total loss. It just hangs at the same spot as the original poster's. My VM details:

Code:

agent: 1
audio0: device=ich9-intel-hda,driver=none
bios: ovmf
boot: order=sata0;net0;ide2
cores: 2
efidisk0: datastore1:104/vm-104-disk-0.raw,efitype=4m,pre-enrolled-keys=1,size=528K
ide2: Mirror:iso/virtio-win-0.1.262.iso,media=cdrom,size=708140K
localtime: 1
machine: pc,viommu=virtio
memory: 12288
meta: creation-qemu=9.0.2,ctime=1730478599
name: robot
net0: virtio=BC:24:11:19:2E:81,bridge=vmbr0,firewall=1
sata0: datastore1:104/vm-104-disk-1.raw,size=60G
smbios1: uuid=be311465-e553-43ae-bd9e-ba99bca81130
sockets: 2
vga: std
vmgenid: 6e2b63ab-7db6-49f8-9090-942cafff3f0a

tgx · Nov 4, 2024

Just thought I would drop a note to say that restoring from a backup taken before I changed the OS type from 'other' to 'Win 10' has worked, and the VM is now usable again. I have no idea what that change did to break the VM, but even switching it back did not correct the boot problem.

complexplaster27 · Nov 4, 2024

It would be nice for an official Proxmox staff member to chime in. It doesn't seem like we have a solution for this problem, and it may potentially affect any user.
