Windows VMs stuck on boot after Proxmox Upgrade to 7.0

Jan 7, 2022
10
4
3

dea

Well-Known Member
Feb 6, 2009
159
48
48
Could be, but since we have never seen this issue on our 6.4 cluster and have only seen it on our 7.1 cluster, I'm not so sure.
No, I'm 99% sure that can't be the problem. My reasons are in my previous posts. I don't think highly of Windows (and never have), nor of Microsoft's policies, but in this case the correlation with low-level changes is evident. With the latest Proxmox 6.4, Windows 2012, 2016 and 2019 run and update perfectly and never miss a beat. With Proxmox 7.x there are always problems, and they are perfectly repeatable. I may be wrong, but there must be a combination of low-level factors: kernel 5.13 and 5.15 and/or Qemu 6.x, along with Windows (which contributes its share).
 
Mar 11, 2019
35
5
13
52
Canada
www.weehooey.com
A real annoying problem with Windows.

Microsoft have this problem on their own systems too. So either they run Proxmox VE on Azure (I assume not, but who knows ...) or the issue is probably not related to the Proxmox stack.

Read more on https://docs.microsoft.com/en-us/tr...achines/troubleshoot-vm-boot-configure-update
@tom No one that we are aware of has reported "Getting Windows Ready". That is a different problem.

Has anyone reviewed the information we provided that was requested by @Moayad ?
 
www.weehooey.com
@Moayad Here is the same data from another hung VM. It is the "classic" black screen, Windows logo and endlessly spinning balls.

I powered off (Stop) the VM, powered it back on, and it started normally.
 

Attachments

  • 113.png (1.5 KB)
  • gdb_output_113.txt (2.6 KB)
  • strace_output_113.tar.gz (61.9 KB)

Moayad

Proxmox Staff Member
Staff member
Jan 2, 2020
1,675
131
68
29
Vienna
shop.maurer-it.com
Hi,

Thank you for the output!
I have already reviewed the provided information and am still troubleshooting.
 
May 26, 2022
2
3
3
We can confirm we are having these same issues on 4 of our 6 clusters, all running Proxmox 7 with kernels 5.15.30 or newer. The clusters running Proxmox 6 do not have this issue. The VMs lock up at varying stages of preboot (black screen, guest init, Proxmox spinning wheel). Since it's inconsistent, we haven't been able to reproduce it reliably, but for us reboots seem to lock up only after a Proxmox backup (and then only sometimes), and more likely if you reboot a bunch of VMs around the same time (IO/CPU load?). Whether the virtio drivers or the qemu agent are installed has made no difference for us, fwiw.
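One way to test the backup correlation described above is to check, for each hung reboot, whether a backup ran between the VM's last full power-on and that reboot. Below is a minimal Python sketch of that check. The function name `backup_since_cold_start` is hypothetical, the timestamps are made up for illustration, and collecting the real times from your own logs is left out.

```python
from datetime import datetime


def backup_since_cold_start(cold_start, backups, reboot):
    """Return True if any backup happened between the VM's last full
    power-on (cold_start) and the reboot being examined."""
    return any(cold_start <= b <= reboot for b in backups)


# Made-up timestamps, for illustration only (not taken from any real log).
cold_start = datetime(2022, 5, 1, 8, 0)
backups = [datetime(2022, 5, 20, 1, 30)]
reboot = datetime(2022, 5, 25, 3, 0)

print(backup_since_cold_start(cold_start, backups, reboot))
```

If every hung reboot returns True and clean reboots are mostly False, that would support the backup theory. A mix would point elsewhere.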
 

dea

... for this reason I am reasonably convinced that the problem is Qemu 6.x.
@Moayad, would it be possible to build a test-only package with Qemu 7.0, or with a 5.x version? Changing one variable at a time should help identify the problem (if debugging fails to identify it). IMHO.
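The one-variable-at-a-time idea can be sketched in a few lines of Python. This is only an illustration: the helper `one_factor_variants` and the dictionary layout are hypothetical, and the version strings are labels taken from this thread, not package names.

```python
def one_factor_variants(baseline, alternatives):
    """Build test configurations that differ from the baseline
    in exactly one factor."""
    variants = []
    for factor, values in alternatives.items():
        for value in values:
            if value == baseline[factor]:
                continue  # identical to the baseline, nothing to test
            variant = dict(baseline)  # copy so the baseline stays untouched
            variant[factor] = value
            variants.append(variant)
    return variants


# Labels only; the baseline is assumed to be a setup that shows the hang.
BASELINE = {"kernel": "5.15", "qemu": "6.x"}
ALTERNATIVES = {"kernel": ["5.13"], "qemu": ["5.x", "7.0"]}

for config in one_factor_variants(BASELINE, ALTERNATIVES):
    print(config)
```

Each printed configuration changes a single component, so if the hang disappears in one of them, that component is the likely suspect.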
 
www.weehooey.com
@andrewrf for all the VMs that have hung, are you seeing the following:
  • Only Windows VMs, Server 2012 or later
  • Windows does not report anything in the boot log (ntbtlog.txt), meaning it had not got far enough in booting to start that logging
  • Only hangs on a reboot, and only if the VM has been running for a while (i.e. not on a full power off and power back on)
  • To resolve it, you have to hard power off the VM and power it back on. VMs always boot up fine after the hard power off.
  • You are using one of these storage types for the VM: Ceph, ZFS or NFS
  • All the VMs were built before the upgrade and carried over from 6.x to 7.x
Have you had any of your VMs do this twice? We have not had the problem more than once with any one VM.

You may also want to get on the CC for the bug report: https://bugzilla.proxmox.com/show_bug.cgi?id=3933
 
May 26, 2022
2
3
3
For all the VMs that have hung, are you seeing the following:
  • Only Windows VMs, Server 2012 or later
    • YES, 2016 or later, and Win10 or later
  • Windows does not report anything in the boot log (ntbtlog.txt), meaning it had not got far enough in booting to start that logging
    • UNKNOWN (I think that file only gets created if it tries to boot to safe mode, which ours don't)
  • Only hangs on a reboot, and only if the VM has been running for a while (i.e. not on a full power off and power back on)
    • YES
  • To resolve it, you have to hard power off the VM and power it back on. VMs always boot up fine after the hard power off.
    • YES
  • You are using one of these storage types for the VM: Ceph, ZFS or NFS
    • YES, Ceph
  • All the VMs were built before the upgrade and carried over from 6.x to 7.x
    • NO, we have this issue on VMs that were built in 7.x as well, some of them under 2 months old
  • Have you had any of your VMs do this twice? We have not had the problem more than once with any one VM.
    • YES, we have had this issue on the same VM more than once
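To see at a glance where these answers depart from the pattern in the checklist, the two can be compared item by item. Here is a minimal Python sketch. The dictionary keys and the `compare` helper are hypothetical labels invented for this illustration, and the values are transcribed from the questions and answers above.

```python
# Pattern described in the checklist (keys are made-up labels).
EXPECTED = {
    "windows_only": "yes",
    "boot_log_empty": "yes",
    "hangs_only_on_reboot_after_uptime": "yes",
    "hard_power_off_fixes_it": "yes",
    "storage_ceph_zfs_or_nfs": "yes",
    "built_before_upgrade_to_7x": "yes",
    "same_vm_hung_more_than_once": "no",
}

# Answers given in the reply above.
REPORTED = {
    "windows_only": "yes",
    "boot_log_empty": "unknown",
    "hangs_only_on_reboot_after_uptime": "yes",
    "hard_power_off_fixes_it": "yes",
    "storage_ceph_zfs_or_nfs": "yes",
    "built_before_upgrade_to_7x": "no",
    "same_vm_hung_more_than_once": "yes",
}


def compare(expected, reported):
    """Split checklist items into clear mismatches and unknowns."""
    mismatches, unknowns = [], []
    for item, want in expected.items():
        got = reported.get(item, "unknown")
        if got == "unknown":
            unknowns.append(item)
        elif got != want:
            mismatches.append(item)
    return mismatches, unknowns


mismatches, unknowns = compare(EXPECTED, REPORTED)
print("differs:", mismatches)
print("unknown:", unknowns)
```

The two items that differ (VMs built on 7.x also hang, and the same VM can hang more than once) are the ones that narrow the pattern down.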
 

dea

Yes, I can confirm exactly every step!
 

itNGO

Active Member
Jun 12, 2020
386
78
28
43
Germany
it-ngo.com
We also sometimes see this on Ubuntu 20.04 VMs when they do a scheduled reboot for updates.
 
www.weehooey.com
@itNGO It is interesting that you are seeing it on Ubuntu 20.04. Not many people are seeing this. Looking back over previous posts, you just get a black screen. What is happening with your VMs seems to be slightly different. I wonder if there is a clue in that.

Would you share some details about the VMs that hang? Perhaps anything that you do that others may not be doing? Anything that might contribute to the different behaviour you are seeing.
 

Huch

Active Member
Mar 28, 2021
153
42
28
Germany
I'm experiencing more than boot problems on existing VMs. Even setting up new VMs is painfully slow or doesn't work.

On an AMD Threadripper Pro system with 256GB ECC RAM and 6x Enterprise NVMe in ZFS RAID1+0, setting up a simple Debian 11 VM with the netinstaller takes 20-30 minutes. Windows 11 and Windows 10 get stuck on the initial UEFI boot screen after the "Press any key to boot from DVD..." prompt has passed.
 
www.weehooey.com
@Huch this post is about VMs getting stuck when rebooting (mostly Windows). No one else has reported slowness related to this issue. You might have a different issue or two separate problems.
 

itNGO

It's just the MetricVM with InfluxDB for Proxmox-Metric-Collection.
It auto-updates security patches and then reboots once a week, and guess what... it hangs once a week....
1653671693262.png
 

itNGO

Please open a new thread... I believe this is not the same issue as the one discussed here....
 
