Q: Aug-2025 new problem - Win10/11 VM Boot Drama / BSOD / Unbootable

fortechitsolutions

Hi, this is a bit open-ended, but I wanted to put a post in the forum in case anyone else bumps into this.

I had my second Proxmox server this week give me a similar bit of unhappy drama, which is very unusual. My solution in both cases ended up being "nuke the VM, restore from last good backup, reboot normally, stiff upper lip smile and carry on", more or less. But it is not a great solution, since the underlying cause is a bit of a mystery and I'm a little worried I'll be revisited by the drama again, either on these hosts or elsewhere (i.e., I've got a number of other clients on similar Proxmox boxes with similar Windows VMs running on them).

Context-detail >
first box is running Proxmox 8.4.1 / OVH platform hardware / has been in service >1 year with smooth, solid operation
the VM in question here is a Windows 11 Pro instance used as a kind of term-server / app-server thing. Vanilla base OS as much as possible, mostly, although it does have "PrimoCache" disk<>RAM cache software installed, which was put in place ~5 months ago to boost performance a bit. That has been very smooth and solid (i.e., if it was going to blow up, it has had many months of opportunity to do so).

second box was running Proxmox 8.3.4 / OVH platform hardware / has been in service roughly 3+ years with smooth, solid operation. This one is doing a similar role, app server / term server thing, more or less. It is even more vanilla, has no extra bits and bobs like PrimoCache, sees very low use by the client site team, and is probably slated for retirement in the next few months anyhow. But anyhoo.

After the first hit of this drama I was "hmm"; the second now makes me go "um, urg" with greater caution/concern.

Looking at the Windows Update history on both VMs, I am slightly concerned that some update released by Microsoft on Tues. Aug.19 may be a factor. This is just my brain weaving a 'just so, maybe' story out of the timeline. Both VMs did have good Windows updates happen on Aug.12 without drama.

First server outage was reported on the morning of Tues. Aug.19, when people showed up for work; it was operating normally at end of day on Aug.18.
Second one was reported in a sad/outage state yesterday, Aug.21, but I think this was the first time anyone on this small team had attempted to use the server in ~1 week or more.

Basic symptoms I have observed in both cases:
FIRST one
-- host was not responding normally - typically people connect via RDP - it behaved as if server was 'offline'
-- getting console on proxmox, I could see the windows login screen OK, and I could move the mouse around and see the windows console mouse tracking my movement properly. But it ignored my attempt to login / Ctrl-Alt-Del to force login prompt. Not able to interact with the console.
-- kick the host with a graceful shutdown > no response. Note both VMs have the standard full bundle of VirtIO tools installed, which of course includes the VirtIO hard drive driver, the VirtIO NIC driver, the balloon memory driver, as well as the QEMU Guest Agent etc. The VirtIO ISO version on this one is 0.1.240, so not the latest; it has not been updated since I installed the VM (unless these drivers get updated inside Windows via Windows Update; I am not sure whether that is the case).
-- So, not able to get a graceful shut down, so do a non-graceful STOP on the VM, and it drops.
-- power on the VM > it tries to boot normally and quickly dumps to a BSOD with error "StopCode - Bad System Config Info" - is unable to boot
-- then reboots, does windows self-repair diagnostic attempt > fails > cannot boot > I am able to interact with advanced debug fun
-- in this one, an initial attempt to look at disk status in the VM with "diskpart", in a DOS-box cmd session, left it unclear whether Windows was seeing the VirtIO-attached drives properly, which is odd. An attempt to hop into the (normally C:) drive gave an error: "Device not ready"
-- booted the Linux SystemRescue CD, and from there was able to repair and mount the NTFS C: drive virtual disk volume and see the proper content of the main Windows C: volume; rebooted, and the Windows recovery environment still could not see the normally-C: drive properly.
-- powered off the VM, flipped the drives over from VirtIO to IDE, rebooted into recovery-drama-mode, took another look, and could now see the drives and content.
-- spent a while (30+ min) banging my head on rebuilding boot/BCD and rebuilding the EFI boot; it continued to tell me "detected windows instances: zero". No progress. Spinning wheels.
-- at this point just gave up on repair attempt - deleted the VM - restored from backup taken on the night of Aug.18 > once restore was done it booted up perfectly first try.
-- it has been online since morning of Wed.Aug.20 / so that is 2 days ago now / the host has been online, and smooth sailing so far, no sign of any more windows update sneaking in - last windows update shown is Aug.12 still
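
Since the VirtIO-versus-IDE flip mattered on this VM, it can help to record which bus each disk is on before and after such a change. What follows is only a minimal sketch, assuming the usual "key: value" layout of a Proxmox VM config file (/etc/pve/qemu-server/<vmid>.conf). The sample text, VM ID, storage name and sizes are hypothetical, and `disk_buses` is a made-up helper name, not part of any Proxmox tool.

```python
import re

# Hypothetical sample in the style of /etc/pve/qemu-server/<vmid>.conf.
# VM ID, storage name, MAC address and sizes are invented for illustration.
SAMPLE_CONF = """\
bios: ovmf
boot: order=virtio0;ide2
efidisk0: local-lvm:vm-101-disk-0,efitype=4m,size=4M
ide2: none,media=cdrom
net0: virtio=AA:BB:CC:DD:EE:FF,bridge=vmbr0
virtio0: local-lvm:vm-101-disk-1,size=120G
virtio1: local-lvm:vm-101-disk-2,size=500G

[pre-update]
virtio0: local-lvm:vm-101-disk-1,size=120G
"""

# Disk keys are a bus name followed by an index, e.g. "virtio0" or "ide1".
DISK_KEY = re.compile(r"^(ide|sata|scsi|virtio)(\d+):\s*(.*)$")


def disk_buses(conf_text):
    """Map each data-disk key (e.g. 'virtio0') to its bus type.

    Only the current config is read: parsing stops at the first
    '[snapshot]' section header. CD-ROM drives are skipped.
    """
    buses = {}
    for line in conf_text.splitlines():
        line = line.strip()
        if line.startswith("["):
            break  # snapshot sections follow; ignore them
        match = DISK_KEY.match(line)
        if not match:
            continue
        bus, index, value = match.groups()
        if "media=cdrom" in value:
            continue
        buses[f"{bus}{index}"] = bus
    return buses


print(disk_buses(SAMPLE_CONF))
```

Running the same function on a saved copy of the config from before and after the flip gives a quick record of what was changed, which is handy when the disks need to go back to VirtIO later.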

SECOND one
-- same basic initial problem report - cannot connect via remote desktop to the host - get open the proxmox console, can see login screen, mouse movement tracks properly, but no response to login on the console attempt
-- ignores the graceful shutdown request, hard stop to power down the VM, power up via start > dumps straight to an "attempting repairs" Windows boot screen, didn't even visibly give a BSOD message.
-- attempted to do initial fix in windows - 'advanced' mode after failed self fix - did not work (see snip below) - gave up at this point.
-- note this VM, unlike the first, I DID have ability to see my windows-normally-C: drive without having to fuss in VMhardwareConfig and flip from VirtIO to IDE - so that was different 'flow'. Anyhoo.
-- in this one I was basically lazy, in a hurry, and client was not fussed about how old a backup I went back to / since host wasn't used this week at all. And a busy day, no time to sink an hour or more on debug WindowsDrama, so I just deleted the borked VM and restored from backup - initially picked backup that was made the night of Aug.19
-- rebooted the VM once restore was done, and it failed to boot > Dumped directly to the "fix disk error windows boot up sad screen" again. arrgh.
-- dumped this no-good version, restored from backup version from night of Aug.18 > once that was done > booted normally and happy days the little monster is online now.
-- for this one, VirtIO drivers etc installed from ISO - version/named "VirtIO 0.1.215" - has not been updated manually by me since system install.
-- so far it 'is just running smoothly' but it hasn't been online all that long. (less than an hour)
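
The two restore attempts on this second VM effectively bracket when the damage landed: the backup from the night of Aug.18 boots, the one from the night of Aug.19 does not. A minimal sketch of that bracketing logic, in case more backups need testing on other hosts. The function name is made up, and the year 2025 is taken from the thread title.

```python
from datetime import date


def bracket_failure(restore_results):
    """Return (last_good, first_bad) backup dates from restore tests.

    restore_results: iterable of (backup_date, booted_ok) pairs.
    Either element is None if no matching backup was tested.
    """
    results = list(restore_results)
    good = [d for d, ok in results if ok]
    bad = [d for d, ok in results if not ok]
    last_good = max(good) if good else None
    # Only bad backups newer than the last good one narrow the window.
    later_bad = [d for d in bad if last_good is None or d > last_good]
    first_bad = min(later_bad) if later_bad else None
    return last_good, first_bad


# Restore tests on the second VM, in the order they were tried.
second_vm = [
    (date(2025, 8, 19), False),  # restored, dumped to boot repair again
    (date(2025, 8, 18), True),   # restored, booted normally
]

last_good, first_bad = bracket_failure(second_vm)
print(f"damage landed after the {last_good} backup and before the {first_bad} backup")
```

For this VM that narrows things to the roughly 24 hours between the two nightly backups, which fits the first server failing on the morning of Aug.19.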

Anyhow. I am kind-of-sort-of worried there is a fresh (Aug.19.2025?) Windows update that might bork VirtIO-based Windows VMs, possibly only if the VirtIO driver is not the latest-greatest. Or something like that? I am not really certain. But I figure if I put a note in the forum, there is a chance it could be useful to anyone else seeing similar sad behaviour on Windows VMs: both to confirm I am not the only person with a sad weird VM problem today, and to help track down the smoking gun over the next while.
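
To keep the guesswork honest, it may help to line up what the two VMs share and where they differ. This is a rough sketch using only details reported in this post; the field names are made up, and anything not stated here (such as the exact Windows build of the second VM) is left out rather than guessed.

```python
# Facts as reported in this post; field names are invented for this sketch.
first_vm = {
    "proxmox": "8.4.1",
    "hardware": "OVH",
    "virtio_iso": "0.1.240",
    "virtio_updated_since_install": False,
    "primocache": True,
    "needed_ide_to_see_disk": True,
    "last_good_backup": "Aug 18",
}
second_vm = {
    "proxmox": "8.3.4",
    "hardware": "OVH",
    "virtio_iso": "0.1.215",
    "virtio_updated_since_install": False,
    "primocache": False,
    "needed_ide_to_see_disk": False,
    "last_good_backup": "Aug 18",
}


def compare(a, b):
    """Split two fact sheets into shared values and differing values."""
    keys = sorted(set(a) | set(b))
    shared = {k: a[k] for k in keys if k in a and k in b and a[k] == b[k]}
    differing = {k: (a.get(k), b.get(k)) for k in keys if k not in shared}
    return shared, differing


shared, differing = compare(first_vm, second_vm)
print("shared:   ", shared)
print("differing:", differing)
```

The shared column is where a common cause would have to live. Things that differ (Proxmox version, PrimoCache, the exact VirtIO ISO version) look less likely on their own, though two VMs is a very small sample.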

Obviously in this case I'm rather glad (understatement) to have PBS backups active and working smoothly, so it is easy enough to roll back the borked VM to a recent known-good backup. But it is a bit of a drama: outage, delay, disruption, and a source of some concern. The first, bigger term server is used quite heavily by that one client of mine, and I am hoping not to have more repeated outages like this going forward 'just because'.

Sigh. Great fun.

Happy Friday.

-Tim

----second host - windows manual attempt repair boot brief failed drama snip ref-----
 

Attachments

  • boot repair manual attempt note SECOND server.png