[HELP] Windows Server 2016 VM (migrated from VMware) - CRITICAL_PROCESS_DIED after restore from PBS and Veeam

Piertonio

Mar 14, 2024
Background / VM history:


The affected VM has a complex migration history:


  1. Originally a physical Windows Server 2016 server (HP ML350 Gen10, Xeon Silver) virtualized using VMware Converter onto a Dell PowerEdge with E5-2660 V4 CPUs
  2. Subsequently migrated to Proxmox VE using the built-in Proxmox migration tool (import from ESXi)
  3. Running on a 3-node Proxmox cluster with shared iSCSI storage (LVM over iSCSI on a NAS)

Environment:


  • Proxmox VE: 8.4.17
  • PBS: 4.1.6
  • Veeam Backup & Replication: v13
  • VM role: Windows Server 2016 Domain Controller + File Server
  • VM disk: ~559 GB LVM volume on iSCSI NAS
  • Backup targets: PBS (local ZFS datastore) + Veeam (SMB NAS target)

Problem:


Following a crash, we attempted to restore the VM from both PBS and Veeam backups. All restore attempts (multiple PBS snapshots from different dates, plus Veeam backup) result in the same behavior:


  1. VM boots, CHKDSK runs automatically and fixes some filesystem errors
  2. On reboot after CHKDSK, Windows crashes with CRITICAL_PROCESS_DIED BSOD
  3. System uptime at crash: approximately 4 seconds

Diagnostic steps performed:


We analyzed the memory dump with WinDbg. The crash analysis shows:




CRITICAL_PROCESS_DIED
CriticalProcessDied.Process: wininit.exe
FAILURE_BUCKET_ID: 0xEF_wininit.exe_BUGCHECK_CRITICAL_PROCESS_489dd740_ntdll!NtTerminateProcess

wininit.exe is calling NtTerminateProcess on itself, so this is not a driver crash but an intentional exit. The boot log (ntbtlog.txt) shows only 5 entries before the crash:
ntoskrnl.exe
hal.dll
mcupdate_GenuineIntel.dll
werkernel.sys

The crash occurs before any third-party drivers are loaded.
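
To double-check the "no third-party drivers loaded" reading, the boot log can be summarized with a few lines of Python on another machine. This is only a rough sketch: the line prefixes are an assumption (newer Windows builds are expected to write BOOTLOG_LOADED / BOOTLOG_NOT_LOADED, older ones "Loaded driver" / "Did not load driver"), the file is assumed to be UTF-16, and the function names are made up for illustration.

```python
import re

# Assumed line prefixes; adjust if your ntbtlog.txt looks different.
LOADED = re.compile(r"^(?:BOOTLOG_LOADED|Loaded driver)\s+(.+)$")
NOT_LOADED = re.compile(r"^(?:BOOTLOG_NOT_LOADED|Did not load driver)\s+(.+)$")


def summarize_boot_log(text):
    """Return (loaded, not_loaded) lists of driver paths found in the log text."""
    loaded, not_loaded = [], []
    for raw in text.splitlines():
        line = raw.strip().lstrip("\ufeff")  # tolerate a stray BOM
        m = LOADED.match(line)
        if m:
            loaded.append(m.group(1).strip())
            continue
        m = NOT_LOADED.match(line)
        if m:
            not_loaded.append(m.group(1).strip())
    return loaded, not_loaded


def read_boot_log(path):
    # Assumption: the log is UTF-16 encoded; the path is supplied by the caller.
    with open(path, encoding="utf-16") as fh:
        return fh.read()
```

Calling summarize_boot_log(read_boot_log(path)) on a copy of the log pulled from the offline disk gives both lists, so it is easy to see whether anything outside the kernel, HAL and microcode update was even attempted.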


Attempted fixes (all unsuccessful):


  • Disabled all VMware/ESXi residual services and drivers via offline registry editing (vmxnet3ndis6, vmbus, vmci, vm3dmp, vmmouse, VMMemCtl, VMSP, VMSVSP, etc.)
  • Changed disk controller from VirtIO SCSI to SATA — resulted in INACCESSIBLE_BOOT_DEVICE instead
  • Enabled Safe Mode boot via bcdedit — same CRITICAL_PROCESS_DIED in Safe Mode
  • Changed CPU type from host to kvm64 — no change
  • Attempted to disable Hyper-V enlightenments via -cpu args — no change
  • Disabled mcupdate_GenuineIntel service via offline registry — no change
  • Attempted DISM /RestoreHealth from Windows Server 2016 ISO — failed (version mismatch, error 0x800f081f)
  • Attempted SFC /scannow offline — found corrupted files but could not repair them

Key observation:


The dump shows Hyper-V enlightenments are active (SynicAvailable=1, ApicEnlightened=1), which is inherited from the VMware/Hyper-V migration history. However disabling these did not resolve the issue.


The fact that the crash occurs after only 5 kernel files load, even in Safe Mode, suggests corruption at the HAL or kernel level rather than a driver issue.


Question:


Has anyone experienced this specific scenario with a VM that went through a physical → VMware → Proxmox migration chain? Is there a known fix for wininit.exe terminating this early in the boot process, or is a clean reinstall with AD restore (from ntds.dit) the only viable path?


Any help appreciated.
 
I am no Windows expert, but it looks like the content of the backups themselves is corrupted. Try some much older backups to check whether any of them predates the corruption.
Aside from that, I would try one of these options:
1) Another repair attempt
Attempted DISM /RestoreHealth from Windows Server 2016 ISO — failed (version mismatch, error 0x800f081f)
You might need to slipstream the current patches into the ISO, or attach the disk to a VM running the same Windows version, to make this work.

2) Mount the disk on another VM and pull the file server data off it. Hopefully this was not the only DC in your AD, so you can drop this VM, create a new one, and promote it as a new DC from there.
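
For option 1, error 0x800f081f generally means DISM could not find usable source files, which fits an ISO at a different patch level than the installed system. Below is a small sketch that assembles the offline DISM and SFC command lines when the broken disk is attached to a helper VM. The drive letters, the WIM path and the image index are purely hypothetical placeholders (the right index can be looked up with dism /Get-WimInfo /WimFile:...), and the helper function name is made up.

```python
def offline_repair_commands(offline_root, wim_path, wim_index):
    """Build DISM and SFC argument lists for servicing an offline Windows volume."""
    # Normalize the volume root so it always ends with exactly one backslash.
    root = offline_root.rstrip("\\") + "\\"
    dism = [
        "dism",
        f"/Image:{root}",                       # offline Windows volume
        "/Cleanup-Image",
        "/RestoreHealth",
        f"/Source:wim:{wim_path}:{wim_index}",  # matching install.wim and edition index
        "/LimitAccess",                         # do not fall back to Windows Update
    ]
    sfc = [
        "sfc",
        "/scannow",
        f"/offbootdir={root}",
        f"/offwindir={root}Windows",
    ]
    return dism, sfc


if __name__ == "__main__":
    # Hypothetical layout: broken disk mounted as D:, patched ISO mounted as E:
    dism, sfc = offline_repair_commands("D:", r"E:\sources\install.wim", 2)
    print(" ".join(dism))
    print(" ".join(sfc))
```

The printed lines are meant to be run from an elevated prompt on the helper VM. The source WIM still has to match the build of the damaged installation, otherwise the same error is likely to come back.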
 
As @fba mentioned: set up a new VM, attach the DC's disk to it, and copy off all the files you need. An AD restore is the only real option.
 
Install the QEMU tools. The drivers are missing.

Ideally, do this BEFORE you move the VM.
 
Regarding "install the qemu tools. the drivers are missing": that would not cause this error. The typical symptom of missing drivers is, for example, INACCESSIBLE_BOOT_DEVICE. wininit.exe is one of the core startup processes; it initializes services and userland. It looks like core components are damaged (system hive, smss chain, etc.).
 
Does your physical server still reboot successfully?

btw:

The CRITICAL_PROCESS_DIED (stop code 0x000000EF) error indicates that a critical system process has terminated unexpectedly, and Windows stops with a blue screen (BSOD) to protect data. Common causes include corrupted system files, faulty drivers, or hardware issues. Key fixes include running SFC/DISM scans, updating drivers, checking for updates, or using System Restore.
Common Fixes for CRITICAL_PROCESS_DIED:
  • Run System File Checker (SFC) and DISM: Open Command Prompt as administrator and run:
    • sfc /scannow
    • dism /online /cleanup-image /restorehealth
  • Update or Roll Back Drivers: Open Device Manager to check for malfunctioning drivers, especially after a recent update.
  • Run Disk Check (CHKDSK): Run chkdsk /f /r in Command Prompt to scan for and fix hard drive errors.
  • Uninstall Recent Updates: If the error began after a Windows update, uninstall the latest quality or feature update.
  • Perform a Clean Boot: Start Windows with minimum drivers and programs to identify software conflicts.
  • Perform a Clean Installation/System Restore: If the issue persists, restore to a previous working state or reinstall Windows.
Potential Causes:
  • Driver Failure: A faulty driver, particularly after a Windows update.
  • Corrupted System Files: Critical files necessary for Windows operation are damaged.
  • Hardware Failure: Malfunctioning RAM or a dying hard drive.
  • Malware: Malicious software causing critical system processes to fail.
Immediate Troubleshooting Steps:
  1. Restart: Sometimes a one-time error, so a reboot may solve it.
  2. Unplug Peripherals: Disconnect external devices (printers, USB drives) to rule out driver issues with those devices.
  3. Use Safe Mode: If you cannot boot normally, enter Safe Mode to perform troubleshooting.
 