Node unexpectedly shuts down every time I run a Windows VM

ricobelo

New Member
Feb 21, 2025
France
Hello,

I run a home lab consisting of 3 mini-PCs (each node: Ryzen 5825U, 64GB of RAM, 1TB NVMe, 500GB SSD). They run a Proxmox VE 8.3 cluster with Ceph on the 1TB NVMe and LVM-Thin on the 500GB SSD.
This cluster is almost 2 years old now and mainly runs CTs, plus very few VMs.

Over the last two weeks I have been spinning up a lot of Windows Server 2025 VMs (Core and Desktop editions) to test some setups.
I first created two VMs to deploy Core and Desktop with VirtIO and SPICE support, Windows updates and so on. Then I sysprep'ed them ("OOBE"), converted them into VM templates, and started spinning up VMs from those templates, distributing them as evenly as possible across my 3 nodes:
  • 3 Windows Core (2 vcpu, 2GB RAM, 80GB storage)
  • 5 Windows Desktop (4 vcpu, 4GB RAM, 80GB storage)

Resource usage is roughly:
  • less than 10% for CPU
  • less than 30% memory
  • about 35% for storage

It all worked without issues until yesterday, when I noticed some VMs were no longer reachable... and I discovered that the node had actually shut down!
I restarted it and saw no errors during boot (no storage recovery or anything else that would suggest a crash). I checked the logs and found no sign of an OOM issue. Even worse: journald shows signs that the node went through a normal shutdown!
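For reference, here is a minimal sketch of how the previous boot's journal can be scanned for lines that usually accompany a shutdown. It is not the exact command I ran: the helper name `shutdown_hints` is made up for this post, and the pattern list is only my assumption of what is worth looking for (power button events, thermal trips, watchdog messages), not an exhaustive list.

```shell
#!/usr/bin/env bash
# Sketch only: keep journal lines that commonly accompany a shutdown.
# The pattern list is an assumption, not an official or complete list.
shutdown_hints() {
    # Reads journal text on stdin; "|| true" keeps the exit status at 0
    # when nothing matches, so the helper is safe under "set -e".
    grep -iE 'power key|powering (down|off)|thermal|critical temperature|watchdog|reached target.*(shutdown|power-off|poweroff)' || true
}

# Typical use on the affected node (needs a persistent journal):
#   journalctl --list-boots                        # find the boot that ended in the shutdown
#   journalctl -b -1 --no-pager | shutdown_hints   # scan the previous boot
```

If a "power key" line shows up right before the shutdown sequence, that would point to something triggering the power button or an ACPI event rather than a crash.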

Now the weird thing: every time I start one of the Windows VMs on this specific node, the node shuts down after a random delay of 5 to 30 minutes???
Since yesterday, I have spent my time restarting this f####g node :mad:
  • If I move the VM to another node, the problem disappears! On both of the other two nodes!
  • I tried deleting the VM and creating a new Windows VM from scratch; the node still goes down!
  • I have other CTs and Linux VMs running on this node with no problem: it stays rock stable unless I start a Windows VM...
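To make the "move the VM to another node" test reproducible, it can be sketched roughly as below. This is an illustration, not necessarily how the VMs were actually moved: `hvm1` is a hypothetical target node name, the `migrate_vm` wrapper is made up for this post, and since the disks sit on what looks like node-local storage, a target storage name would probably have to be passed as well.

```shell
#!/usr/bin/env bash
# Sketch: build (and optionally run) a live migration of a VM to another node.
# DRY_RUN defaults to 1, so the command is only printed, never executed.
migrate_vm() {
    local vmid="$1" target="$2" storage="${3:-}"
    # --with-local-disks is needed when the VM disks are on node-local storage.
    local cmd=(qm migrate "$vmid" "$target" --online --with-local-disks)
    if [ -n "$storage" ]; then
        # Assumption: the target node exposes a differently named local storage.
        cmd+=(--targetstorage "$storage")
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}

# "hvm1" is a hypothetical node name; this only prints the command.
migrate_vm 130 hvm1
```

Running it with `DRY_RUN=0` would execute the migration for real, so the dry-run default is deliberate.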

Here is the config of one of my VMs:
root@hvm3:~# cat /etc/pve/nodes/hvm3/qemu-server/130.conf

#Windows Server 2025 en-us Desktop Edition
agent: 1
bios: ovmf
boot: order=scsi0;ide0
cores: 4
cpu: host
efidisk0: local-hvm3:vm-130-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
ide0: none,media=cdrom
machine: pc-q35-9.0
memory: 4096
meta: creation-qemu=9.0.2,ctime=1739467692
name: ad-rds-srv2
net0: virtio=BC:24:11:E8:28:16,bridge=vmbrvlan
numa: 0
ostype: win11
scsi0: local-hvm3:vm-130-disk-1,cache=writeback,discard=on,iothread=1,size=80G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=d8f0281e-6729-43eb-8454-f1e7258835a7
sockets: 1
tpmstate0: local-hvm3:vm-130-disk-2,size=4M,version=v2.0
vga: qxl
vmgenid: 15d01cd2-4fae-4505-b6bd-8b5e047e9a99

This is driving me crazy and I have no idea what's going on!

Any ideas?

Regards,

Eric