Proxmox Host Crashes When Windows VM Runs Simulations – Need Help Configuring High-Performance Setup

andresar29

New Member
Jun 12, 2025
Hi all,


I’ve been running a Proxmox server set up for heavy simulations. The idea is simple: I only ever run either the Windows VM or the Linux VM (never both at once; a hookscript enforces this), and I want whichever one is running to use as much CPU and RAM as possible. There’s also a TrueNAS VM running permanently to provide shared storage to both.


The issue is with the Windows VM. Whenever I start a simulation, at some point during execution the entire server becomes unreachable — no web UI, no SSH, I can’t even ping it. I’ve had to go to the server room to hard reset it multiple times.


System Overview


  • Proxmox VE: 8.x (kernel 6.8.12-9)
  • CPU: AMD Ryzen Threadripper 7980X (64 cores / 128 threads)
  • RAM: 512 GB
  • Boot disk: 1 TB Samsung 990 PRO (ZFS)
  • Shared disk: 500 GB partition from same SSD, exported via NFS
  • Swap: 16 GB file-based

VM Setup


Windows VM
  • 400 GB RAM (ballooning disabled)
  • 56 cores, 1 socket
  • CPU: host
  • GPU passthrough enabled
  • Main disk on local-zfs

Linux VM

  • Not running at the same time as Windows
  • Also intended for heavy simulations with similar resource assignment

TrueNAS VM


  • 16 GB RAM
  • Disk stored in a rpool to avoid ZFS-on-ZFS issues
  • Always running for NFS shared storage
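For anyone checking the numbers, here is a rough sketch of the host memory budget implied by the figures above. The ARC cap and the host reserve are my assumptions for illustration, not values from the post. As far as I know, PCI passthrough also makes QEMU pin all guest RAM up front, so the 400 GB is likely committed from the moment the VM starts rather than growing on demand.

```python
# Rough memory budget for the host described above (all values in GB).
# arc_max_gb and host_reserve_gb are ASSUMED figures for illustration only.

def host_headroom_gb(total_gb, vm_ram_gb, arc_max_gb, host_reserve_gb):
    """GB left on the host after guest RAM, the ZFS ARC cap, and a host reserve."""
    return total_gb - sum(vm_ram_gb) - arc_max_gb - host_reserve_gb

TOTAL_GB = 512          # from the post
VM_RAM_GB = [400, 16]   # Windows VM + TrueNAS VM, from the post

# Scenario A: ARC capped at an assumed 16 GB, with an assumed 8 GB for the host itself.
capped = host_headroom_gb(TOTAL_GB, VM_RAM_GB, arc_max_gb=16, host_reserve_gb=8)

# Scenario B: hypothetical case where the ARC may grow to half of RAM (256 GB).
uncapped = host_headroom_gb(TOTAL_GB, VM_RAM_GB, arc_max_gb=256, host_reserve_gb=8)

print(f"ARC capped at 16 GB: {capped} GB headroom")    # 72
print(f"ARC up to 256 GB:    {uncapped} GB headroom")  # -168
```

A negative result in the second scenario would mean the host depends on the ARC shrinking quickly enough under pressure, which is why the actual ARC cap is worth confirming.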

What’s Happening


Everything works fine at first. Then as soon as the Windows VM starts doing serious work (a simulation), the whole host becomes unreachable. It’s not a VM crash — it’s the entire Proxmox node.


I've already:


  • Disabled ballooning
  • Checked for OOM kills or PCI errors in dmesg and journalctl (nothing obvious)
  • Added swap
  • Verified ZFS is not using too much ARC memory (I checked ARC stats)
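The ARC check in the last bullet can be repeated with a small script. This is only a sketch: it assumes the host exposes the usual OpenZFS kstat file at /proc/spl/kstat/zfs/arcstats and reads the `size` and `c_max` fields from it; the function names are made up for this example.

```python
import os

ARCSTATS_PATH = "/proc/spl/kstat/zfs/arcstats"  # kstat file exposed by OpenZFS on Linux
GIB = 1024 ** 3

def parse_arcstats(text):
    """Parse kstat text ('name type data' rows) into a dict of integer values."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly three fields with a numeric value; header rows do not.
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats

def arc_size_and_cap_gib(stats):
    """Return (current ARC size, ARC maximum) in GiB."""
    return stats["size"] / GIB, stats["c_max"] / GIB

# Only runs on a host that actually has the ZFS module loaded.
if os.path.exists(ARCSTATS_PATH):
    with open(ARCSTATS_PATH) as f:
        size, cap = arc_size_and_cap_gib(parse_arcstats(f.read()))
    print(f"ARC size: {size:.1f} GiB, cap (c_max): {cap:.1f} GiB")
```

The number to compare against the VM allocations is `c_max`, not the current size, since the ARC can grow up to that cap while the simulation is running.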

Still, nothing helps. The only pattern is that it happens when Windows starts a heavy simulation.


What I’d Appreciate Help With​


  • Is 56 cores + 400 GB too much? Should I reserve more for the host?
  • Is there a better way to configure the Windows VM for this use case?
  • Could GPU passthrough be causing instability even if it works at first?
  • Are there known issues with high resource assignments in Proxmox 8.x?
  • Would switching from local-zfs to file-based storage help?
 
https://search.brave.com/search?q=R...summary=1&conversation=f05b9be52604de7163281a

Make sure you have the latest BIOS, and try running 'top' or 'htop' on the physical console so you can see what's going on right up to the point where the host hangs.
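If nobody can sit at the console, a small logger that forces each sample to disk can serve a similar purpose, since the last lines written before the hang should survive the hard reset. This is a minimal sketch; the function names and the log path in the usage note are hypothetical, and if the storage itself is what stalls, the final samples may never reach the disk.

```python
import os
import time

def parse_meminfo(text):
    """Parse /proc/meminfo text into {field: int}; values are in kB where a unit is given."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info

def format_sample(meminfo_text, loadavg_text, timestamp):
    """Build one log line with the 1-minute load, available memory, and free swap."""
    mem = parse_meminfo(meminfo_text)
    load1 = loadavg_text.split()[0]
    return (f"{timestamp} load1={load1} "
            f"MemAvailable_kB={mem.get('MemAvailable')} "
            f"SwapFree_kB={mem.get('SwapFree')}")

def log_samples(path, count, interval_s):
    """Append `count` samples to `path`, syncing each one to disk before sleeping."""
    with open(path, "a") as log:
        for _ in range(count):
            with open("/proc/meminfo") as mem, open("/proc/loadavg") as load:
                stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
                line = format_sample(mem.read(), load.read(), stamp)
            log.write(line + "\n")
            log.flush()
            os.fsync(log.fileno())  # force the line to disk so it survives a hard reset
            time.sleep(interval_s)
```

For example, `log_samples("/root/hang-trace.log", count=3600, interval_s=1)` would record one sample per second for an hour; start it on the host just before launching the simulation.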

ZFS boot/root on the 990 PRO may be an issue. Make sure you have the latest firmware for that SSD as well, and maybe try reinstalling PVE on ext4/LVM (MAKE SURE YOU HAVE A BACKUP OF EVERYTHING first).

I haven't yet allocated this many cores to a single VM in Proxmox; you may need to experiment with cores/sockets and NUMA.
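As a starting point for that experiment, the sketch below counts the NUMA nodes the host kernel reports under /sys/devices/system/node and prints a matching `qm set` line. Splitting the cores into one virtual socket per NUMA node is a common rule of thumb, not something confirmed for this machine, and VMID 100 and the function names are hypothetical.

```python
import os
import re

def numa_nodes(sysfs_node_dir="/sys/devices/system/node"):
    """Return the sorted NUMA node IDs found in sysfs (empty list if unavailable)."""
    if not os.path.isdir(sysfs_node_dir):
        return []
    ids = []
    for name in os.listdir(sysfs_node_dir):
        match = re.fullmatch(r"node(\d+)", name)
        if match:
            ids.append(int(match.group(1)))
    return sorted(ids)

def suggest_vm_topology(total_cores, node_count):
    """Rule of thumb (an assumption): one virtual socket per host NUMA node."""
    if node_count <= 1:
        return {"numa": 0, "sockets": 1, "cores": total_cores}
    # Floor division: any remainder cores are dropped from the suggestion.
    return {"numa": 1, "sockets": node_count, "cores": total_cores // node_count}

def qm_set_command(vmid, topology):
    """Format the suggestion as a qm command line (printed only, never executed)."""
    return (f"qm set {vmid} --numa {topology['numa']} "
            f"--sockets {topology['sockets']} --cores {topology['cores']}")

nodes = numa_nodes()
print(f"Host NUMA nodes: {nodes}")
print(qm_set_command(100, suggest_vm_topology(56, len(nodes))))  # VMID 100 is hypothetical
```

If the host reports a single node, the suggestion stays at 1 socket with 56 cores, and the NUMA setting is unlikely to be the cause of the hang.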

https://search.brave.com/search?q=R...summary=1&conversation=a8f43b2e1fd79976a79272