AMD Ryzen 9 5950X keeps rebooting with one specific guest

decibel83

Hi,
I have a Proxmox VE 7.2 cluster with some Intel and some AMD nodes.

I have several virtual machines; one of them runs Windows Server 2019 Standard with 64 GB RAM and 8 vCores.

I moved this Windows 2019 VM from an Intel node to an AMD node, and from that point on the AMD node began to restart itself.
Two days ago it restarted 9 times during the night, even though neither the VM nor the node was under high load!

The day after, I moved this VM to another AMD node, and that node also restarted itself once in the evening!

At that point I moved this VM back to the Intel node, and I have not had any automatic restarts since.

So:
  • This Windows VM makes two different AMD servers reboot
  • The same AMD servers do not reboot with other virtual machines running on them
  • This Windows VM does not cause any problems on Intel servers
I checked the logs on the AMD nodes and found nothing about why they rebooted, neither in the syslog nor in the kernel log (`journalctl -k`).
It is as if someone had pulled the power cable and plugged it back in!
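
A reset that leaves nothing in the logs often points below the OS, but it can still be worth scanning the kernel log of the previous boot for machine-check or lockup traces. The following is only a minimal sketch: `scan_kernel_log` is a hypothetical helper name, the pattern list is an assumption and certainly not exhaustive, and `journalctl -b -1` only works if the journal is stored persistently on the node.

```shell
# Hypothetical helper: filter kernel log text on stdin for lines that
# sometimes precede a hard reset. The pattern list is an assumption.
scan_kernel_log() {
    # "|| true" keeps the exit status at 0 when nothing matches
    grep -iE 'mce:|machine check|hardware error|kernel panic|soft lockup|hard lockup' || true
}

# Usage on a node (needs a persistent journal to see the previous boot):
#   journalctl --list-boots
#   journalctl -k -b -1 --no-pager | scan_kernel_log
```

An empty result does not rule out a hardware problem; a sudden power loss or CPU reset may simply leave no time for anything to be written.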

These are the hardware specifications of the AMD nodes:
  • 32 x AMD Ryzen 9 5950X 16-Core Processor (1 Socket)
  • 128 GB RAM
  • 2 x 3.84 TB NVMe SSD Datacenter Edition
  • 1 Gbit/sec NIC + additional dual gigabit NIC
  • Linux 5.15.60-1-pve #1 SMP PVE 5.15.60-1 (Mon, 19 Sep 2022 17:53:17 +0200)
  • pve-manager/7.2-11/b76d3178
Code:
root@node05-a:~# pveversion -v
proxmox-ve: 7.2-1 (running kernel: 5.15.60-1-pve)
pve-manager: 7.2-11 (running version: 7.2-11/b76d3178)
pve-kernel-helper: 7.2-12
pve-kernel-5.15: 7.2-11
pve-kernel-5.15.60-1-pve: 5.15.60-1
pve-kernel-5.15.39-4-pve: 5.15.39-4
pve-kernel-5.15.30-2-pve: 5.15.30-3
ceph-fuse: 15.2.16-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-3
libpve-guest-common-perl: 4.1-3
libpve-http-server-perl: 4.1-4
libpve-storage-perl: 7.2-10
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.6-1
proxmox-backup-file-restore: 2.2.6-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-2
pve-container: 4.2-2
pve-docs: 7.2-2
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-6
pve-firmware: 3.5-4
pve-ha-manager: 3.4.0
pve-i18n: 2.7-2
pve-qemu-kvm: 7.0.0-3
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.5-pve1

I cannot find any reason why that specific VM makes the AMD nodes reboot, but not the Intel nodes!

Could you help me understand what's going on, please?

Thank you very much for your help!
Bye!
 
Thanks! I will try!
Which problems did you have with Windows+AMD?

  • Systems that were set up on Intel first tend to behave strangely when booted on AMD (sudden crashes, memory exceptions, etc.). I guess some DLLs/programs have some Intel-specific behaviour
  • Migrating between AMD + Intel fails (even with KVM64). Reboot of VM required afterwards
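
To get a feeling for how much the Intel and AMD hosts differ, one option is to compare the CPU flags each node reports. This is just a rough sketch: `flags_only_in_first` and the file names `intel.flags` / `amd.flags` are hypothetical, and the flag dumps are assumed to have been saved on each node beforehand.

```shell
# Hypothetical helper: print CPU flags present in the first dump but not
# in the second. Each file is assumed to hold one node's flag list, e.g.:
#   grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2 > intel.flags
flags_only_in_first() {
    tmp_a="$(mktemp)"
    tmp_b="$(mktemp)"
    # one flag per line, empty lines dropped, sorted for comm
    tr -s '[:space:]' '\n' < "$1" | grep -v '^$' | sort -u > "$tmp_a"
    tr -s '[:space:]' '\n' < "$2" | grep -v '^$' | sort -u > "$tmp_b"
    # -23 suppresses lines unique to file 2 and lines common to both
    comm -23 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}

# Usage:
#   flags_only_in_first intel.flags amd.flags   # flags only the Intel node has
#   flags_only_in_first amd.flags intel.flags   # flags only the AMD node has
```

Flags that exist on only one side are the ones a guest cannot rely on after moving between the two vendors.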
 
Thanks!

If I understand correctly, your guest machines were crashing and not the host, is that right?

Are you able to migrate VMs between Intel and AMD nodes without a reboot after changing to kvm64+aes?
 
Correct - the guest was crashing.

I'm sometimes able to migrate properly, sometimes not, even after switching to kvm64+aes. It's a bit of a lottery :)

As I said - it's just a shot in the dark.
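
For anyone who wants to try the kvm64+aes CPU type mentioned above, it can be set from the CLI with `qm set`. The sketch below only prints the command so it can be reviewed first; `print_set_cpu_kvm64_aes` is a hypothetical helper and the VM ID `100` is a placeholder, not a value from this thread.

```shell
# Hypothetical dry-run helper: prints the qm command instead of running it.
print_set_cpu_kvm64_aes() {
    vmid="$1"
    printf 'qm set %s --cpu kvm64,flags=+aes\n' "$vmid"
}

# 100 is a placeholder VM ID
print_set_cpu_kvm64_aes 100
```

The new CPU type only takes effect after the VM has been fully stopped and started again, not after a reboot from inside the guest.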
 
