[SOLVED] Proxmox host reboots after starting VM

mapt

Hello, I am having an issue where the Proxmox host physically reboots a few minutes after I start a Windows VM. The issue started after I added more RAM to the system: it came with 32GB and I added another 128GB, for a total of 160GB (I ran memtest after installing the new RAM and it passed with no errors). The VM in question is a Windows VM with a Tesla P4 passed through. Everything was working fine before I installed the new RAM, and I made sure not to overcommit RAM. I am new to Proxmox and homelabbing, so here are some hardware specs and the output from journalctl and pveversion. Any help would be greatly appreciated.

Dell R730
CPU: E5-2630 V3 (2X)
RAM: 160GB ECC DDR4 RDIMM
GPU: Tesla P4 (passed through to VM)

Bash:
pveversion -v

proxmox-ve: 8.1.0 (running kernel: 6.5.11-6-pve)
pve-manager: 8.1.3 (running version: 8.1.3/b46aac3b42da5d15)
proxmox-kernel-helper: 8.1.0
pve-kernel-6.2: 8.0.5
proxmox-kernel-6.5: 6.5.11-6
proxmox-kernel-6.5.11-6-pve-signed: 6.5.11-6
proxmox-kernel-6.5.11-5-pve-signed: 6.5.11-5
proxmox-kernel-6.5.11-4-pve-signed: 6.5.11-4
proxmox-kernel-6.2.16-19-pve: 6.2.16-19
proxmox-kernel-6.2: 6.2.16-19
pve-kernel-6.2.16-3-pve: 6.2.16-3
ceph-fuse: 17.2.6-pve1+3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx7
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.0
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.7
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.1.0
libpve-guest-common-perl: 5.0.6
libpve-http-server-perl: 5.0.5
libpve-network-perl: 0.9.5
libpve-rs-perl: 0.8.7
libpve-storage-perl: 8.0.5
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.1.2-1
proxmox-backup-file-restore: 3.1.2-1
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.2
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.1.3
pve-cluster: 8.0.5
pve-container: 5.0.8
pve-docs: 8.1.3
pve-edk2-firmware: 4.2023.08-2
pve-firewall: 5.0.3
pve-firmware: 3.9-1
pve-ha-manager: 4.0.3
pve-i18n: 3.1.4
pve-qemu-kvm: 8.1.2-4
pve-xtermjs: 5.3.0-2
qemu-server: 8.0.10
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.0-pve4
 

Hello,

Could you provide more of the syslog output? You can generate it with journalctl using --since and --until, i.e., covering the period from when you start the VM until the PVE server restarts:

Bash:
journalctl --since "2023-12-11 12:00" --until "2023-12-11 15:00" > $(hostname)-Syslog.txt
 
Hi Moayad,
The syslog output from the moment I started the VM to the point the host crashed is attached to the original post. I have been doing some tests and noticed that even though I was not overcommitting RAM (I left 6GB free for the host), that apparently was not enough: once the VM started, total memory usage hovered above 99%, which probably caused the crash when a spike happened. I reduced the RAM allocated to the VM and the crashes stopped. Is there a minimum percentage of RAM that I need to reserve for the Proxmox host? I did not have this problem when I had 32GB of total RAM and reserved only 4GB for the host. Thank you all for the support!
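For anyone who wants to watch how close the host gets to that 99% mark, here is a minimal sketch that derives a usage percentage from MemTotal and MemAvailable in /proc/meminfo. The helper name mem_used_percent is made up for illustration, and this is only one rough way to read "memory usage"; the figure shown in the Proxmox GUI may be calculated differently.

```shell
#!/bin/sh
# Sketch: report host memory usage as a whole-number percentage, based on
# MemTotal and MemAvailable from /proc/meminfo (both reported in kB).
# mem_used_percent is a hypothetical helper name; the optional argument lets
# you point it at a saved copy of meminfo instead of the live file.
mem_used_percent() {
    meminfo="${1:-/proc/meminfo}"
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END {
            # Fail if MemTotal was not found, to avoid dividing by zero.
            if (total <= 0) exit 1
            printf "%d\n", (total - avail) * 100 / total
        }
    ' "$meminfo"
}

# Print the current usage of this host (Linux only).
if [ -r /proc/meminfo ]; then
    echo "host memory in use: $(mem_used_percent)%"
fi
```

Running it in a loop (for example with watch) while the VM starts would show whether usage really climbs to the edge before the reboot.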
 
Have you checked the iDRAC to see whether hardware errors were detected or whether the power supplies were too weak and therefore shut down?
 
Yes, that was my first thought but the power draw was way under the threshold and there were no spikes.
 
So does this happen each and every time, immediately on starting the VM, or just randomly now?
It was happening each and every time before, when I had "only" 4GB of the total 157GB of RAM unallocated and available to the Proxmox host. I had to reduce the RAM allocated to the VMs (without ballooning), so now 9GB of the total 157GB is unallocated and available to the Proxmox host (see attached image), and the crashing issue is completely gone. That makes me wonder whether there is a minimum percentage of total system memory that has to be left unallocated for the host. I did not have any problem when the total system RAM was 32GB and I had 4GB of unallocated memory available to the host.
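The "unallocated" figure above can be approximated from the VM configs. The following is a rough sketch, not an official Proxmox tool: it assumes the configs live in /etc/pve/qemu-server/*.conf, that each has a plain integer "memory:" line in MiB before any snapshot section, and that every VM counts whether or not it is running. The helper names sum_vm_memory_mib and host_headroom_mib are hypothetical.

```shell
#!/bin/sh
# Sketch: sum the "memory:" values (MiB) of all VM configs in a directory and
# subtract them from the host's total RAM.
# Assumptions: only the active settings (before the first "[snapshot]" header)
# are read, "memory:" holds a plain integer, and a VM without a "memory:" line
# is counted as 0.
sum_vm_memory_mib() {
    confdir="${1:-/etc/pve/qemu-server}"
    sum=0
    for conf in "$confdir"/*.conf; do
        [ -f "$conf" ] || continue
        # Stop at the first section header so snapshot values are ignored.
        mib=$(awk '/^\[/ { exit } $1 == "memory:" { print $2; exit }' "$conf")
        sum=$((sum + ${mib:-0}))
    done
    echo "$sum"
}

# host_headroom_mib TOTAL_MIB [CONFDIR]: MiB left over after all VM allocations.
host_headroom_mib() {
    total_mib="$1"
    echo $(( total_mib - $(sum_vm_memory_mib "$2") ))
}

# Example run on the host itself (Linux only).
if [ -r /proc/meminfo ]; then
    total=$(awk '$1 == "MemTotal:" { print int($2 / 1024) }' /proc/meminfo)
    echo "unallocated for host: $(host_headroom_mib "$total") MiB"
fi
```

The result is only the configured headroom. The host's own usage (kernel, caches, Proxmox services, per-VM overhead) comes out of that remainder, which may be why a few GB was not enough here.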
 

Attachments

  • vms.png (40.1 KB)
