Proxmox semi-random Crashes

mannmi

Hello *,

My Proxmox host keeps crashing semi-randomly.
I have been able to reproduce one trigger, though: whenever I open a particular link (https://education.github.com/pack) in my Windows 10 VM, the Proxmox host/VM crashes on the first or second load (mostly instantly) and the machine goes back to BIOS POST.
As far as I can tell, Proxmox does not crash when no VM is running (but this could be a false conclusion).

Are there any known issues that may be related?
I did have issues with the RTL8125 not showing up in my interfaces, but that somehow resolved itself (it just started working).
I did have an issue with the system time settings, but I think I fixed it.
Memory usage is stated incorrectly in the VM summary.

To give you some context:
System migration:
AM4 => AM5
proxmox-ve 7 => proxmox-ve 8


You can find additional information below. In case I forgot anything, I will be glad to provide it. :)

Technical Details:
Mobo : x670e pro wifi => PRIME X670-P WIFI, BIOS 1654 (installed version; no newer version available)
CPU : AMD Ryzen 9 7900X
GPU : AMD Radeon 5700 XT (Navi 10) => blacklisted on the host and reserved for the Windows VM
SSD : Samsung SSD 860

RAM => MemTest86 => passed
SSD (where Proxmox is installed) => health test => passed (command used: sudo smartctl -t long -a /dev/sdc)
fwupdmgr update => no upgradable devices

C-states in BIOS => disabled (AMD CBS => Global C-state Control => Disabled)
Virtualisation => IOMMU etc. enabled
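
Since the GPU is handed to the Windows VM, it may help to see how the IOMMU actually groups the devices on this board. A minimal sketch, assuming the standard sysfs layout under /sys/kernel/iommu_groups; the function name list_iommu_groups is made up for illustration, and the optional argument exists only so the function can be tried against a test directory:

```shell
# List every PCI device together with its IOMMU group.
# Assumes the usual sysfs layout: <root>/<group>/devices/<pci-address>
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue          # no groups: the glob stays unexpanded
        group=${dev#"$root"/}              # strip the root prefix
        group=${group%%/*}                 # keep only the group number
        echo "group $group: ${dev##*/}"    # PCI address is the last path part
    done
}

# On the host:
#   list_iommu_groups
# Groups appear in glob (lexical) order, so 10 sorts before 2.
```

If the function prints nothing, the kernel has not created any IOMMU groups, which would suggest the IOMMU is not active despite the BIOS setting.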

/etc/default/grub
# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
# info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
GRUB_TIMEOUT=10
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="textonly pcie_acs_override=downstream,multifunction"
GRUB_CMDLINE_LINUX=""

# If your computer has multiple operating systems installed, then you
# probably want to run os-prober. However, if your computer is a host
# for guest OSes installed via LVM or raw disk devices, running
# os-prober can cause damage to those guest OSes as it mounts
# filesystems to look for things.
#GRUB_DISABLE_OS_PROBER=false

# Uncomment to enable BadRAM filtering, modify to suit your needs
# This works with Linux (no patch required) and with any kernel that obtains
# the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)
#GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"

# Uncomment to disable graphical terminal
#GRUB_TERMINAL=console

# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command `vbeinfo'
#GRUB_GFXMODE=640x480

# Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
#GRUB_DISABLE_LINUX_UUID=true

# Uncomment to disable generation of recovery mode menu entries
#GRUB_DISABLE_RECOVERY="true"

# Uncomment to get a beep at grub start
#GRUB_INIT_TUNE="480 440 1"
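
Edits to /etc/default/grub only take effect after update-grub and a reboot, so it can be worth confirming that the running kernel actually received the parameters. A small sketch that checks /proc/cmdline; cmdline_has is a hypothetical helper name, and the optional second argument exists only so the function can be tried against a saved file:

```shell
# Return 0 if the given parameter appears as a whole word on the
# kernel command line, 1 otherwise.
cmdline_has() {
    param=$1
    file=${2:-/proc/cmdline}
    # One parameter per line, then an exact fixed-string match.
    tr ' ' '\n' < "$file" | grep -qxF -- "$param"
}

# On the host:
#   cmdline_has 'pcie_acs_override=downstream,multifunction' && echo "active"
```

On hosts that boot through systemd-boot rather than GRUB, Proxmox takes the kernel command line from /etc/kernel/cmdline, so the check is the same but the file to edit differs.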
cat /etc/network/interfaces
auto lo
iface lo inet loopback

#auto eno1
iface eno1 inet manual
#iface enxa0cec8cd2d04 inet manual
#iface eno1 inet

auto vmbr0
iface vmbr0 inet static
address 192.168.178.2/24
gateway 192.168.178.1
bridge-ports eno1
bridge-stp off
bridge-fd 0

pveversion -v
proxmox-ve: 8.0.2 (running kernel: 6.2.16-15-pve)
pve-manager: 8.0.4 (running version: 8.0.4/d258a813cfa6b390)
pve-kernel-6.2: 8.0.5
proxmox-kernel-helper: 8.0.3
pve-kernel-6.1: 7.3-6
pve-kernel-5.4: 6.4-20
proxmox-kernel-6.2.16-15-pve: 6.2.16-15
proxmox-kernel-6.2: 6.2.16-15
proxmox-kernel-6.2.16-14-pve: 6.2.16-14
pve-kernel-6.1.15-1-pve: 6.1.15-1
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-5.4.203-1-pve: 5.4.203-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph-fuse: 17.2.6-pve1+3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown: residual config
ifupdown2: 3.2.0-1+pmx5
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.26-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.5
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.9
libpve-guest-common-perl: 5.0.5
libpve-http-server-perl: 5.0.4
libpve-rs-perl: 0.8.5
libpve-storage-perl: 8.0.2
libqb0: 1.0.5-1
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
proxmox-backup-client: 3.0.3-1
proxmox-backup-file-restore: 3.0.3-1
proxmox-kernel-helper: 8.0.3
proxmox-mail-forward: 0.2.0
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.2
proxmox-widget-toolkit: 4.0.9
pve-cluster: 8.0.4
pve-container: 5.0.4
pve-docs: 8.0.5
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.3
pve-firmware: 3.8-2
pve-ha-manager: 4.0.2
pve-i18n: 3.0.7
pve-qemu-kvm: 8.0.2-6
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.7
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.13-pve1

uname -a
Linux pve 6.2.16-15-pve #1 SMP PREEMPT_DYNAMIC PMX 6.2.16-15 (2023-09-28T13:53Z) x86_64 GNU/Linux

journalctl -p err -b -1
Okt 09 14:10:16 pve kernel: SVM: kvm [1928]: vcpu1, guest rIP: 0xfffff84be5efc171 unimplemented wrmsr: 0xc0010115 data 0x0
Okt 09 14:10:16 pve kernel: SVM: kvm [1928]: vcpu2, guest rIP: 0xfffff84be5efc171 unimplemented wrmsr: 0xc0010115 data 0x0
Okt 09 14:10:16 pve kernel: SVM: kvm [1928]: vcpu3, guest rIP: 0xfffff84be5efc171 unimplemented wrmsr: 0xc0010115 data 0x0
Okt 09 14:10:16 pve kernel: SVM: kvm [1928]: vcpu4, guest rIP: 0xfffff84be5efc171 unimplemented wrmsr: 0xc0010115 data 0x0
Okt 09 14:10:17 pve kernel: SVM: kvm [1928]: vcpu5, guest rIP: 0xfffff84be5efc171 unimplemented wrmsr: 0xc0010115 data 0x0
Okt 09 14:10:17 pve kernel: SVM: kvm [1928]: vcpu6, guest rIP: 0xfffff84be5efc171 unimplemented wrmsr: 0xc0010115 data 0x0
Okt 09 14:10:17 pve kernel: SVM: kvm [1928]: vcpu7, guest rIP: 0xfffff84be5efc171 unimplemented wrmsr: 0xc0010115 data 0x0
Okt 09 14:10:17 pve kernel: SVM: kvm [1928]: vcpu8, guest rIP: 0xfffff84be5efc171 unimplemented wrmsr: 0xc0010115 data 0x0
Okt 09 14:10:17 pve kernel: SVM: kvm [1928]: vcpu9, guest rIP: 0xfffff84be5efc171 unimplemented wrmsr: 0xc0010115 data 0x0
Okt 09 14:10:27 pve pvedaemon[1789]: VM 103 qmp command failed - VM 103 qmp command 'guest-ping' failed - got timeout
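
The nine wrmsr lines above are the same message repeated once per vCPU, which makes the list look busier than it is. A sketch for collapsing such repeats into counted groups, assuming journalctl's default short output format (month, day, time, host, then the message); summarize_errors is an illustrative name:

```shell
# Collapse repeated journal lines into "count message" groups.
# Reads journalctl-style lines on stdin.
summarize_errors() {
    # 1st expression: drop the "Mon DD HH:MM:SS host " prefix.
    # 2nd expression: normalise the vCPU index so per-vCPU repeats group.
    sed -E 's/^[^ ]+ [0-9]+ [0-9:]+ [^ ]+ //; s/vcpu[0-9]+/vcpuN/' \
        | sort | uniq -c | sort -rn
}

# On the host (errors from the previous boot):
#   journalctl -p err -b -1 --no-pager | summarize_errors
```

With the output above, this should leave two groups: the wrmsr message and the guest-ping timeout.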



Thank you in advance for any help I may receive.
I hope it doesn't just turn out that I forgot something mundane :)

Kind Regards,
mannmi
 
Good afternoon,

The same thing is happening to me. I have an HP ProDesk 405 G with the AMD RYZEN 4 processor, and when I start a VM running HomeAssistantOS, the Proxmox server restarts after a while without any explanation.

I've been looking into it all day and I'm desperate.

I would appreciate your help. I have Proxmox 8.0.44.
RAM TEST PASSED
CPU TEST PASSED
NVMe TEST PASSED

CTs work normally.

Thank you so much
 
I have a cluster of 3 physically identical nodes.
HP Elite Mini 800 G9 Desktop PC
12th Gen Intel Core i7-12700T
NVIDIA GA107M [GeForce RTX 3050 Ti Mobile]

One of the 3 is randomly crashing overnight. After looking at pveversion, it looks like the one that is crashing was updated while the other 2 were not updated (not sure how I managed to do that, but I must have). The crashing one is using proxmox-kernel-6.2.16-15-pve.

pveversion -v on a node that was not updated:
proxmox-ve: 8.0.2 (running kernel: 6.2.16-14-pve)
pve-manager: 8.0.4 (running version: 8.0.4/d258a813cfa6b390)
pve-kernel-6.2: 8.0.5
proxmox-kernel-helper: 8.0.3
proxmox-kernel-6.2.16-14-pve: 6.2.16-14
proxmox-kernel-6.2: 6.2.16-14
ceph-fuse: 16.2.11+ds-2
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown: residual config
ifupdown2: 3.2.0-1+pmx4
libjs-extjs: 7.0.0-4
libknet1: 1.25-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.5
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.9
libpve-guest-common-perl: 5.0.5
libpve-http-server-perl: 5.0.4
libpve-rs-perl: 0.8.5
libpve-storage-perl: 8.0.2
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
proxmox-backup-client: 3.0.2-1
proxmox-backup-file-restore: 3.0.2-1
proxmox-kernel-helper: 8.0.3
proxmox-mail-forward: 0.2.0
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.2
proxmox-widget-toolkit: 4.0.6
pve-cluster: 8.0.4
pve-container: 5.0.4
pve-docs: 8.0.4
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.3
pve-firmware: 3.8-2
pve-ha-manager: 4.0.2
pve-i18n: 3.0.7
pve-qemu-kvm: 8.0.2-6
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.7
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.12-pve1

pveversion -v on the crashing (updated) node:
proxmox-ve: 8.0.2 (running kernel: 6.2.16-15-pve)
pve-manager: 8.0.4 (running version: 8.0.4/d258a813cfa6b390)
pve-kernel-6.2: 8.0.5
proxmox-kernel-helper: 8.0.3
proxmox-kernel-6.2.16-15-pve: 6.2.16-15
proxmox-kernel-6.2: 6.2.16-15
proxmox-kernel-6.2.16-14-pve: 6.2.16-14
ceph-fuse: 16.2.11+ds-2
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown: residual config
ifupdown2: 3.2.0-1+pmx5
libjs-extjs: 7.0.0-4
libknet1: 1.26-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.5
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.9
libpve-guest-common-perl: 5.0.5
libpve-http-server-perl: 5.0.4
libpve-rs-perl: 0.8.5
libpve-storage-perl: 8.0.2
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
proxmox-backup-client: 3.0.3-1
proxmox-backup-file-restore: 3.0.3-1
proxmox-kernel-helper: 8.0.3
proxmox-mail-forward: 0.2.0
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.2
proxmox-widget-toolkit: 4.0.9
pve-cluster: 8.0.4
pve-container: 5.0.4
pve-docs: 8.0.5
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.3
pve-firmware: 3.8-2
pve-ha-manager: 4.0.2
pve-i18n: 3.0.7
pve-qemu-kvm: 8.0.2-6
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.7
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.13-pve1

diff of the two outputs (not updated vs. updated):
1c1
< proxmox-ve: 8.0.2 (running kernel: 6.2.16-14-pve)
---
> proxmox-ve: 8.0.2 (running kernel: 6.2.16-15-pve)
4a5,6
> proxmox-kernel-6.2.16-15-pve: 6.2.16-15
> proxmox-kernel-6.2: 6.2.16-15
6d7
< proxmox-kernel-6.2: 6.2.16-14
12c13
< ifupdown2: 3.2.0-1+pmx4
---
> ifupdown2: 3.2.0-1+pmx5
14c15
< libknet1: 1.25-pve1
---
> libknet1: 1.26-pve1
30,31c31,32
< proxmox-backup-client: 3.0.2-1
< proxmox-backup-file-restore: 3.0.2-1
---
> proxmox-backup-client: 3.0.3-1
> proxmox-backup-file-restore: 3.0.3-1
36c37
< proxmox-widget-toolkit: 4.0.6
---
> proxmox-widget-toolkit: 4.0.9
39c40
< pve-docs: 8.0.4
---
> pve-docs: 8.0.5
52c53
< zfsutils-linux: 2.1.12-pve1
---
> zfsutils-linux: 2.1.13-pve1
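
A plain diff works, but its line-number hunks are hard to scan. As an alternative sketch, the following lists only the packages whose versions differ between two saved pveversion -v outputs. The function name compare_pveversion and the file names in the usage lines are placeholders, not anything from the posts above:

```shell
# Print "package: old -> new" for every package whose version differs
# between two saved `pveversion -v` outputs (old file first).
compare_pveversion() {
    awk '
        {
            # Split at the first ": " only, so values that contain
            # ": " themselves (the "running kernel: ..." part) stay whole.
            i = index($0, ": ")
            if (i == 0) next
            name = substr($0, 1, i - 1)
            ver  = substr($0, i + 2)
        }
        NR == FNR { old[name] = ver; next }    # first file: remember versions
        {
            seen[name] = 1
            if (!(name in old))         print name ": (absent) -> " ver
            else if (old[name] != ver)  print name ": " old[name] " -> " ver
        }
        END {
            for (p in old)
                if (!(p in seen)) print p ": " old[p] " -> (absent)"
        }
    ' "$1" "$2"
}

# Usage, after saving the output on each node and copying the files together:
#   pveversion -v > node-a.txt
#   compare_pveversion node-a.txt node-b.txt
```

Packages present only in the first file are printed last and in no particular order, and the first file is assumed to be non-empty.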


Before updating the other two, I thought I would search around online to see if anyone had any issues with that kernel. Searching Google for "6.2.16-15-pve crash" turns up this thread as well as a thread on linux.org.

For now, I'm going to pin the kernel version to 6.2.16-14.
proxmox-boot-tool kernel pin 6.2.16-14-pve
proxmox-boot-tool refresh
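
A pin only helps if the host really boots the pinned kernel, so after the reboot it may be worth checking what is running. A minimal sketch; check_running_kernel is a hypothetical helper, and the version string in the usage line is the one pinned above:

```shell
# Return 0 if the running kernel release matches the expected string.
check_running_kernel() {
    expected=$1
    running=$(uname -r)
    if [ "$running" = "$expected" ]; then
        echo "OK: running $running"
    else
        echo "MISMATCH: running $running, expected $expected" >&2
        return 1
    fi
}

# After the reboot:
#   check_running_kernel 6.2.16-14-pve
#   proxmox-boot-tool kernel list    # shows installed and pinned kernels
```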

I'm going to update the other nodes and try to get their packages in sync while still using the 6.2.16-14 kernel.
 
After pinning the kernels for my cluster to 6.2.16-14-pve and updating all nodes to the same package versions, the 3rd node has not crashed in a few days.

I am not 100% sure which action resolved the issue: getting off kernel 6.2.16-15-pve or updating all the nodes to the same package versions. Nonetheless, I will keep the kernels pinned until the next one comes along.
 
I spoke too soon. My crashing is still occurring on 6.2.16-14-pve. It's probably unrelated to your issue. Sorry to pollute your thread with unnecessary information.
 
