PROXMOX randomly reboots

solace

New Member
Dec 31, 2020
I have a Proxmox cluster with two physical servers, pve and pve2. They are identical Dell R710s with 96GB of memory and 1TB of storage (RAID-10). For a reason I have yet to identify, pve2 randomly power cycles. I have checked the hardware logs through iDRAC and there are no alarms or errors.

I am fairly new to Proxmox, so I don't know where to look for error logs outside of the usual Linux places like syslog and dmesg.
Any suggestions on what it may be or where I could check?

Here is a snippet of my syslog when the reboot happened: (@ Dec 30 16:54:01)


Code:
Dec 30 16:48:01 pve2 systemd[1]: Started Proxmox VE replication runner.
Dec 30 16:49:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Dec 30 16:49:01 pve2 systemd[1]: pvesr.service: Succeeded.
Dec 30 16:49:01 pve2 systemd[1]: Started Proxmox VE replication runner.
Dec 30 16:50:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Dec 30 16:50:01 pve2 systemd[1]: pvesr.service: Succeeded.
Dec 30 16:50:01 pve2 systemd[1]: Started Proxmox VE replication runner.
Dec 30 16:51:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Dec 30 16:51:01 pve2 systemd[1]: pvesr.service: Succeeded.
Dec 30 16:51:01 pve2 systemd[1]: Started Proxmox VE replication runner.
Dec 30 16:52:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Dec 30 16:52:01 pve2 systemd[1]: pvesr.service: Succeeded.
Dec 30 16:52:01 pve2 systemd[1]: Started Proxmox VE replication runner.
Dec 30 16:53:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Dec 30 16:53:01 pve2 systemd[1]: pvesr.service: Succeeded.
Dec 30 16:53:01 pve2 systemd[1]: Started Proxmox VE replication runner.
Dec 30 16:54:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Dec 30 16:54:01 pve2 systemd[1]: pvesr.service: Succeeded.
Dec 30 16:54:01 pve2 systemd[1]: Started Proxmox VE replication runner.
Dec 30 16:57:42 pve2 dmeventd[492]: dmeventd ready for processing.
Dec 30 16:57:42 pve2 kernel: [    0.000000] Linux version 5.4.73-1-pve (build@pve) (gcc version 8.3.0 (Debian 8.3.0-6)) #1 SMP PVE 5.4.73-1 (Mon, 16 Nov 2020 10:52:16 +0100) ()
Dec 30 16:57:42 pve2 kernel: [    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.4.73-1-pve root=/dev/mapper/pve-root ro quiet
Dec 30 16:57:42 pve2 kernel: [    0.000000] KERNEL supported cpus:
Dec 30 16:57:42 pve2 systemd-modules-load[483]: Inserted module 'iscsi_tcp'
Dec 30 16:57:42 pve2 kernel: [    0.000000]   Intel GenuineIntel
Dec 30 16:57:42 pve2 kernel: [    0.000000]   AMD AuthenticAMD
Dec 30 16:57:42 pve2 kernel: [    0.000000]   Hygon HygonGenuine
Dec 30 16:57:42 pve2 kernel: [    0.000000]   Centaur CentaurHauls
Dec 30 16:57:42 pve2 kernel: [    0.000000]   zhaoxin   Shanghai
Dec 30 16:57:42 pve2 systemd[1]: Starting Flush Journal to Persistent Storage...
Dec 30 16:57:42 pve2 kernel: [    0.000000] x86/fpu: x87 FPU will use FXSAVE
Dec 30 16:57:42 pve2 kernel: [    0.000000] BIOS-provided physical RAM map:
Dec 30 16:57:42 pve2 kernel: [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009dfff] usable
Dec 30 16:57:42 pve2 kernel: [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000bf378fff] usable
Dec 30 16:57:42 pve2 kernel: [    0.000000] BIOS-e820: [mem 0x00000000bf379000-0x00000000bf38efff] reserved
Dec 30 16:57:42 pve2 systemd[1]: Started udev Coldplug all Devices.
Dec 30 16:57:42 pve2 kernel: [    0.000000] BIOS-e820: [mem 0x00000000bf38f000-0x00000000bf3cdfff] ACPI data
Dec 30 16:57:42 pve2 kernel: [    0.000000] BIOS-e820: [mem 0x00000000bf3ce000-0x00000000bfffffff] reserved
Dec 30 16:57:42 pve2 kernel: [    0.000000] BIOS-e820: [mem 0x00000000e0000000-0x00000000efffffff] reserved
Dec 30 16:57:42 pve2 systemd[1]: Starting Helper to synchronize boot up for ifupdown...
Dec 30 16:57:42 pve2 kernel: [    0.000000] BIOS-e820: [mem 0x00000000fe000000-0x00000000ffffffff] reserved
Dec 30 16:57:42 pve2 kernel: [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000183fffffff] usable
Dec 30 16:57:42 pve2 kernel: [    0.000000] NX (Execute Disable) protection: active
Dec 30 16:57:42 pve2 kernel: [    0.000000] SMBIOS 2.6 present.
Dec 30 16:57:42 pve2 kernel: [    0.000000] DMI: Dell Inc. PowerEdge R710/0Y7JM4, BIOS 6.3.0 07/24/2012
Dec 30 16:57:42 pve2 systemd[1]: Starting udev Wait for Complete Device Initialization...
Dec 30 16:57:42 pve2 kernel: [    0.000000] tsc: Fast TSC calibration using PIT
Dec 30 16:57:42 pve2 kernel: [    0.000000] tsc: Detected 2393.902 MHz processor
Dec 30 16:57:42 pve2 kernel: [    0.003268] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
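
For what it's worth, the snippet above jumps from 16:54:01 straight to the 16:57:42 boot messages with no service-stop lines in between, which usually points to an abrupt reset rather than an orderly shutdown. Below is a minimal Python sketch for finding such gaps in a saved syslog. The function name `find_boot_gaps`, the `year` parameter, and the same-second heuristic are my own assumptions, not anything Proxmox ships.

```python
import re
from datetime import datetime

# Classic syslog timestamp at the start of a line, e.g. "Dec 30 16:54:01".
# The format carries no year, so the caller supplies one (assumption).
STAMP = re.compile(r"^([A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) ")
# First kernel message of a fresh boot.
BOOT_MARKER = re.compile(r"kernel: \[\s*0\.000000\] Linux version")


def find_boot_gaps(lines, year=2020):
    """Return (last_line_before_boot, boot_time, gap_seconds) per boot marker.

    Heuristic: lines stamped in the same second as the boot marker are
    treated as post-boot, so the search walks back to the most recent line
    with a strictly earlier timestamp. Logs spanning a year change are not
    handled.
    """
    history = []  # (timestamp, line) for every parsed line so far
    gaps = []
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue
        stamp = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
        if BOOT_MARKER.search(line):
            for prev_stamp, prev_line in reversed(history):
                if prev_stamp < stamp:
                    gap = int((stamp - prev_stamp).total_seconds())
                    gaps.append((prev_line.rstrip("\n"), stamp, gap))
                    break
        history.append((stamp, line))
    return gaps
```

To try it, pass an open log file (for example `/var/log/syslog`) or a list of lines to `find_boot_gaps` and check whether the last line before each boot looks like part of a shutdown sequence.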
 
Hi,
Dec 30 16:57:42 pve2 kernel: [ 0.000000] Linux version 5.4.73-1-pve (build@pve) (gcc version 8.3.0 (Debian 8.3.0-6)) #1 SMP PVE 5.4.73-1 (Mon, 16 Nov 2020 10:52:16 +0100) ()

The current Proxmox kernel is 5.4.78-2-pve, so please upgrade to the latest version of Proxmox VE, reboot the server, and post the output of pveversion -v.
 
Whenever I upgrade, the response I get is:
Your system is up-to-date

Code:
proxmox-ve: 6.3-1 (running kernel: 5.4.73-1-pve)
pve-manager: 6.3-3 (running version: 6.3-3/eee5f901)
pve-kernel-5.4: 6.3-3
pve-kernel-helper: 6.3-3
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.7
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-2
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-3
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.6-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-cluster: 6.2-1
pve-container: 3.3-2
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-7
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-2
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.5-pve1
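
One thing this output shows: the running kernel is 5.4.73-1-pve, yet pve-kernel-5.4.78-2-pve is already installed, so the packages can be up to date while the newer kernel only takes effect after a reboot. A rough Python sketch of that check, working on saved `pveversion -v` text, follows. The helper names `kernel_key` and `newer_kernel_installed` are hypothetical, and it assumes version strings of the form seen here (such as `5.4.78-2`).

```python
import re

# "running kernel: 5.4.73-1-pve" on the proxmox-ve line.
RUNNING = re.compile(r"running kernel: (\d+\.\d+\.\d+-\d+)-pve")
# Versioned kernel packages only, e.g. "pve-kernel-5.4.78-2-pve: 5.4.78-2".
# Meta packages such as "pve-kernel-5.4" or "pve-kernel-helper" do not match.
INSTALLED = re.compile(r"^pve-kernel-(\d+\.\d+\.\d+-\d+)-pve:", re.MULTILINE)


def kernel_key(version):
    """Turn '5.4.78-2' into (5, 4, 78, 2) so versions compare numerically."""
    return tuple(int(part) for part in re.split(r"[.-]", version))


def newer_kernel_installed(pveversion_text):
    """Return the newest installed kernel version if it is newer than the
    running one, else None."""
    running = RUNNING.search(pveversion_text)
    if running is None:
        raise ValueError("no 'running kernel' entry found")
    installed = INSTALLED.findall(pveversion_text)
    if not installed:
        return None
    newest = max(installed, key=kernel_key)
    if kernel_key(newest) > kernel_key(running.group(1)):
        return newest
    return None
```

If the function returns a version, the node most likely still needs a reboot to pick up that kernel.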
 
The kernel is upgraded, but the reboots are still happening. Where else should I look?

Code:
root@pve2:~# pveversion -v
proxmox-ve: 6.3-1 (running kernel: 5.4.78-2-pve)
pve-manager: 6.3-3 (running version: 6.3-3/eee5f901)
pve-kernel-5.4: 6.3-3
pve-kernel-helper: 6.3-3
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.7
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-2
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-3
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.6-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-cluster: 6.2-1
pve-container: 3.3-2
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-7
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-2
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.5-pve1
 