-- Reboot -- Proxmox 8.1.8, kernel 6.5.13-3-pve

Budas1

New Member
Mar 11, 2024
Hello, I have the following server:
Dell R740xd, 2x Xeon 8260M, PERC H740P
24x 64GB (1X64GB) 4DRX4 PC4-2400T DDR4 MEMORY
2x Intel Optane 512GB 288pin DDR4-2666 DCPMM PC4 Persistent Memory
2x 7.68TB Samsung PM1733 U.2 NVMe PCIe 4.0 SSD Drive MZ-WLJ7T 8TB 2.5"
4x Samsung PM1725b 1.6TB PCIe 3.0 x4 NVMe U.2 SAS 2.5" SSD MZ-WLL1T6B MZWLL1T6HAJQ
2x 01YGFW 1YGFW DELL POWEREDGE NVME PCIE EXPANDER CONTROL ADAPTER (0TJCNG , 0YN9K8)


My server rebooted randomly. I can't find anything in the logs about the cause of the reboot, and no hardware malfunction is reported.

Does anyone know how to fix this? Could it be due to the kernel? Thanks


Code:
Linux aycomkz 6.5.13-3-pve #1 SMP PREEMPT_DYNAMIC PMX 6.5.13-3 (2024-03-20T10:45Z) x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Mon Apr  1 08:07:32 +05 2024 on pts/0
root@aycomkz:~# pveversion -v
proxmox-ve: 8.1.0 (running kernel: 6.5.13-3-pve)
pve-manager: 8.1.8 (running version: 8.1.8/d29041d9f87575d0)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.5.13-3-pve-signed: 6.5.13-3
proxmox-kernel-6.5: 6.5.13-3
proxmox-kernel-6.5.11-8-pve-signed: 6.5.11-8
ceph-fuse: 17.2.7-pve2
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx8
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.0
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.1.3
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.5
libpve-cluster-perl: 8.0.5
libpve-common-perl: 8.1.1
libpve-guest-common-perl: 5.0.6
libpve-http-server-perl: 5.0.6
libpve-network-perl: 0.9.6
libpve-rs-perl: 0.8.8
libpve-storage-perl: 8.1.3
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve4
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.1.5-1
proxmox-backup-file-restore: 3.1.5-1
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.5
proxmox-widget-toolkit: 4.1.5
pve-cluster: 8.0.5
pve-container: 5.0.9
pve-docs: 8.1.5
pve-edk2-firmware: 4.2023.08-4
pve-firewall: 5.0.3
pve-firmware: 3.9-2
pve-ha-manager: 4.0.3
pve-i18n: 3.2.1
pve-qemu-kvm: 8.1.5-4
pve-xtermjs: 5.3.0-3
qemu-server: 8.1.1
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.3-pve1


Apr 01 00:17:01 Serverx CRON[194922]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Apr 01 00:17:01 Serverx CRON[194923]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Apr 01 00:17:01 Serverx CRON[194922]: pam_unix(cron:session): session closed for user root
Apr 01 00:24:01 Serverx CRON[196654]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Apr 01 00:24:01 Serverx CRON[196655]: (root) CMD (if [ $(date +%w) -eq 0 ] && [ -x /usr/lib/zfs-linux/trim ]; then /usr/lib/zfs-linux/trim; fi)
Apr 01 00:24:01 Serverx CRON[196654]: pam_unix(cron:session): session closed for user root
Apr 01 00:25:43 Serverx smartd[5312]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 79 to 80
Apr 01 01:13:50 Serverx systemd[1]: Starting fstrim.service - Discard unused blocks on filesystems from /etc/fstab...
Apr 01 01:13:55 Serverx fstrim[210710]: /boot/efi: 1010.3 MiB (1059414016 bytes) trimmed on /dev/sda2
Apr 01 01:13:55 Serverx fstrim[210710]: /: 74.6 GiB (80119312384 bytes) trimmed on /dev/pve/root
Apr 01 01:13:55 Serverx systemd[1]: fstrim.service: Deactivated successfully.
Apr 01 01:13:55 Serverx systemd[1]: Finished fstrim.service - Discard unused blocks on filesystems from /etc/fstab.
Apr 01 01:17:01 Serverx CRON[211438]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Apr 01 01:17:01 Serverx CRON[211439]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Apr 01 01:17:01 Serverx CRON[211438]: pam_unix(cron:session): session closed for user root
Apr 01 02:17:01 Serverx CRON[226132]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Apr 01 02:17:01 Serverx CRON[226133]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Apr 01 02:17:01 Serverx CRON[226132]: pam_unix(cron:session): session closed for user root
Apr 01 03:10:01 Serverx CRON[239418]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Apr 01 03:10:01 Serverx CRON[239419]: (root) CMD (test -e /run/systemd/system || SERVICE_MODE=1 /sbin/e2scrub_all -A -r)
Apr 01 03:10:01 Serverx CRON[239418]: pam_unix(cron:session): session closed for user root
Apr 01 03:17:01 Serverx CRON[240991]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Apr 01 03:17:01 Serverx CRON[240992]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Apr 01 03:17:01 Serverx CRON[240991]: pam_unix(cron:session): session closed for user root
Apr 01 04:17:01 Serverx CRON[256130]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Apr 01 04:17:01 Serverx CRON[256131]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Apr 01 04:17:01 Serverx CRON[256130]: pam_unix(cron:session): session closed for user root
Apr 01 04:55:43 Serverx smartd[5312]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 80 to 81
Apr 01 05:17:01 Serverx CRON[271675]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Apr 01 05:17:01 Serverx CRON[271676]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Apr 01 05:17:01 Serverx CRON[271675]: pam_unix(cron:session): session closed for user root
Apr 01 06:17:01 Serverx CRON[286421]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Apr 01 06:17:01 Serverx CRON[286422]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Apr 01 06:17:01 Serverx CRON[286421]: pam_unix(cron:session): session closed for user root
Apr 01 06:25:01 Serverx CRON[288394]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Apr 01 06:25:01 Serverx CRON[288395]: (root) CMD (test -x /usr/sbin/anacron || { cd / && run-parts --report /etc/cron.daily; })
Apr 01 06:25:01 Serverx CRON[288394]: pam_unix(cron:session): session closed for user root
Apr 01 06:39:23 Serverx IPCC.xs[6195]: pam_unix(proxmox-ve-auth:auth): authentication failure; logname= uid=0 euid=0 tty= ruser= rhost=::ffff:192.168.77.50 user=root
Apr 01 06:39:25 Serverx pvedaemon[6195]: authentication failure; rhost=::ffff:192.168.77.50 user=root@pam msg=Authentication failure
Apr 01 06:39:55 Serverx pvedaemon[6196]: <root@pam> successful auth for user 'root@pam'
Apr 01 06:52:01 Serverx CRON[295117]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Apr 01 06:52:01 Serverx CRON[295118]: (root) CMD (test -x /usr/sbin/anacron || { cd / && run-parts --report /etc/cron.monthly; })
Apr 01 06:52:01 Serverx CRON[295117]: pam_unix(cron:session): session closed for user root
Apr 01 06:54:23 Serverx pvedaemon[6195]: <root@pam> successful auth for user 'root@pam'
Apr 01 07:09:23 Serverx pvedaemon[6197]: <root@pam> successful auth for user 'root@pam'
Apr 01 07:17:01 Serverx CRON[301374]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Apr 01 07:17:01 Serverx CRON[301375]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Apr 01 07:17:01 Serverx CRON[301374]: pam_unix(cron:session): session closed for user root
Apr 01 07:24:23 Serverx pvedaemon[6197]: <root@pam> successful auth for user 'root@pam'
Apr 01 07:33:52 Serverx pveproxy[183497]: worker exit
Apr 01 07:33:52 Serverx pveproxy[112164]: worker 183497 finished
Apr 01 07:33:52 Serverx pveproxy[112164]: starting 1 worker(s)
Apr 01 07:33:52 Serverx pveproxy[112164]: worker 305749 started
Apr 01 07:34:32 Serverx pveproxy[183496]: worker exit
Apr 01 07:34:32 Serverx pveproxy[112164]: worker 183496 finished
Apr 01 07:34:32 Serverx pveproxy[112164]: starting 1 worker(s)
Apr 01 07:34:32 Serverx pveproxy[112164]: worker 305932 started
Apr 01 07:38:20 Serverx pveproxy[183495]: worker exit
Apr 01 07:38:20 Serverx pveproxy[112164]: worker 183495 finished
Apr 01 07:38:20 Serverx pveproxy[112164]: starting 1 worker(s)
Apr 01 07:38:20 Serverx pveproxy[112164]: worker 306873 started
Apr 01 07:39:23 Serverx pvedaemon[6195]: <root@pam> successful auth for user 'root@pam'
Apr 01 07:54:23 Serverx pvedaemon[6196]: <root@pam> successful auth for user 'root@pam'
Apr 01 08:07:32 Serverx systemd[1]: session-38.scope: Deactivated successfully.
Apr 01 08:07:32 Serverx systemd[1]: session-38.scope: Consumed 2.296s CPU time.
Apr 01 08:07:32 Serverx systemd-logind[5315]: Session 38 logged out. Waiting for processes to exit.
Apr 01 08:07:32 Serverx systemd-logind[5315]: Removed session 38.
Apr 01 08:07:32 Serverx pvedaemon[6195]: <root@pam> end task UPID:Serverx:0001D140:001EDAD7:660981F5:vncshell::root@pam: OK
Apr 01 08:07:32 Serverx pvedaemon[314238]: starting termproxy UPID:Serverx:0004CB7E:005E86AE:660A24F4:vncshell::root@pam:
Apr 01 08:07:32 Serverx pvedaemon[6197]: <root@pam> starting task UPID:Serverx:0004CB7E:005E86AE:660A24F4:vncshell::root@pam:
Apr 01 08:07:32 Serverx pvedaemon[6196]: <root@pam> successful auth for user 'root@pam'
Apr 01 08:07:32 Serverx login[314242]: pam_unix(login:session): session opened for user root(uid=0) by root(uid=0)
Apr 01 08:07:32 Serverx systemd-logind[5315]: New session 55 of user root.
Apr 01 08:07:32 Serverx systemd[1]: Started session-55.scope - Session 55 of User root.
Apr 01 08:07:32 Serverx login[314247]: ROOT LOGIN on '/dev/pts/0'
Apr 01 08:07:32 Serverx pveproxy[144194]: worker exit
Apr 01 08:09:23 Serverx pvedaemon[6197]: <root@pam> successful auth for user 'root@pam'
Apr 01 08:17:01 Serverx CRON[317417]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Apr 01 08:17:01 Serverx CRON[317418]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Apr 01 08:17:01 Serverx CRON[317417]: pam_unix(cron:session): session closed for user root
Apr 01 08:24:23 Serverx pvedaemon[6196]: <root@pam> successful auth for user 'root@pam'
Apr 01 08:25:43 Serverx smartd[5312]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 81 to 80
Apr 01 08:30:08 Serverx pveproxy[305749]: worker exit
Apr 01 08:30:08 Serverx pveproxy[112164]: worker 305749 finished
Apr 01 08:30:08 Serverx pveproxy[112164]: starting 1 worker(s)
Apr 01 08:30:08 Serverx pveproxy[112164]: worker 321506 started
Apr 01 08:39:23 Serverx pvedaemon[6196]: <root@pam> successful auth for user 'root@pam'
Apr 01 08:39:57 Serverx pveproxy[112164]: worker 306873 finished
Apr 01 08:39:57 Serverx pveproxy[112164]: starting 1 worker(s)
Apr 01 08:39:57 Serverx pveproxy[112164]: worker 324337 started
Apr 01 08:40:00 Serverx pveproxy[324336]: got inotify poll request in wrong process - disabling inotify
Apr 01 08:44:00 Serverx pveproxy[305932]: worker exit
Apr 01 08:44:00 Serverx pveproxy[112164]: worker 305932 finished
Apr 01 08:44:00 Serverx pveproxy[112164]: starting 1 worker(s)
Apr 01 08:44:00 Serverx pveproxy[112164]: worker 325524 started
Apr 01 08:54:23 Serverx pvedaemon[6196]: <root@pam> successful auth for user 'root@pam'
Apr 01 09:09:23 Serverx pvedaemon[6195]: <root@pam> successful auth for user 'root@pam'
Apr 01 09:17:01 Serverx CRON[338323]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Apr 01 09:17:01 Serverx CRON[338324]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Apr 01 09:17:01 Serverx CRON[338323]: pam_unix(cron:session): session closed for user root
-- Reboot --
Apr 01 09:22:36 Serverx kernel: Linux version 6.5.13-3-pve (build@proxmox) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC PMX 6.5.13-3 (2024-03-20T10:45Z) ()
Apr 01 09:22:36 Serverx kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.5.13-3-pve root=/dev/mapper/pve-root ro quiet
Apr 01 09:22:36 Serverx kernel: KERNEL supported cpus:
Apr 01 09:22:36 Serverx kernel: Intel GenuineIntel
Apr 01 09:22:36 Serverx kernel: AMD AuthenticAMD
Apr 01 09:22:36 Serverx kernel: Hygon HygonGenuine
Apr 01 09:22:36 Serverx kernel: Centaur CentaurHauls
Apr 01 09:22:36 Serverx kernel: zhaoxin Shanghai
Apr 01 09:22:36 Serverx kernel: BIOS-provided physical RAM map:
Apr 01 09:22:36 Serverx kernel: BIOS-e820: [mem 0x0000000000000000-0x000000000009ffff] usable
Apr 01 09:22:36 Serverx kernel: BIOS-e820: [mem 0x00000000000a0000-0x00000000000fffff] reserved
Apr 01 09:22:36 Serverx kernel: BIOS-e820: [mem 0x0000000000100000-0x00000000682fefff] usable
Apr 01 09:22:36 Serverx kernel: BIOS-e820: [mem 0x00000000682ff000-0x00000000683fefff] reserved
Apr 01 09:22:36 Serverx kernel: BIOS-e820: [mem 0x00000000683ff000-0x0000000068bfefff] type 20
Apr 01 09:22:36 Serverx kernel: BIOS-e820: [mem 0x0000000068bff000-0x000000006ebfefff] reserved
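
For anyone working through a journal like the one above: the last entry before the "-- Reboot --" marker is an ordinary cron line, and the excerpt shows no shutdown messages before it, which hints at an abrupt reset rather than an orderly reboot. Below is a minimal Python sketch that finds each reboot marker in saved journal text and reports the last line before it, the first line after it, and the gap between them. The function names and the year argument are my own assumptions (the short journal timestamp format omits the year); it is only a way to locate the window to inspect, not a diagnosis.

```python
from datetime import datetime

REBOOT_MARKER = "-- Reboot --"


def parse_stamp(line, year=2024):
    """Parse the leading 'Apr 01 09:17:01' stamp.

    The year is an assumption: the short journal format does not include it.
    """
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def find_reboot_gaps(lines, year=2024):
    """Return (last_line_before, first_line_after, gap_seconds) per reboot marker."""
    gaps = []
    previous = None   # (timestamp, text) of the most recent stamped line
    pending = False   # True once a marker was seen after a stamped line
    for raw in lines:
        line = raw.rstrip("\n")
        if line.strip() == REBOOT_MARKER:
            pending = previous is not None
            continue
        try:
            stamp = parse_stamp(line, year)
        except ValueError:
            continue  # skip blank lines and anything without a timestamp
        if pending:
            gap = (stamp - previous[0]).total_seconds()
            gaps.append((previous[1], line, gap))
            pending = False
        previous = (stamp, line)
    return gaps


# Sample lines copied from the journal excerpt in this post.
sample = [
    "Apr 01 09:17:01 Serverx CRON[338323]: pam_unix(cron:session): session closed for user root",
    "-- Reboot --",
    "Apr 01 09:22:36 Serverx kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.5.13-3-pve root=/dev/mapper/pve-root ro quiet",
]

gaps = find_reboot_gaps(sample)
for last_line, first_line, gap in gaps:
    print(f"last before reboot : {last_line}")
    print(f"first after reboot : {first_line}")
    print(f"gap in seconds     : {gap:.0f}")
```

To run it on a real journal, the text could be saved to a file first and passed in as `find_reboot_gaps(open(path))`, where `path` is whatever file the journal was written to.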
 
Does it reboot when no load/VMs are running? What iDRAC/BIOS version are you running?

One of my 24 R740xd servers shows this same behavior.
 
Thanks for the answer.

"Does it reboot when no load/VMs are running?" I have not checked that yet.

BIOS Version: 2.20.1
iDRAC Firmware Version: 7.00.00.171
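
Alongside the firmware versions, the journal excerpt earlier in the thread contains smartd lines for /dev/sda where attribute 194 Temperature_Celsius moves between 79 and 81. Whether that has anything to do with the reboots is not established here, and as far as I know smartd reports the normalized attribute value by default, so these numbers are not necessarily degrees Celsius. A small Python sketch to pull those readings out of saved journal text, with a regular expression and function name that are my own assumptions based on the line format shown above:

```python
import re

# Pattern written against the smartd lines quoted in this thread (an assumption,
# not an official smartd log format specification).
SMARTD_TEMP = re.compile(
    r"smartd\[\d+\]: Device: (?P<dev>\S+) .*?"
    r"194 Temperature_Celsius changed from (?P<old>\d+) to (?P<new>\d+)"
)


def smartd_readings(lines):
    """Return (timestamp, device, old_value, new_value) for each matching line."""
    readings = []
    for line in lines:
        match = SMARTD_TEMP.search(line)
        if match:
            readings.append(
                (line[:15], match["dev"], int(match["old"]), int(match["new"]))
            )
    return readings


# Sample lines copied from the journal excerpt in this thread.
sample = [
    "Apr 01 00:25:43 Serverx smartd[5312]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 79 to 80",
    "Apr 01 04:55:43 Serverx smartd[5312]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 80 to 81",
    "Apr 01 08:17:01 Serverx CRON[317417]: pam_unix(cron:session): session closed for user root",
    "Apr 01 08:25:43 Serverx smartd[5312]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 81 to 80",
]

readings = smartd_readings(sample)
for stamp, dev, old, new in readings:
    print(f"{stamp} {dev}: {old} -> {new}")
```

Checking the raw value with smartctl on the device would show what the drive itself reports.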