Unexpected reboot in the production server

Soporte G2K

New Member
Apr 28, 2023
Hi,
I'm having problems with my server: it reboots unexpectedly, and the logs show nothing that explains why.
The server specifications are:
CPU(s): 16 x Intel(R) Core(TM) i7-10700F CPU @ 2.90GHz (1 Socket)
Kernel version: Linux 5.15.74-1-pve #1 SMP PVE 5.15.74-1 (Mon, Nov 14 2022 20:17:15 +0100)
PVE Manager version: pve-manager/7.3-3/c3928077


Below are some logs from Tuesday, April 25 and Thursday, April 27, the two days on which the server restarted.

25 Apr
Apr 25 19:35:15 kernel: traps: systemd-udevd[1906355] general protection fault ip:7fff70ff56b5 sp:7fff70fddb30 error:0
Apr 25 19:35:15 pmxcfs[1906356]: [status] notice: update cluster info (cluster name rh-dedicados, version = 7)
Apr 25 19:35:15 pmxcfs[1906356]: [status] notice: node has quorum
Apr 25 19:35:15 pmxcfs[1906356]: [dcdb] notice: members: 1/1639, 2/1643, 3/2785226, 4/3211, 5/2203, 6/3755293, 7/1906356
Apr 25 19:35:15 pmxcfs[1906356]: [dcdb] notice: starting data syncronisation
Apr 25 19:35:15 pmxcfs[1906356]: [status] notice: members: 1/1639, 2/1643, 3/2785226, 4/3211, 5/2203, 6/3755293, 7/1906356
Apr 25 19:35:15 pmxcfs[1906356]: [status] notice: starting data syncronisation
Apr 25 19:35:15 pmxcfs[1906356]: [dcdb] notice: received sync request (epoch 1/1639/00000387)
Apr 25 19:35:15 pmxcfs[1906356]: [status] notice: received sync request (epoch 1/1639/0000029A)
Apr 25 19:35:15 pmxcfs[1906356]: [dcdb] notice: received all states
Apr 25 19:35:15 pmxcfs[1906356]: [dcdb] notice: leader is 1/1639
Apr 25 19:35:15 pmxcfs[1906356]: [dcdb] notice: synced members: 1/1639, 2/1643, 3/2785226, 4/3211, 5/2203, 6/3755293
Apr 25 19:35:15 pmxcfs[1906356]: [dcdb] notice: waiting for updates from leader
Apr 25 19:35:15 pmxcfs[1906356]: [dcdb] notice: all data is up to date
Apr 25 19:35:15 pmxcfs[1906356]: [status] notice: received all states
Apr 25 19:35:15 pmxcfs[1906356]: [status] notice: all data is up to date
Apr 25 19:35:16 pve-ha-lrm[2146]: updating service status from manager failed: Connection refused
Apr 25 19:35:16 systemd[1]: Started The Proxmox VE cluster filesystem.
Apr 25 19:36:13 ksmtuned[1403]: /usr/sbin/ksmtuned: line 123: 1908224 Segmentation fault sleep $KSM_MONITOR_INTERVAL
Apr 25 19:36:13 kernel: traps: sleep[1908224] general protection fault ip:7f8483f25df4 sp:7ffdc77ff2a0 error:0 in ld-2.31.so[7f8483f25000+20000]
Apr 25 19:36:27 pve-firewall[1946]: status update error: command 'iptables-save' failed: got signal 11
Apr 25 19:36:27 kernel: traps: iptables-save[1908702] general protection fault ip:7f7bb4da9df4 sp:7ffeb7f8b720 error:0 in ld-2.31.so[7f7bb4da9000+20000]
Apr 25 19:36:33 pmxcfs[1906356]: [status] notice: received log
Apr 25 19:37:43 pvestatd[1945]: command 'lxc-info -n 118 -p' failed: got signal 11
Apr 25 19:37:43 kernel: traps: lxc-info[1911072] general protection fault ip:7fd74b20ddf4 sp:7fffaa054d70 error:0 in ld-2.31.so[7fd74b20d000+20000]
-- Reboot --
Apr 25 19:38:21 ded75 kernel: Linux version 5.15.74-1-pve (build@proxmox) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP PVE 5.15.74-1 (Mon, 14 Nov 2022 20:17:15 +0100) ()
Apr 25 19:38:21 kernel: Command line: initrd=\EFI\proxmox\5.15.74-1-pve\initrd.img-5.15.74-1-pve root=ZFS=rpool/ROOT/pve-1 boot=zfs
Apr 25 19:38:21 kernel: KERNEL supported cpus:
Apr 25 19:38:21 kernel: Intel GenuineIntel
Apr 25 19:38:21 kernel: AMD AuthenticAMD
Apr 25 19:38:21 kernel: Hygon HygonGenuine
Apr 25 19:38:21 kernel: Centaur CentaurHauls
Apr 25 19:38:21 kernel: zhaoxin Shanghai
Apr 25 19:38:21 kernel: x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
Apr 25 19:38:21 kernel: x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
Apr 25 19:38:21 kernel: x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
Apr 25 19:38:21 kernel: x86/fpu: Supporting XSAVE feature 0x008: 'MPX bounds registers'
Apr 25 19:38:21 kernel: x86/fpu: Supporting XSAVE feature 0x010: 'MPX CSR'
Apr 25 19:38:21 kernel: x86/fpu: Supporting XSAVE feature 0x200: 'Protection Keys User registers'
Apr 25 19:38:21 kernel: x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
Apr 25 19:38:21 kernel: x86/fpu: xstate_offset[3]: 832, xstate_sizes[3]: 64
Apr 25 19:38:21 kernel: x86/fpu: xstate_offset[4]: 896, xstate_sizes[4]: 64
Apr 25 19:38:21 kernel: x86/fpu: xstate_offset[9]: 960, xstate_sizes[9]: 8
Apr 25 19:38:21 kernel: x86/fpu: Enabled xstate features 0x21f, context size is 968 bytes, using 'compacted' format.

----------------
27 Apr
Apr 27 17:33:49 pmxcfs[2133]: [status] notice: received log
Apr 27 17:38:44 pmxcfs[2133]: [dcdb] notice: data verification successful
Apr 27 17:49:49 pmxcfs[2133]: [status] notice: received log
Apr 27 18:04:50 pmxcfs[2133]: [status] notice: received log
Apr 27 18:17:01 CRON[967863]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Apr 27 18:17:01 CRON[967864]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Apr 27 18:17:01 CRON[967863]: pam_unix(cron:session): session closed for user root
Apr 27 18:20:49 pmxcfs[2133]: [status] notice: received log
-- Reboot --
Apr 27 18:33:49 ded75 kernel: Linux version 5.15.74-1-pve (build@proxmox) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP PVE 5.15.74-1 (Mon, 14 Nov 2022 20:17:15 +0100) ()
Apr 27 18:33:49 kernel: Command line: initrd=\EFI\proxmox\5.15.74-1-pve\initrd.img-5.15.74-1-pve root=ZFS=rpool/ROOT/pve-1 boot=zfs
Apr 27 18:33:49 kernel: KERNEL supported cpus:
Apr 27 18:33:49 kernel: Intel GenuineIntel
Apr 27 18:33:49 kernel: AMD AuthenticAMD
Apr 27 18:33:49 kernel: Hygon HygonGenuine
Apr 27 18:33:49 kernel: Centaur CentaurHauls
Apr 27 18:33:49 kernel: zhaoxin Shanghai
Apr 27 18:33:49 kernel: x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
Apr 27 18:33:49 kernel: x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'

I have already checked the cron jobs and found none that could cause this.
The strange thing is that it only happens on this server, which runs Linux 5.15.74-1-pve #1 SMP PVE 5.15.74-1. My other Proxmox servers run Linux 5.13.19-2 and do not restart.

Thanks for your help.
 
Code:
Apr 25 19:35:15 kernel: traps: systemd-udevd[1906355] general protection fault ip:7fff70ff56b5 sp:7fff70fddb30 error:0

This and similar messages could indicate a hardware problem (bad memory, corrupted files on disk). I would suggest checking those first (SMART, debsums, memtest).
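
To make the pattern in the Apr 25 log easier to see, here is a minimal sketch that tallies the `general protection fault` lines by process and by the mapped object they occurred in. The function name `summarize_faults`, the regular expression, and the `SAMPLE` list are assumptions made for illustration; the sample lines are copied from the log posted above. Faults spread across several unrelated programs, rather than one repeatedly crashing binary, would fit the hardware or file-corruption suspicion, although this alone does not prove it.

```python
import re
from collections import Counter

# Matches kernel "traps:" lines like the ones quoted in this thread.
# The trailing " in <object>[" part is optional because the kernel
# does not always print it (see the systemd-udevd line).
TRAP_RE = re.compile(
    r"traps: (?P<proc>[^\[\s]+)\[(?P<pid>\d+)\] general protection fault"
    r"(?:.* in (?P<obj>[^\[\s]+)\[)?"
)


def summarize_faults(lines):
    """Count general protection faults per process name and per mapped object."""
    procs, objs = Counter(), Counter()
    for line in lines:
        m = TRAP_RE.search(line)
        if not m:
            continue  # not a general protection fault line
        procs[m.group("proc")] += 1
        objs[m.group("obj") or "(unknown)"] += 1
    return procs, objs


# Lines copied from the Apr 25 log in the first post.
SAMPLE = [
    "Apr 25 19:35:15 kernel: traps: systemd-udevd[1906355] general protection fault ip:7fff70ff56b5 sp:7fff70fddb30 error:0",
    "Apr 25 19:35:15 pmxcfs[1906356]: [status] notice: node has quorum",
    "Apr 25 19:36:13 kernel: traps: sleep[1908224] general protection fault ip:7f8483f25df4 sp:7ffdc77ff2a0 error:0 in ld-2.31.so[7f8483f25000+20000]",
    "Apr 25 19:36:27 kernel: traps: iptables-save[1908702] general protection fault ip:7f7bb4da9df4 sp:7ffeb7f8b720 error:0 in ld-2.31.so[7f7bb4da9000+20000]",
    "Apr 25 19:37:43 kernel: traps: lxc-info[1911072] general protection fault ip:7fd74b20ddf4 sp:7fffaa054d70 error:0 in ld-2.31.so[7fd74b20d000+20000]",
]

if __name__ == "__main__":
    procs, objs = summarize_faults(SAMPLE)
    print("faults per process:", dict(procs))
    print("faults per object: ", dict(objs))
```

The same function could be fed the kernel messages of the boot before the crash (for example the output of `journalctl -k -b -1`, assuming the journal is stored persistently) instead of the hard-coded sample.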