Hi everyone,
My company runs its services on Proxmox. We currently have 9 nodes in 2 clusters: one cluster of 6 nodes and one cluster of 3 nodes.
Recently, some nodes have rebooted unexpectedly, interrupting our services.
The iDRAC console shows nothing in its logs; the nodes just crash, and I have to restart them before they can be used again.
HA is configured, but it was also hindered, because I had to restart the affected node through iDRAC.
These are the specs of one node:
Dell R650xs
112 x Intel(R) Xeon(R) Gold 6330 CPU @ 2.00GHz (2 Sockets)
512 GB RAM
The nodes run Ceph on SATA SSDs.
The iDRAC event log did not record any hardware errors.
This is the log from after I rebooted the node using iDRAC:
journalctl --since "2024-12-09 12:15:00" --until "2024-12-09 13:37:00"
Dec 09 13:05:47 pve2 snmpd[514757]: systemstats_linux: unexpected header length in /proc/net/snmp. 237 != 224
Dec 09 13:06:47 pve2 snmpd[514757]: systemstats_linux: unexpected header length in /proc/net/snmp. 237 != 224
Dec 09 13:07:47 pve2 snmpd[514757]: systemstats_linux: unexpected header length in /proc/net/snmp. 237 != 224
Dec 09 13:08:47 pve2 snmpd[514757]: systemstats_linux: unexpected header length in /proc/net/snmp. 237 != 224
Dec 09 13:09:47 pve2 snmpd[514757]: systemstats_linux: unexpected header length in /proc/net/snmp. 237 != 224
Dec 09 13:10:47 pve2 snmpd[514757]: systemstats_linux: unexpected header length in /proc/net/snmp. 237 != 224
Dec 09 13:11:47 pve2 snmpd[514757]: systemstats_linux: unexpected header length in /proc/net/snmp. 237 != 224
Dec 09 13:12:47 pve2 snmpd[514757]: systemstats_linux: unexpected header length in /proc/net/snmp. 237 != 224
Dec 09 13:13:47 pve2 snmpd[514757]: systemstats_linux: unexpected header length in /proc/net/snmp. 237 != 224
Dec 09 13:14:45 pve2 systemd[1]: Starting apt-daily.service - Daily apt download activities...
Dec 09 13:14:45 pve2 systemd[1]: apt-daily.service: Deactivated successfully.
Dec 09 13:14:45 pve2 systemd[1]: Finished apt-daily.service - Daily apt download activities.
Dec 09 13:14:47 pve2 snmpd[514757]: systemstats_linux: unexpected header length in /proc/net/snmp. 237 != 224
Dec 09 13:15:47 pve2 snmpd[514757]: systemstats_linux: unexpected header length in /proc/net/snmp. 237 != 224
Dec 09 13:16:47 pve2 snmpd[514757]: systemstats_linux: unexpected header length in /proc/net/snmp. 237 != 224
Dec 09 13:17:01 pve2 CRON[2138763]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Dec 09 13:17:01 pve2 CRON[2138764]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Dec 09 13:17:01 pve2 CRON[2138763]: pam_unix(cron:session): session closed for user root
Dec 09 13:17:47 pve2 snmpd[514757]: systemstats_linux: unexpected header length in /proc/net/snmp. 237 != 224
Dec 09 13:18:47 pve2 snmpd[514757]: systemstats_linux: unexpected header length in /proc/net/snmp. 237 != 224
Dec 09 13:19:47 pve2 snmpd[514757]: systemstats_linux: unexpected header length in /proc/net/snmp. 237 != 224
Dec 09 13:20:47 pve2 snmpd[514757]: systemstats_linux: unexpected header length in /proc/net/snmp. 237 != 224
-- Boot 328026d952c14b4c8957590e6d638f67 --
Dec 09 13:29:19 pve2 kernel: Linux version 6.8.4-2-pve (build@proxmox) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for De>
Dec 09 13:29:19 pve2 kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.8.4-2-pve root=/dev/mapper/pve-root ro quiet
Dec 09 13:29:19 pve2 kernel: KERNEL supported cpus:
Dec 09 13:29:19 pve2 kernel: Intel GenuineIntel
Dec 09 13:29:19 pve2 kernel: AMD AuthenticAMD
Dec 09 13:29:19 pve2 kernel: Hygon HygonGenuine
Dec 09 13:29:19 pve2 kernel: Centaur CentaurHauls
Dec 09 13:29:19 pve2 kernel: zhaoxin Shanghai
Dec 09 13:29:19 pve2 kernel: BIOS-provided physical RAM map:
root@pve2:~# journalctl -p err -f
Dec 09 13:29:21 pve2 smartd[1380]: Device: /dev/bus/6 [megaraid_disk_00] [SAT], no ATA CHECK POWER STATUS support, ignoring -n Directive
Dec 09 13:29:21 pve2 smartd[1380]: Device: /dev/bus/6 [megaraid_disk_01] [SAT], no ATA CHECK POWER STATUS support, ignoring -n Directive
Dec 09 13:29:24 pve2 pmxcfs[1694]: [quorum] crit: quorum_initialize failed: 2
Dec 09 13:29:24 pve2 pmxcfs[1694]: [quorum] crit: can't initialize service
Dec 09 13:29:24 pve2 pmxcfs[1694]: [confdb] crit: cmap_initialize failed: 2
Dec 09 13:29:24 pve2 pmxcfs[1694]: [confdb] crit: can't initialize service
Dec 09 13:29:24 pve2 pmxcfs[1694]: [dcdb] crit: cpg_initialize failed: 2
Dec 09 13:29:24 pve2 pmxcfs[1694]: [dcdb] crit: can't initialize service
Dec 09 13:29:24 pve2 pmxcfs[1694]: [status] crit: cpg_initialize failed: 2
Dec 09 13:29:24 pve2 pmxcfs[1694]: [status] crit: can't initialize service
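To put a number on the silent window before the reboot, here is a minimal sketch that scans saved journalctl output for "-- Boot" markers and reports the gap between the last entry before the marker and the first entry after it. The function names and the default year are my own assumptions (the short journalctl format carries no year), and the sample lines are copied from the log above.

```python
from datetime import datetime

BOOT_MARKER = "-- Boot "


def parse_ts(line, year=2024):
    # journalctl's short format starts with e.g. "Dec 09 13:20:47".
    # The year is not in the line, so it is passed in (assumption: 2024).
    # Note: %b matching of "Dec" assumes an English/C locale.
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def reboot_gaps(lines, year=2024):
    """Return (last_before, first_after, gap) for every boot marker found."""
    gaps = []
    last_ts = None
    pending_boot = False
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(BOOT_MARKER):
            pending_boot = True
            continue
        try:
            ts = parse_ts(line, year)
        except ValueError:
            # Not a timestamped journal line (shell prompt, blank line, ...).
            continue
        if pending_boot and last_ts is not None:
            gaps.append((last_ts, ts, ts - last_ts))
        pending_boot = False
        last_ts = ts
    return gaps


sample = [
    "Dec 09 13:20:47 pve2 snmpd[514757]: systemstats_linux: unexpected header length in /proc/net/snmp. 237 != 224",
    "-- Boot 328026d952c14b4c8957590e6d638f67 --",
    "Dec 09 13:29:19 pve2 kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.8.4-2-pve root=/dev/mapper/pve-root ro quiet",
]

for last, first, gap in reboot_gaps(sample):
    print(f"last entry {last:%H:%M:%S}, first entry after boot {first:%H:%M:%S}, gap {gap}")
```

For the excerpt above this reports a gap of 0:08:32 (13:20:47 to 13:29:19), during which the journal contains no entries at all: no shutdown messages and no kernel panic.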