hi all
earlier today at 9:42am experienced a sudden abrupt shutdown & reboot, unexplained ..
any clues fr anyone?
the hardware is beefy, Dell T710 server grade chassis, Raid6, 2 Hexacores and 24GB of ram, it's always been reliable and ran great uninterrupted with PVE3.2 (4VM's, 2 kvm Win Server and 2 Linux, Ubuntu Server & Debian) since 2014, and since 1-1-2018 it was upgraded to PVE 5.1, clean fresh new install of 5.1 and fresh new Raid controller and new Hitachi HD's. It's been running great until earlier today.
here's the evidence gathered thus far:
pveversion -v
proxmox-ve: 5.1-32 (running kernel: 4.13.13-2-pve)
pve-manager: 5.1-41 (running version: 5.1-41/0b958203)
pve-kernel-4.13.13-2-pve: 4.13.13-32
libpve-http-server-perl: 2.0-8
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-19
qemu-server: 5.0-18
pve-firmware: 2.0-3
libpve-common-perl: 5.0-25
libpve-guest-common-perl: 2.0-14
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-17
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-3
pve-docs: 5.1-12
pve-qemu-kvm: 2.9.1-5
pve-container: 2.0-18
pve-firewall: 3.0-5
pve-ha-manager: 2.0-4
ksm-control-daemon: 1.2-2
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.1-2
lxcfs: 2.0.8-1
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
zfsutils-linux: 0.7.3-pve1~bpo9
The relevant portion extracted out of syslog at time of abrupt shutdown & reboot, at 9:42am
Jan 22 09:33:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 09:34:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 09:34:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 09:35:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 09:35:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 09:36:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 09:36:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 09:36:31 MainPVE-HypervisorChassis postfix/qmgr[1966]: 5E40E440027: from=<>, size=2685, nrcpt=1 (queue active)
Jan 22 09:36:31 MainPVE-HypervisorChassis postfix/local[14982]: error: open database /etc/aliases.db: No such file or directory
Jan 22 09:36:31 MainPVE-HypervisorChassis postfix/local[14982]: warning: hash:/etc/aliases is unavailable. open database /etc/aliases.db: No such file or directory
Jan 22 09:36:31 MainPVE-HypervisorChassis postfix/local[14982]: warning: hash:/etc/aliases: lookup of 'root' failed
Jan 22 09:36:31 MainPVE-HypervisorChassis postfix/local[14982]: 5E40E440027: to=<root@MainPVE-HypervisorChassis.Dyn.org>, relay=local, delay=25803, delays=25803/0.02/0/0.13, dsn=4.3.0, status=deferred (alias database unavailable)
Jan 22 09:36:31 MainPVE-HypervisorChassis postfix/local[14982]: using backwards-compatible default setting relay_domains=$mydestination to update fast-flush logfile for domain "MainPVE-HypervisorChassis
.Dyn.org"
Jan 22 09:37:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 09:37:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 09:38:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 09:38:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 09:39:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 09:39:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 09:40:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 09:40:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 09:41:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 09:41:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 09:42:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 09:42:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@Jan 22 09:47:19 MainPVE-HypervisorChassis systemd-modules-load[490]: Inserted module 'iscsi_tcp'
Jan 22 09:47:19 MainPVE-HypervisorChassis systemd[1]: Starting Flush Journal to Persistent Storage...
Jan 22 09:47:19 MainPVE-HypervisorChassis systemd[1]: Mounted RPC Pipe File System.
Jan 22 09:47:19 MainPVE-HypervisorChassis systemd[1]: Starting pNFS block layout mapping daemon...
Jan 22 09:47:19 MainPVE-HypervisorChassis blkmapd[533]: open pipe file /run/rpc_pipefs/nfs/blocklayout failed: No such file or directory
Jan 22 09:47:19 MainPVE-HypervisorChassis systemd[1]: Started pNFS block layout mapping daemon.
Jan 22 09:47:19 MainPVE-HypervisorChassis systemd[1]: Started Flush Journal to Persistent Storage.
Jan 22 09:47:19 MainPVE-HypervisorChassis systemd[1]: Started udev Coldplug all Devices.
Jan 22 09:47:19 MainPVE-HypervisorChassis systemd[1]: Starting udev Wait for Complete Device Initialization...
Jan 22 09:47:19 MainPVE-HypervisorChassis keyboard-setup.sh[498]: cannot open file /tmp/tmpkbd.sfGy9k
Jan 22 09:47:19 MainPVE-HypervisorChassis systemd-modules-load[490]: Inserted module 'ib_iser'
Jan 22 09:47:19 MainPVE-HypervisorChassis systemd[1]: Started udev Kernel Device Manager.
Jan 22 09:47:19 MainPVE-HypervisorChassis systemd[1]: Mounted NFSD configuration filesystem.
More clues, Earlier in the day, we see this at end of syslog.1:
Jan 22 06:22:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 06:23:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 06:23:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 06:24:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 06:24:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 06:25:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 06:25:01 MainPVE-HypervisorChassis CRON[27979]: (root) CMD (test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily ))
Jan 22 06:25:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 06:25:02 MainPVE-HypervisorChassis systemd[1]: Reloading PVE API Proxy Server.
Jan 22 06:25:04 MainPVE-HypervisorChassis pveproxy[28055]: send HUP to 2176
Jan 22 06:25:04 MainPVE-HypervisorChassis systemd[1]: Reloaded PVE API Proxy Server.
Jan 22 06:25:04 MainPVE-HypervisorChassis systemd[1]: Reloading PVE SPICE Proxy Server.
Jan 22 06:25:04 MainPVE-HypervisorChassis pveproxy[2176]: received signal HUP
Jan 22 06:25:04 MainPVE-HypervisorChassis pveproxy[2176]: server closing
Jan 22 06:25:04 MainPVE-HypervisorChassis pveproxy[2176]: server shutdown (restart)
Jan 22 06:25:05 MainPVE-HypervisorChassis spiceproxy[28066]: send HUP to 2205
Jan 22 06:25:05 MainPVE-HypervisorChassis systemd[1]: Reloaded PVE SPICE Proxy Server.
Jan 22 06:25:05 MainPVE-HypervisorChassis systemd[1]: Stopping Proxmox VE firewall logger...
Jan 22 06:25:05 MainPVE-HypervisorChassis pvefw-logger[8126]: received terminate request (signal)
Jan 22 06:25:05 MainPVE-HypervisorChassis pvefw-logger[8126]: stopping pvefw logger
Jan 22 06:25:05 MainPVE-HypervisorChassis spiceproxy[2205]: received signal HUP
Jan 22 06:25:05 MainPVE-HypervisorChassis systemd[1]: Stopped Proxmox VE firewall logger.
Jan 22 06:25:05 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE firewall logger...
Jan 22 06:25:05 MainPVE-HypervisorChassis pvefw-logger[28088]: starting pvefw logger
Jan 22 06:25:05 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE firewall logger.
Jan 22 06:25:05 MainPVE-HypervisorChassis spiceproxy[2205]: server closing
Then in begining of Syslog we have this:
Jan 22 06:25:05 MainPVE-HypervisorChassis liblogging-stdlog: [origin software="rsyslogd" swVersion="8.24.0" x-pid="1328" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Jan 22 06:25:05 MainPVE-HypervisorChassis spiceproxy[2205]: server shutdown (restart)
Jan 22 06:25:06 MainPVE-HypervisorChassis pveproxy[2176]: restarting server
Jan 22 06:25:06 MainPVE-HypervisorChassis pveproxy[2176]: starting 3 worker(s)
Jan 22 06:25:06 MainPVE-HypervisorChassis pveproxy[2176]: worker 28120 started
Jan 22 06:25:06 MainPVE-HypervisorChassis pveproxy[2176]: worker 28121 started
Jan 22 06:25:06 MainPVE-HypervisorChassis pveproxy[2176]: worker 28122 started
Jan 22 06:25:07 MainPVE-HypervisorChassis spiceproxy[2205]: restarting server
Jan 22 06:25:07 MainPVE-HypervisorChassis spiceproxy[2205]: starting 1 worker(s)
Jan 22 06:25:07 MainPVE-HypervisorChassis spiceproxy[2205]: worker 28124 started
Jan 22 06:25:07 MainPVE-HypervisorChassis liblogging-stdlog: [origin software="rsyslogd" swVersion="8.24.0" x-pid="1328" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Jan 22 06:25:11 MainPVE-HypervisorChassis pveproxy[27962]: worker exit
Jan 22 06:25:11 MainPVE-HypervisorChassis pveproxy[30704]: worker exit
Jan 22 06:25:12 MainPVE-HypervisorChassis pveproxy[28836]: worker exit
Jan 22 06:25:13 MainPVE-HypervisorChassis spiceproxy[31519]: worker exit
Jan 22 06:25:16 MainPVE-HypervisorChassis pveproxy[2176]: worker 27962 finished
Jan 22 06:25:16 MainPVE-HypervisorChassis pveproxy[2176]: worker 28836 finished
Jan 22 06:25:16 MainPVE-HypervisorChassis pveproxy[2176]: worker 30704 finished
Jan 22 06:25:17 MainPVE-HypervisorChassis spiceproxy[2205]: worker 31519 finished
Jan 22 06:26:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 06:26:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 06:27:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 06:27:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 06:28:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 06:28:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 06:28:27 MainPVE-HypervisorChassis rrdcached[1978]: flushing old values
Jan 22 06:28:27 MainPVE-HypervisorChassis rrdcached[1978]: rotating journals
Jan 22 06:28:27 MainPVE-HypervisorChassis rrdcached[1978]: started new journal /var/lib/rrdcached/journal/rrd.journal.1516620507.654896
Jan 22 06:28:27 MainPVE-HypervisorChassis rrdcached[1978]: removing old journal /var/lib/rrdcached/journal/rrd.journal.1516613307.654922
Jan 22 06:28:38 MainPVE-HypervisorChassis smartd[1313]: Device: /dev/bus/2 [megaraid_disk_02] [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 153 to 157
Jan 22 06:28:38 MainPVE-HypervisorChassis smartd[1313]: Device: /dev/bus/2 [megaraid_disk_03] [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 166 to 171
Jan 22 06:28:40 MainPVE-HypervisorChassis smartd[1313]: Device: /dev/bus/2 [megaraid_disk_05] [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 153 to 157
Jan 22 06:29:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 06:29:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 06:30:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 06:30:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
I did notice that the disks (disk 3 is highest ) are being reported as running 171 deg Celsius (that's gotta be wrong that is as hot as a pizza oven) which is like 340 deg F.. But nothing like that as reported around the time of the abrupt shutdown and reboot..
anyone have any advice , suggestions, guidance..
TY all -pg
LATER on same day:
Got curious about the drive temps being reported by smartd as way too high and i think something is wrong with smartd reporting bcs i just checked syslog again and found the drive temps as follows, being reported by smartd:
Device: /dev/bus/2 [megaraid_disk_02] [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 150 to 153
Device: /dev/bus/2 [megaraid_disk_04] [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 162 to 166
Device: /dev/bus/2 [megaraid_disk_06] [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 153 to 157
Device: /dev/bus/2 [megaraid_disk_07] [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 166 to 171
so i checked the 4 disks being reported here (disks 2, 4, 6 & 7) with
smartctl -x -d megaraid,7 /dev/sdb
(substituting the 7 with 2 , 4 and 6 as needed) and i got well within the specified params of like 34 to 37 degrees C, and at no time did ANY of the raid array Hitachi drives exceed their 60 deg C max specified operating temp according to smartctl stats, they stay within 34-37 deg C as one would expect ..
earlier today at 9:42am experienced a sudden abrupt shutdown & reboot, unexplained ..
any clues fr anyone?
the hardware is beefy, Dell T710 server grade chassis, Raid6, 2 Hexacores and 24GB of ram, it's always been reliable and ran great uninterrupted with PVE3.2 (4VM's, 2 kvm Win Server and 2 Linux, Ubuntu Server & Debian) since 2014, and since 1-1-2018 it was upgraded to PVE 5.1, clean fresh new install of 5.1 and fresh new Raid controller and new Hitachi HD's. It's been running great until earlier today.
here's the evidence gathered thus far:
pveversion -v
proxmox-ve: 5.1-32 (running kernel: 4.13.13-2-pve)
pve-manager: 5.1-41 (running version: 5.1-41/0b958203)
pve-kernel-4.13.13-2-pve: 4.13.13-32
libpve-http-server-perl: 2.0-8
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-19
qemu-server: 5.0-18
pve-firmware: 2.0-3
libpve-common-perl: 5.0-25
libpve-guest-common-perl: 2.0-14
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-17
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-3
pve-docs: 5.1-12
pve-qemu-kvm: 2.9.1-5
pve-container: 2.0-18
pve-firewall: 3.0-5
pve-ha-manager: 2.0-4
ksm-control-daemon: 1.2-2
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.1-2
lxcfs: 2.0.8-1
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
zfsutils-linux: 0.7.3-pve1~bpo9
The relevant portion extracted out of syslog at time of abrupt shutdown & reboot, at 9:42am
Jan 22 09:33:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 09:34:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 09:34:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 09:35:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 09:35:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 09:36:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 09:36:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 09:36:31 MainPVE-HypervisorChassis postfix/qmgr[1966]: 5E40E440027: from=<>, size=2685, nrcpt=1 (queue active)
Jan 22 09:36:31 MainPVE-HypervisorChassis postfix/local[14982]: error: open database /etc/aliases.db: No such file or directory
Jan 22 09:36:31 MainPVE-HypervisorChassis postfix/local[14982]: warning: hash:/etc/aliases is unavailable. open database /etc/aliases.db: No such file or directory
Jan 22 09:36:31 MainPVE-HypervisorChassis postfix/local[14982]: warning: hash:/etc/aliases: lookup of 'root' failed
Jan 22 09:36:31 MainPVE-HypervisorChassis postfix/local[14982]: 5E40E440027: to=<root@MainPVE-HypervisorChassis.Dyn.org>, relay=local, delay=25803, delays=25803/0.02/0/0.13, dsn=4.3.0, status=deferred (alias database unavailable)
Jan 22 09:36:31 MainPVE-HypervisorChassis postfix/local[14982]: using backwards-compatible default setting relay_domains=$mydestination to update fast-flush logfile for domain "MainPVE-HypervisorChassis
.Dyn.org"
Jan 22 09:37:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 09:37:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 09:38:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 09:38:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 09:39:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 09:39:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 09:40:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 09:40:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 09:41:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 09:41:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 09:42:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 09:42:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@Jan 22 09:47:19 MainPVE-HypervisorChassis systemd-modules-load[490]: Inserted module 'iscsi_tcp'
Jan 22 09:47:19 MainPVE-HypervisorChassis systemd[1]: Starting Flush Journal to Persistent Storage...
Jan 22 09:47:19 MainPVE-HypervisorChassis systemd[1]: Mounted RPC Pipe File System.
Jan 22 09:47:19 MainPVE-HypervisorChassis systemd[1]: Starting pNFS block layout mapping daemon...
Jan 22 09:47:19 MainPVE-HypervisorChassis blkmapd[533]: open pipe file /run/rpc_pipefs/nfs/blocklayout failed: No such file or directory
Jan 22 09:47:19 MainPVE-HypervisorChassis systemd[1]: Started pNFS block layout mapping daemon.
Jan 22 09:47:19 MainPVE-HypervisorChassis systemd[1]: Started Flush Journal to Persistent Storage.
Jan 22 09:47:19 MainPVE-HypervisorChassis systemd[1]: Started udev Coldplug all Devices.
Jan 22 09:47:19 MainPVE-HypervisorChassis systemd[1]: Starting udev Wait for Complete Device Initialization...
Jan 22 09:47:19 MainPVE-HypervisorChassis keyboard-setup.sh[498]: cannot open file /tmp/tmpkbd.sfGy9k
Jan 22 09:47:19 MainPVE-HypervisorChassis systemd-modules-load[490]: Inserted module 'ib_iser'
Jan 22 09:47:19 MainPVE-HypervisorChassis systemd[1]: Started udev Kernel Device Manager.
Jan 22 09:47:19 MainPVE-HypervisorChassis systemd[1]: Mounted NFSD configuration filesystem.
More clues, Earlier in the day, we see this at end of syslog.1:
Jan 22 06:22:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 06:23:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 06:23:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 06:24:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 06:24:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 06:25:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 06:25:01 MainPVE-HypervisorChassis CRON[27979]: (root) CMD (test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily ))
Jan 22 06:25:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 06:25:02 MainPVE-HypervisorChassis systemd[1]: Reloading PVE API Proxy Server.
Jan 22 06:25:04 MainPVE-HypervisorChassis pveproxy[28055]: send HUP to 2176
Jan 22 06:25:04 MainPVE-HypervisorChassis systemd[1]: Reloaded PVE API Proxy Server.
Jan 22 06:25:04 MainPVE-HypervisorChassis systemd[1]: Reloading PVE SPICE Proxy Server.
Jan 22 06:25:04 MainPVE-HypervisorChassis pveproxy[2176]: received signal HUP
Jan 22 06:25:04 MainPVE-HypervisorChassis pveproxy[2176]: server closing
Jan 22 06:25:04 MainPVE-HypervisorChassis pveproxy[2176]: server shutdown (restart)
Jan 22 06:25:05 MainPVE-HypervisorChassis spiceproxy[28066]: send HUP to 2205
Jan 22 06:25:05 MainPVE-HypervisorChassis systemd[1]: Reloaded PVE SPICE Proxy Server.
Jan 22 06:25:05 MainPVE-HypervisorChassis systemd[1]: Stopping Proxmox VE firewall logger...
Jan 22 06:25:05 MainPVE-HypervisorChassis pvefw-logger[8126]: received terminate request (signal)
Jan 22 06:25:05 MainPVE-HypervisorChassis pvefw-logger[8126]: stopping pvefw logger
Jan 22 06:25:05 MainPVE-HypervisorChassis spiceproxy[2205]: received signal HUP
Jan 22 06:25:05 MainPVE-HypervisorChassis systemd[1]: Stopped Proxmox VE firewall logger.
Jan 22 06:25:05 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE firewall logger...
Jan 22 06:25:05 MainPVE-HypervisorChassis pvefw-logger[28088]: starting pvefw logger
Jan 22 06:25:05 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE firewall logger.
Jan 22 06:25:05 MainPVE-HypervisorChassis spiceproxy[2205]: server closing
Then in begining of Syslog we have this:
Jan 22 06:25:05 MainPVE-HypervisorChassis liblogging-stdlog: [origin software="rsyslogd" swVersion="8.24.0" x-pid="1328" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Jan 22 06:25:05 MainPVE-HypervisorChassis spiceproxy[2205]: server shutdown (restart)
Jan 22 06:25:06 MainPVE-HypervisorChassis pveproxy[2176]: restarting server
Jan 22 06:25:06 MainPVE-HypervisorChassis pveproxy[2176]: starting 3 worker(s)
Jan 22 06:25:06 MainPVE-HypervisorChassis pveproxy[2176]: worker 28120 started
Jan 22 06:25:06 MainPVE-HypervisorChassis pveproxy[2176]: worker 28121 started
Jan 22 06:25:06 MainPVE-HypervisorChassis pveproxy[2176]: worker 28122 started
Jan 22 06:25:07 MainPVE-HypervisorChassis spiceproxy[2205]: restarting server
Jan 22 06:25:07 MainPVE-HypervisorChassis spiceproxy[2205]: starting 1 worker(s)
Jan 22 06:25:07 MainPVE-HypervisorChassis spiceproxy[2205]: worker 28124 started
Jan 22 06:25:07 MainPVE-HypervisorChassis liblogging-stdlog: [origin software="rsyslogd" swVersion="8.24.0" x-pid="1328" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Jan 22 06:25:11 MainPVE-HypervisorChassis pveproxy[27962]: worker exit
Jan 22 06:25:11 MainPVE-HypervisorChassis pveproxy[30704]: worker exit
Jan 22 06:25:12 MainPVE-HypervisorChassis pveproxy[28836]: worker exit
Jan 22 06:25:13 MainPVE-HypervisorChassis spiceproxy[31519]: worker exit
Jan 22 06:25:16 MainPVE-HypervisorChassis pveproxy[2176]: worker 27962 finished
Jan 22 06:25:16 MainPVE-HypervisorChassis pveproxy[2176]: worker 28836 finished
Jan 22 06:25:16 MainPVE-HypervisorChassis pveproxy[2176]: worker 30704 finished
Jan 22 06:25:17 MainPVE-HypervisorChassis spiceproxy[2205]: worker 31519 finished
Jan 22 06:26:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 06:26:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 06:27:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 06:27:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 06:28:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 06:28:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 06:28:27 MainPVE-HypervisorChassis rrdcached[1978]: flushing old values
Jan 22 06:28:27 MainPVE-HypervisorChassis rrdcached[1978]: rotating journals
Jan 22 06:28:27 MainPVE-HypervisorChassis rrdcached[1978]: started new journal /var/lib/rrdcached/journal/rrd.journal.1516620507.654896
Jan 22 06:28:27 MainPVE-HypervisorChassis rrdcached[1978]: removing old journal /var/lib/rrdcached/journal/rrd.journal.1516613307.654922
Jan 22 06:28:38 MainPVE-HypervisorChassis smartd[1313]: Device: /dev/bus/2 [megaraid_disk_02] [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 153 to 157
Jan 22 06:28:38 MainPVE-HypervisorChassis smartd[1313]: Device: /dev/bus/2 [megaraid_disk_03] [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 166 to 171
Jan 22 06:28:40 MainPVE-HypervisorChassis smartd[1313]: Device: /dev/bus/2 [megaraid_disk_05] [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 153 to 157
Jan 22 06:29:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 06:29:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
Jan 22 06:30:00 MainPVE-HypervisorChassis systemd[1]: Starting Proxmox VE replication runner...
Jan 22 06:30:01 MainPVE-HypervisorChassis systemd[1]: Started Proxmox VE replication runner.
I did notice that the disks (disk 3 is highest ) are being reported as running 171 deg Celsius (that's gotta be wrong that is as hot as a pizza oven) which is like 340 deg F.. But nothing like that as reported around the time of the abrupt shutdown and reboot..
anyone have any advice , suggestions, guidance..
TY all -pg
LATER on same day:
Got curious about the drive temps being reported by smartd as way too high and i think something is wrong with smartd reporting bcs i just checked syslog again and found the drive temps as follows, being reported by smartd:
Device: /dev/bus/2 [megaraid_disk_02] [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 150 to 153
Device: /dev/bus/2 [megaraid_disk_04] [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 162 to 166
Device: /dev/bus/2 [megaraid_disk_06] [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 153 to 157
Device: /dev/bus/2 [megaraid_disk_07] [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 166 to 171
so i checked the 4 disks being reported here (disks 2, 4, 6 & 7) with
smartctl -x -d megaraid,7 /dev/sdb
(substituting the 7 with 2 , 4 and 6 as needed) and i got well within the specified params of like 34 to 37 degrees C, and at no time did ANY of the raid array Hitachi drives exceed their 60 deg C max specified operating temp according to smartctl stats, they stay within 34-37 deg C as one would expect ..
Last edited: