Hello,
First of all, I'm not an expert.
I set up my Proxmox VE about a year and a half ago on an HP workstation (tower PC) and run two VMs and three LXC containers. I have daily backups scheduled (during the night) and a metric server sending to InfluxDB (which runs in one of the LXC containers). The VE version is 7.4-17.
The problem is that, at random times, the VE hangs and is no longer accessible (no ping, no SSH, no GUI). I run Home Assistant and MQTT as two separate VMs, which means none of my home automations work during that time. I have to force a shutdown of the machine and power it back on, after which everything boots normally. I have checked the syslog but can't find the reason why the VE stops responding. The last time it happened was today; please find the last few syslog lines below (the forced manual shutdown and startup were done on Nov 21 06:57:52).
Nov 20 23:34:56 b38prox smartd[789]: Device: /dev/sdc [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 69 to 70
Nov 21 00:00:59 b38prox systemd[1]: Starting Rotate log files...
Nov 21 00:00:59 b38prox systemd[1]: Starting Daily man-db regeneration...
Nov 21 00:00:59 b38prox systemd[1]: Reloading PVE API Proxy Server.
Nov 21 00:01:02 b38prox systemd[1]: man-db.service: Succeeded.
Nov 21 00:01:02 b38prox systemd[1]: Finished Daily man-db regeneration.
Nov 21 00:01:03 b38prox pveproxy[1208869]: send HUP to 1164
Nov 21 00:01:03 b38prox pveproxy[1164]: received signal HUP
Nov 21 00:01:03 b38prox systemd[1]: Reloaded PVE API Proxy Server.
Nov 21 00:01:03 b38prox pveproxy[1164]: server closing
Nov 21 00:01:03 b38prox pveproxy[1164]: server shutdown (restart)
Nov 21 00:01:03 b38prox systemd[1]: Reloading PVE SPICE Proxy Server.
Nov 21 00:01:03 b38prox spiceproxy[1208906]: send HUP to 1170
Nov 21 00:01:03 b38prox systemd[1]: Reloaded PVE SPICE Proxy Server.
Nov 21 00:01:03 b38prox spiceproxy[1170]: received signal HUP
Nov 21 00:01:03 b38prox spiceproxy[1170]: server closing
Nov 21 00:01:03 b38prox spiceproxy[1170]: server shutdown (restart)
Nov 21 00:01:03 b38prox systemd[1]: Stopping Proxmox VE firewall logger...
Nov 21 00:01:03 b38prox pvefw-logger[564713]: received terminate request (signal)
Nov 21 00:01:03 b38prox pvefw-logger[564713]: stopping pvefw logger
Nov 21 00:01:03 b38prox systemd[1]: pvefw-logger.service: Succeeded.
Nov 21 00:01:03 b38prox systemd[1]: Stopped Proxmox VE firewall logger.
Nov 21 00:01:03 b38prox systemd[1]: pvefw-logger.service: Consumed 7.066s CPU time.
Nov 21 00:01:04 b38prox systemd[1]: Starting Proxmox VE firewall logger...
Nov 21 00:01:04 b38prox systemd[1]: Started Proxmox VE firewall logger.
Nov 21 00:01:04 b38prox pvefw-logger[1208916]: starting pvefw logger
Nov 21 00:01:04 b38prox systemd[1]: logrotate.service: Succeeded.
Nov 21 00:01:04 b38prox systemd[1]: Finished Rotate log files.
Nov 21 00:01:04 b38prox spiceproxy[1170]: restarting server
Nov 21 00:01:04 b38prox spiceproxy[1170]: starting 1 worker(s)
Nov 21 00:01:04 b38prox spiceproxy[1170]: worker 1208920 started
Nov 21 00:01:05 b38prox pveproxy[1164]: restarting server
Nov 21 00:01:05 b38prox pveproxy[1164]: starting 3 worker(s)
Nov 21 00:01:05 b38prox pveproxy[1164]: worker 1208941 started
Nov 21 00:01:05 b38prox pveproxy[1164]: worker 1208942 started
Nov 21 00:01:05 b38prox pveproxy[1164]: worker 1208943 started
Nov 21 00:01:09 b38prox spiceproxy[564717]: worker exit
Nov 21 00:01:09 b38prox spiceproxy[1170]: worker 564717 finished
Nov 21 00:01:10 b38prox pveproxy[564720]: worker exit
Nov 21 00:01:10 b38prox pveproxy[564719]: worker exit
Nov 21 00:01:10 b38prox pveproxy[564718]: worker exit
Nov 21 00:01:11 b38prox pveproxy[1164]: worker 564720 finished
Nov 21 00:01:11 b38prox pveproxy[1164]: worker 564719 finished
Nov 21 00:01:11 b38prox pveproxy[1164]: worker 564718 finished
Nov 21 00:17:01 b38prox CRON[1215974]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Nov 21 00:17:01 b38prox CRON[1215975]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Nov 21 00:17:01 b38prox CRON[1215974]: pam_unix(cron:session): session closed for user root
Nov 21 00:34:56 b38prox smartd[789]: Device: /dev/sdc [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 70 to 69
-- Reboot --
Nov 21 06:57:52 b38prox kernel: Linux version 5.15.126-1-pve (build@proxmox) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP PVE 5.15.126-1 (2023-10-03T17:24Z) ()
Nov 21 06:57:52 b38prox kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-5.15.126-1-pve root=/dev/mapper/pve-root ro quiet
Nov 21 06:57:52 b38prox kernel: KERNEL supported cpus:
Nov 21 06:57:52 b38prox kernel: Intel GenuineIntel
Nov 21 06:57:52 b38prox kernel: AMD AuthenticAMD
Nov 21 06:57:52 b38prox kernel: Hygon HygonGenuine
Nov 21 06:57:52 b38prox kernel: Centaur CentaurHauls
Nov 21 06:57:52 b38prox kernel: zhaoxin Shanghai
.
.
.
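In case it helps with reading the excerpt above, here is a minimal Python sketch that measures how long the log was silent before the reboot. The function name `silent_gap` and the default year 2023 are my own assumptions (classic syslog timestamps carry no year); it only expects lines in the same format as the ones pasted above, with a `-- Reboot --` marker between two timestamped lines.

```python
from datetime import datetime


def silent_gap(lines, year=2023):
    """Return (last_before, first_after, gap) around the first '-- Reboot --' marker.

    `year` is an assumption: syslog prefixes such as 'Nov 21 00:34:56' have no year.
    """
    def stamp(line):
        # The first 15 characters hold the timestamp, e.g. 'Nov 21 00:34:56'.
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")

    # Index of the reboot marker; raises StopIteration if there is none.
    idx = next(i for i, line in enumerate(lines) if line.strip() == "-- Reboot --")
    last_before = stamp(lines[idx - 1])
    first_after = stamp(lines[idx + 1])
    return last_before, first_after, first_after - last_before


if __name__ == "__main__":
    excerpt = [
        "Nov 21 00:34:56 b38prox smartd[789]: Device: /dev/sdc [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 70 to 69",
        "-- Reboot --",
        "Nov 21 06:57:52 b38prox kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-5.15.126-1-pve root=/dev/mapper/pve-root ro quiet",
    ]
    before, after, gap = silent_gap(excerpt)
    print(f"last entry: {before}, first entry after reboot: {after}, gap: {gap}")
```

For the lines above, the last entry before the reboot is at 00:34:56 and the first one after it is at 06:57:52, so nothing at all was written for more than six hours, not even the hourly cron job that normally runs at minute 17.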
Can anyone help me find the reason why this happens? Could it be the SSD where the VMs are running?
Thank you and best regards,
Johnny