Host system freezes in some hours after start and MiniPC overheats

saneklad

Member
Nov 25, 2018
2
0
21
32
I use Proxmox quite long with mipiPC as home Pfsense router and some VMs.
But in last two days my system completely hang in some hours after power on and aluminum case overheats terribly.
When I make hard reset - everything works fine again, but the problem occurs again after a couple of hours.

I would be grateful for help where to look for the problem and how to fix it.

j3UvVgT.jpg

VRDFgFl.jpg

Code:
Filesystem            Size  Used Avail Use% Mounted on
udev                  7.8G     0  7.8G   0% /dev
tmpfs                 1.6G  1.3M  1.6G   1% /run
/dev/mapper/pve-root   55G   20G   36G  36% /
tmpfs                 7.8G   40M  7.8G   1% /dev/shm
tmpfs                 5.0M     0  5.0M   0% /run/lock
/dev/sda2             511M  312K  511M   1% /boot/efi
/dev/fuse             128M   16K  128M   1% /etc/pve
tmpfs                 1.6G     0  1.6G   0% /run/user/0
Code:
proxmox-ve: 7.0-2 (running kernel: 5.11.22-7-pve)
pve-manager: 7.0-14+1 (running version: 7.0-14+1/08975a4c)
pve-kernel-helper: 7.1-4
pve-kernel-5.11: 7.0-10
pve-kernel-5.4: 6.4-6
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-kernel-5.11.22-5-pve: 5.11.22-10
pve-kernel-5.4.140-1-pve: 5.4.140-1
pve-kernel-4.15: 5.4-19
pve-kernel-4.15.18-30-pve: 4.15.18-58
pve-kernel-4.15.18-9-pve: 4.15.18-30
ceph-fuse: 14.2.21-1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.0-6
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-12
libpve-guest-common-perl: 4.0-2
libpve-http-server-perl: 4.0-3
libpve-storage-perl: 7.0-13
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.9-4
lxcfs: 4.0.8-pve2
novnc-pve: 1.2.0-3
proxmox-backup-client: 2.0.13-1
proxmox-backup-file-restore: 2.0.13-1
proxmox-mini-journalreader: 1.2-1
proxmox-widget-toolkit: 3.3-6
pve-cluster: 7.0-3
pve-container: 4.1-1
pve-docs: 7.0-5
pve-edk2-firmware: 3.20210831-1
pve-firewall: 4.2-5
pve-firmware: 3.3-3
pve-ha-manager: 3.3-1
pve-i18n: 2.5-1
pve-qemu-kvm: 6.1.0-1
pve-xtermjs: 4.12.0-1
qemu-server: 7.0-18
smartmontools: 7.2-pve2
spiceterm: 3.2-2
vncterm: 1.7-1
zfsutils-linux: 2.1.1-pve3
 
I disassembled my QOTOM Mini PC and find out this mSATA Hoodisk 256GB SSD installed

Specification says 544 TBW for 512GB
I looked into my SMART parameters:

Power on: 2 yr 10 mon. 1 w 3 days
Lifetime Writes: 885035 GiB (omg)
SSD Life Left: 92%
170 Bad_Blk_Ct_Erl/Lat: 1 / 156 (do not understand)

So the main problem is old aged ssd and big TBW?
But it works without problems from new power cycle and Temperature is 33 Celsious.
I think I need to add daemon for SMART change and logging to syslog that can help me investigate the problem.
 

About

The Proxmox community has been around for many years and offers help and support for Proxmox VE, Proxmox Backup Server, and Proxmox Mail Gateway.
We think our community is one of the best thanks to people like you!

Get your subscription!

The Proxmox team works very hard to make sure you are running the best software and getting stable updates and security enhancements, as well as quick enterprise support. Tens of thousands of happy customers have a Proxmox subscription. Get yours easily in our online shop.

Buy now!