pvestatd.service crash every 24 hours

Anime4000 · Mar 8, 2023

Hi, this is my first post on this forum. I keep running into a recurring error and crash. At first I simply restarted the service every 14 days, but now the crashes are too frequent.

Code:

● pvestatd.service - PVE Status Daemon
     Loaded: loaded (/lib/systemd/system/pvestatd.service; enabled; vendor preset: enabled)
     Active: failed (Result: signal) since Tue 2023-03-07 05:41:04 +08; 1 day 16h ago
    Process: 4067838 ExecStart=/usr/bin/pvestatd start (code=exited, status=0/SUCCESS)
   Main PID: 4067840 (code=killed, signal=SEGV)
        CPU: 6min 46.916s

Mar 06 22:23:34 hitoha systemd[1]: Starting PVE Status Daemon...
Mar 06 22:23:34 hitoha pvestatd[4067840]: starting server
Mar 06 22:23:34 hitoha systemd[1]: Started PVE Status Daemon.
Mar 07 05:41:04 hitoha systemd[1]: pvestatd.service: Main process exited, code=killed, status=11/SEGV
Mar 07 05:41:04 hitoha systemd[1]: pvestatd.service: Failed with result 'signal'.
Mar 07 05:41:04 hitoha systemd[1]: pvestatd.service: Consumed 6min 46.916s CPU time.

I have been looking around the net for a solution, and it appears that restarting the service is not a real fix.

The system is installed on an NVMe disk. I ran a filesystem check, and the disk is healthy with no bad sectors.

I am wondering what causes pvestatd.service to crash at a specific time?

Lukas Wagner · Mar 8, 2023

Hello, could you please post the output of pveversion -v?

Anime4000 · Mar 8, 2023

Lukas Wagner said:
Hello, could you please post the output of pveversion -v?

Here

Code:

root@hitoha:~# pveversion -v
proxmox-ve: 7.3-1 (running kernel: 5.15.85-1-pve)
pve-manager: 7.3-6 (running version: 7.3-6/723bb6ec)
pve-kernel-helper: 7.3-5
pve-kernel-5.15: 7.3-2
pve-kernel-5.13: 7.1-9
pve-kernel-5.15.85-1-pve: 5.15.85-1
pve-kernel-5.15.83-1-pve: 5.15.83-1
pve-kernel-5.15.74-1-pve: 5.15.74-1
pve-kernel-5.15.35-1-pve: 5.15.35-3
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-2-pve: 5.13.19-4
ceph-fuse: 15.2.15-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.3-1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-2
libpve-guest-common-perl: 4.2-3
libpve-http-server-perl: 4.1-5
libpve-storage-perl: 7.3-2
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.3.3-1
proxmox-backup-file-restore: 2.3.3-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.1-1
proxmox-widget-toolkit: 3.5.5
pve-cluster: 7.3-2
pve-container: 4.4-2
pve-docs: 7.3-1
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-7
pve-firmware: 3.6-3
pve-ha-manager: 3.5.1
pve-i18n: 2.8-3
pve-qemu-kvm: 7.2.0-5
pve-xtermjs: 4.16.0-1
qemu-server: 7.3-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.9-pve1
root@hitoha:~# uptime
 03:00:26 up 13 days,  9:22,  2 users,  load average: 1.60, 1.10, 0.94

Anime4000 · Mar 9, 2023

Today the service was killed again. Where can I troubleshoot this or find logs? The VMs and LXC containers are running fine.

Disk scan:

Code:

root@hitoha:~# badblocks -sv /dev/nvme0n1
Checking blocks 0 to 488386583
Checking for bad blocks (read-only test): done                                                 
Pass completed, 0 bad blocks found. (0/0/0 errors)

Code:

● pvestatd.service - PVE Status Daemon
     Loaded: loaded (/lib/systemd/system/pvestatd.service; enabled; vendor preset: enabled)
     Active: failed (Result: signal) since Thu 2023-03-09 18:42:58 +08; 4h 48min ago
    Process: 354198 ExecStart=/usr/bin/pvestatd start (code=exited, status=0/SUCCESS)
   Main PID: 354199 (code=killed, signal=SEGV)
        CPU: 16min 51.246s

Mar 08 22:19:17 hitoha systemd[1]: Starting PVE Status Daemon...
Mar 08 22:19:17 hitoha pvestatd[354199]: starting server
Mar 08 22:19:17 hitoha systemd[1]: Started PVE Status Daemon.
Mar 09 18:42:58 hitoha systemd[1]: pvestatd.service: Main process exited, code=killed, status=11/SEGV
Mar 09 18:42:58 hitoha systemd[1]: pvestatd.service: Failed with result 'signal'.
Mar 09 18:42:58 hitoha systemd[1]: pvestatd.service: Consumed 16min 51.246s CPU time.

If I set up a cron job to restart it every 24 hours, that should fix it, right? But I would be losing the stats log.
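As an alternative to a timed cron restart, systemd itself can restart a unit when it dies from a signal. The following is a minimal sketch, not a fix for the underlying segfault: the helper name `write_restart_dropin`, the file name `restart.conf`, and the 5-second delay are assumptions chosen for illustration, and statistics for the short window between crash and restart would presumably still be missed.

```shell
#!/bin/sh
# Hypothetical helper: write a systemd drop-in that restarts the unit
# whenever its main process exits abnormally (e.g. killed by SIGSEGV).
write_restart_dropin() {
    dropin_dir="$1"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/restart.conf" <<'EOF'
[Service]
Restart=on-failure
RestartSec=5
EOF
}

# On the Proxmox host (as root), the drop-in would go here, followed by a reload:
#   write_restart_dropin /etc/systemd/system/pvestatd.service.d
#   systemctl daemon-reload
```

This only masks the symptom. The segfault itself still needs to be tracked down.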

spirit · Mar 9, 2023

Is there nothing in dmesg or /var/log/kern.log?

Anime4000 · Mar 10, 2023

spirit said:
Is there nothing in dmesg or /var/log/kern.log?

Only this:

Code:

root@hitoha:~# cat /var/log/kern.log | grep pvestatd
Mar  7 05:41:04 hitoha kernel: [993804.442368] pvestatd[4067840]: segfault at 2008 ip 0000564fc6e87f90 sp 00007ffebb019f30 error 4 in perl[564fc6dd0000+185000]
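For reading that kernel line: on x86, `error 4` is the page-fault error code for a user-mode read of a page that is not mapped, and `2008` is the faulting address, which is very low. Subtracting the mapping base (the first number in `perl[564fc6dd0000+185000]`) from `ip` gives the offset of the crashing instruction inside the mapped perl region. A rough sketch, where the helper name `segv_offset` is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: offset of the faulting instruction within the mapping.
# $1 = ip value from the segfault line, $2 = mapping base, both as 0x-prefixed hex.
segv_offset() {
    printf '0x%x\n' "$(( $1 - $2 ))"
}

# Values taken from the kern.log line above.
segv_offset 0x564fc6e87f90 0x564fc6dd0000   # prints 0xb7f90
```

Comparing this offset across several crashes may be a useful hint: the same offset every time tends to point at one code path, while offsets that vary from crash to crash are more typical of unstable hardware or firmware. It is only a heuristic and does not identify the cause on its own.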

fweber · Mar 10, 2023

Hello, there is an older post with the same symptoms where the culprit turned out to be faulty RAM [1]. Could you try running a memtest?

[1] https://forum.proxmox.com/threads/pvestatd-segfault.109875/#post-473165

Anime4000 · Mar 12, 2023

The memory is fine; I checked it in another PC.
Downgrading the firmware fixed the problem.

Apparently the firmware update I had installed, which contains "Mitigate the AMD potential security vulnerabilities for AMD Athlon™ processors and Ryzen™ processors", caused the instability.

yamanipanuchi · May 20, 2023

Interesting, I have the same issue. Almost daily I find the service in a "Failed" state. Unfortunately, my system is an Intel one. I will double-check my memory, even though that turned out not to be your issue.

The system works great otherwise.

Does anyone else have any deeper ideas?
