pvestatd segfaults

unterkomplex

Hi,


I noticed that pvestatd was not running on one of my nodes. It turns out that the service had segfaulted:
Aug 31 16:39:55 pve1 kernel: pvestatd[1862]: segfault at 100000000000 ip 000063217ac22321 sp 00007ffc2cfa5140 error 4 in perl[95321,63217abd1000+1ae000] likely on CPU 3 (core 3, socket 1)
Aug 31 16:39:55 pve1 kernel: Code: 00 00 00 66 0f 1f 44 00 00 48 8d 4a 01 48 83 c0 08 49 89 0c 24 48 8b 75 00 48 3b 56 18 73 52 48 89 ca 48 8b 18 48 85 db 74 df <48> 8b 13 48 89 10 48 8b 45 00 48 83 68 10 01 83 b>
Aug 31 16:39:55 pve1 systemd[1]: pvestatd.service: Main process exited, code=killed, status=11/SEGV
Aug 31 16:39:55 pve1 systemd[1]: pvestatd.service: Failed with result 'signal'.
Aug 31 16:39:55 pve1 systemd[1]: pvestatd.service: Consumed 7h 27min 59.427s CPU time, 160.6M memory peak.
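For anyone reading along: the "error 4" field in these lines is the x86 page-fault error code, a small bitmask (bit 0: page was present, bit 1: write access, bit 2: user mode). A quick shell sketch to decode such a value:

err=4   # the "error N" value from the kernel log line
(( err & 1 )) && echo "protection violation" || echo "page not present"
(( err & 2 )) && echo "write access" || echo "read access"
(( err & 4 )) && echo "user mode" || echo "kernel mode"

For error 4 this prints "page not present / read access / user mode", i.e. pvestatd read from an unmapped address (0x100000000000 here), which looks like a stray pointer inside the perl interpreter rather than anything kernel-side.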

I was able to restart the service manually and it seems to work fine now:
Sep 01 10:20:25 pve1 systemd[1]: Starting pvestatd.service - PVE Status Daemon...
Sep 01 10:20:27 pve1 pvestatd[2099919]: starting server
Sep 01 10:20:27 pve1 systemd[1]: Started pvestatd.service - PVE Status Daemon.
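As a stopgap until the cause is understood, systemd could also restart the daemon automatically after a crash. A sketch (first check with systemctl cat pvestatd whether the shipped unit already sets a Restart= policy; mine apparently didn't restart on its own):

systemctl edit pvestatd
# in the override file that opens, add:
#   [Service]
#   Restart=on-failure
#   RestartSec=10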

I checked for any other segfaults:
root@pve1:~# journalctl | grep segfault
Nov 03 23:02:28 pve1 kernel: dnsmasq[228401]: segfault at 5ea02dc81e1e ip 00007445fd6c42d5 sp 00007ffc327293a8 error 4 in libdbus-1.so.3.32.4[7445fd6a1000+30000] likely on CPU 0 (core 0, socket 0)
Nov 28 08:55:59 pve1 kernel: dnsmasq[1685]: segfault at 616309463d64 ip 00007d02c4f892d5 sp 00007ffe04781958 error 4 in libdbus-1.so.3.32.4[7d02c4f66000+30000] likely on CPU 3 (core 3, socket 0)
Dec 08 22:11:05 pve1 kernel: dnsmasq[1265]: segfault at 5941068861ac ip 00007c80e1a0a2d5 sp 00007fff0b538e28 error 4 in libdbus-1.so.3.32.4[7c80e19e7000+30000] likely on CPU 1 (core 1, socket 0)
Aug 10 04:24:01 pve1 kernel: task UPID:pve1:[2514843]: segfault at 2989dc5a8 ip 00007529bff58087 sp 00007ffc9b12f8c0 error 4 in libc.so.6[7529bfee9000+155000] likely on CPU 2 (core 2, socket 0)
Aug 21 14:29:43 pve1 kernel: python3[2469825]: segfault at ffffffffff8 ip 00007d1eaae60efa sp 00007ffcc1a05fa0 error 4 in libc.so.6[7d1eaadee000+155000] likely on CPU 0 (core 0, socket 0)
Aug 31 09:12:34 pve1 kernel: python3[1528251]: segfault at 100000000008 ip 00007d900a4b9653 sp 00007ffd7a3b4c90 error 4 in libcrypto.so.3[26d653,7d900a343000+381000] likely on CPU 0 (core 0, socket 1)
Aug 31 16:39:55 pve1 kernel: pvestatd[1862]: segfault at 100000000000 ip 000063217ac22321 sp 00007ffc2cfa5140 error 4 in perl[95321,63217abd1000+1ae000] likely on CPU 3 (core 3, socket 1)

and it looks like there was one segfault a couple of hours earlier (with no reboot in between):
Aug 31 09:12:34 pve1 kernel: python3[1528251]: segfault at 100000000008 ip 00007d900a4b9653 sp 00007ffd7a3b4c90 error 4 in libcrypto.so.3[26d653,7d900a343000+381000] likely on CPU 0 (core 0, socket 1)
Aug 31 09:12:35 pve1 kernel: Code: 89 ee 48 89 f5 49 c1 e6 03 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 49 8b 07 4a 8b 1c 30 48 85 db 74 19 0f 1f 40 00 48 89 d8 <48> 8b 5b 08 48 89 ee 48 8b 38 41 ff d4 48 85 db 75 eb 41 83 ed 01


The node was upgraded from Proxmox 8.4 to 9 on 27th August.
A third segfault happened on Proxmox 8.4:
Aug 21 14:29:43 pve1 kernel: python3[2469825]: segfault at ffffffffff8 ip 00007d1eaae60efa sp 00007ffcc1a05fa0 error 4 in libc.so.6[7d1eaadee000+155000] likely on CPU 0 (core 0, socket 0)
Aug 21 14:29:43 pve1 kernel: Code: ac 2c 10 00 e8 f7 62 fe ff 0f 1f 80 00 00 00 00 48 85 ff 0f 84 bf 00 00 00 55 48 8d 77 f0 53 48 83 ec 18 48 8b 1d e6 8e 13 00 <48> 8b 47 f8 64 8b 2b a8 02 75 5b 48 8b 15 6c 8e 13 00 64 48 83 3a


The CPU is quite old: AMD Embedded G-Series GX-420GI Radeon R7E.

I am not sure whether this is more likely due to faulty hardware or a bug. Happy to provide more details if that helps with the investigation.
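If it segfaults again, a core dump would give much more to go on than the kernel one-liners above. A sketch using systemd-coredump (the PID is a placeholder):

apt install systemd-coredump      # captures core dumps of future crashes
coredumpctl list pvestatd         # after the next crash
coredumpctl info <PID>            # crash metadata; with debug symbols, also a backtrace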
 
Hi!

As there are multiple programs segfaulting, I'd check whether there are any problems with memory (memtest), filesystem corruption, or package corruption (e.g. SMART tests, checking packages with debsums -c, etc.).
 
I tried
  • debsums
  • 24h memtest
  • smartctl long test
  • zpool scrub
and all checks returned no errors.
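For reference, apart from memtest (run from boot media), those checks roughly correspond to the following; rpool and /dev/sda are placeholders for the actual pool and disk:

debsums -s                    # only report changed/corrupted package files
smartctl -t long /dev/sda     # start the long self-test ...
smartctl -a /dev/sda          # ... and read the result once it has finished
zpool scrub rpool
zpool status rpool            # check the scrub result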
I guess it is hard to investigate further, but I'm leaving this here in case others encounter similar issues.
 
It's also worth checking the dmesg/syslog around the time the segfaults happen, and whether there are any errors during boot. Were any BIOS settings changed? What about resetting the BIOS settings to defaults?
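For example, to look at a window around one of the segfaults (date and time are placeholders):

journalctl --since "2025-08-31 16:30" --until "2025-08-31 16:50"
journalctl -k -p warning      # kernel messages of the current boot, warnings and above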
 
Hi,
Thanks for your support

  • The BIOS has been on the latest version and its settings have not changed since last year
  • I have now reset the settings and then configured them again, just to be sure
  • I checked journalctl -k -b0 and didn't see anything obvious (the entry just before the segfault is from the boot two days earlier)

I will see if it occurs again, but if no one else reports similar issues then I agree it looks more like a hardware issue or something related to my specific config
 
I will see if it occurs again, but if no one else reports similar issues then I agree it looks more like a hardware issue or something related to my specific config

I have the same issue on my MS-01 system. It worked perfectly fine until I upgraded to Proxmox 9. Only after the upgrade did pvestatd start to suddenly stop, with this kind of error in the log:

[dom set 14 20:22:37 2025] pvestatd[2053]: segfault at 2b4c ip 000056eba86c772f sp 00007ffe1a040720 error 4 in perl[19872f,56eba8573000+1ae000] likely on CPU 5 (core 8, socket 0)

Even though I didn't think it was a hardware issue, I ran hardware tests on memory and CPU, and no issues were reported.
 
Even though I didn't think it was a hardware issue, I ran hardware tests on memory and CPU, and no issues were reported.
Which CPU tests have you run? A good test suite that usually shows signs of hardware trouble is stress-ng, as these kinds of errors usually appear when there's quite a load on the CPU. Otherwise, random segfaults of widespread executables (such as perl, dnsmasq, python, ...) on a stable kernel, occurring right after one another, are most of the time a sign of hardware issues.
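A possible invocation (the load figures and duration are just examples):

stress-ng --cpu 0 --cpu-method all --vm 2 --vm-bytes 75% --verify --metrics --timeout 4h
# --cpu 0 means "use all CPUs"; --verify re-checks computed results, which is what tends to expose flaky cores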
 
A good test suite that usually shows signs of hardware trouble is stress-ng, as these kinds of errors usually appear when there's quite a load on the CPU.

I will try it, thanks.

Otherwise, random segfaults of widespread executables (such as perl, dnsmasq, python, ...) on a stable kernel, occurring right after one another, are most of the time a sign of hardware issues.

I would agree with you if it weren't for the fact that I ran Proxmox 8.x for a year on the same hardware with no issues. How would you explain that?
 
I would agree with you if it weren't for the fact that I ran Proxmox 8.x for a year on the same hardware with no issues. How would you explain that?
Hardware wears out like any other component, and sometimes that is even exacerbated by implementation faults: for example, the 13th- and 14th-generation Intel Core 700 and 900 series had problems with overvoltage, which could leave cores permanently degraded or even fail them entirely. Another cause could be a loose cable, or a hardware configuration that was changed (e.g. through BIOS options).

All in all, an issue with the hardware is more likely when multiple unrelated binaries segfault that are in use by millions of people. But it's also only a first check to eliminate sources of error before looking at other things ;).