PBS crash also crashes pvestatd on PVE nodes

flames

Hello,

in my test lab I have 4x PVE 6.2.x nodes with Ceph (3 OSDs each) and PBS on a separate physical machine with a 2x 4 TB HDD ZFS mirror. Everything is updated daily from the no-subscription repo.

PBS hangs sporadically once or twice a day (no ping, no GUI, no SSH; I need to reset the machine), but this is not the issue here. It is beta software, and I have not yet been able to track down why PBS crashes. It might be a hardware problem: it is old, crappy hardware, good for testing.

The actual issue is that pvestatd crashes on the PVE cluster when PBS is not available (question marks appear everywhere and no storages are accessible via the GUI), until I reboot PBS and restart pvestatd on all PVE nodes.

Is this a known issue? I guess pvestatd should not crash, but should instead mark the PBS storage as offline, as happens when a CIFS/NFS storage is not accessible. This has persisted for me since day one of the PBS release.

Thanks in advance
 
Thanks for your work and the ticket! I will try to trace the hanging PBS and report back. I kind of mixed up two issues here, which are related but different.
The crashing/hanging PBS is one. The other, and more important, is that pvestatd on the PVE nodes crashes when PBS is not available (also when I shut down PBS for maintenance). But yes, the pb-client is also beta.
Edit: I am currently reinstalling the PBS machine with PVE, and will install PBS in a VM and pass the ZFS disks through to it, just to see what happens.
 
Yes, I have the same issue with pvestatd. It happens because the proxmox-backup-client keeps waiting indefinitely for a response when pvestatd asks for the amount of free space available. Killing either the client or pvestatd makes the nodes green again (temporarily).

I haven't submitted a bug report about that yet though.
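
To illustrate the kind of behavior asked for above (report the storage as offline instead of waiting forever), here is a minimal sketch in Python. It is not how pvestatd or the client is implemented; the function name `run_with_timeout` and the status strings are made up for illustration. It only shows the general idea of putting a hard deadline on an external status query.

```python
import subprocess


def run_with_timeout(cmd, timeout_s=5.0):
    """Run an external command with a hard deadline.

    Returns (status, stdout), where status is "ok", "error" or "offline".
    The names and status values are hypothetical, chosen for this sketch.
    """
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child and waits for it before raising,
        # so the caller is not left with a lingering client process.
        return "offline", ""
    except OSError:
        # The command could not be started at all (e.g. binary not found).
        return "error", ""
    if proc.returncode != 0:
        return "error", proc.stdout
    return "ok", proc.stdout
```

A caller would treat "offline" like an unreachable CIFS/NFS storage and carry on with the other storages. Note that this only helps while the child can still be killed; a process stuck in uninterruptible sleep would not react to the kill signal.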
 
I've had this issue as well 2-3 times since the Backup Server was launched. Although restarting pvestatd makes things green again, I had complications until I essentially restarted the server. However, I am running most of my backups onto an external USB drive, which may not be the best practice and could be the cause or the result. Not sure yet.
 
Yes, I have the same issue with pvestatd. It happens because the proxmox-backup-client keeps waiting indefinitely for a response when pvestatd asks for the amount of free space available. Killing either the client or pvestatd makes the nodes green again (temporarily).

This is a more general issue with pvestatd, also triggered by dead/temporarily unavailable NFS, where the kernel places the whole process in D state (uninterruptible IO).
We have been planning a redesign of its architecture for a while, and are currently seriously considering Rust to avoid some limitations of Perl, the main tradeoff being a significant increase in the initial work required.
Anyway, here I think we should really check the client behavior, which my colleague Stoiko is already investigating in the bug report. Fixing that would also help other uses of the client, not only pvestatd.
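
For anyone who wants to check whether a process on a node is stuck in D state as described above, the following is a rough sketch that scans `/proc/<pid>/stat` on Linux. The function names are made up for this example. It relies only on the documented layout of the stat file, where the third field is the process state.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' to find the state field.
    head, _, tail = line.rpartition(")")
    pid_str, _, comm = head.partition("(")
    state = tail.split()[0]
    return int(pid_str), comm, state


def find_d_state(proc_root="/proc"):
    """List (pid, comm) for processes currently in uninterruptible sleep."""
    results = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or entry unreadable while scanning
        if state == "D":
            results.append((pid, comm))
    return results
```

Calling `find_d_state()` while a storage is unreachable would show whether pvestatd itself is among the blocked processes. Processes pass through D state briefly during normal I/O, so a single hit is not conclusive; a process that stays there across repeated scans is the interesting case.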
 
