TLDR;
- Is it possible to increase the timeout in PVE when listing PBS backups? It seems to time out at ~25 seconds, and then there is no way to get the list of backups to do a restore, neither from the storage view nor from within the VM itself.
- Is there a way to really tell ZFS to always keep metadata in ARC? (afaik no, but maybe someone has some trick for this).
Long version:
I have a PBS with 8x14TB HDDs + a special device. The PVE cluster has ~600 VMs and there are ~8000 snapshots in the namespace used by PVE. The special device got filled over 75%, and due to the zfs_special_class_metadata_reserve_pct default of 25%, past that point only metadata got stored on the special device. Now listing the snapshots takes between 40 and 90 seconds, depending on how loaded the PBS is (GC, sync jobs, verify). Even listing the backups from PBS itself with
/usr/bin/proxmox-backup-client snapshot list --repository 'user@pbs@localhost:8007:DATASTORE' --ns NAMESPACE
takes as long as doing it from PVE with pvesm list pbs_STORAGENAME.
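A minimal way to measure both code paths and to check the special device fill level, using the placeholder names from above (run the first command on the PBS host and the second on a PVE node):

time /usr/bin/proxmox-backup-client snapshot list --repository 'user@pbs@localhost:8007:DATASTORE' --ns NAMESPACE
time pvesm list pbs_STORAGENAME
zpool list -v   # ALLOC vs SIZE per vdev shows how full the special device is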
This same issue would also affect a full-HDD PBS without a special device, as well as overloaded PBS servers that may take long to respond to such requests.
Things that I'm already aware of:
- Have reduced zfs_special_class_metadata_reserve_pct to 10 to allow new small blocks to be allocated on the special device (see the sketch after this list).
- Small blocks already stored on HDD will remain there until the snapshots using them are eventually purged, so even if we expand the special device, listings will stay slow for some time.
- Tried to force ZFS to keep metadata longer in ARC with zfs_arc_meta_balance, but no value above the default of 500 made any difference.
- Tried to set ARC to cache metadata only (primarycache=metadata); it reduced the times by ~20%, but it's still too slow (and it would make other PBS operations way slower).
- Have no place to set up an L2ARC with secondarycache=metadata, but given that primarycache=metadata didn't help much, I don't think it would help here either.
Thanks!
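For completeness, a sketch of the tunables mentioned in the list above. The module parameters live under /sys/module/zfs/parameters and reset on reboot unless persisted via /etc/modprobe.d; tank/datastore is a placeholder dataset name, and the zfs_arc_meta_balance value is just one example of "above the default 500":

echo 10 > /sys/module/zfs/parameters/zfs_special_class_metadata_reserve_pct
echo 5000 > /sys/module/zfs/parameters/zfs_arc_meta_balance   # higher values are supposed to favor metadata in ARC
zfs set primarycache=metadata tank/datastore   # cache only metadata for the datastore dataset
zfs set primarycache=all tank/datastore        # revert after testing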