PBS listing backups timeout

NomadCF · Nov 18, 2022

All three of our PBS servers time out when trying to list backups, regardless of whether we're trying to list all of them or just those for a single CT/VM. This has made them essentially useless to us, and we've had to revert to using vzdumps.

The PBS servers are all version 2.2-7 and the PVE cluster is all version 7.2-11.
We have made sure the patch from ( https://git.proxmox.com/?p=pve-storage.git;a=commitdiff;h=c560cb58a5b92ce436eea34554b5d091d2acacde ) is in place.
1. I'm not sure why this timeout isn't listed as an advanced setting in the web GUI for this storage type.
2. We tried increasing the timeout to 500.

We can sometimes get the listing to work if we select another PBS server and quickly click back a few times.
We have it set up so that one PBS server runs all the backups for the cluster and the other two sync from that one (and each other). This is due to the issue where only the last used PBS server can use the bitmap for a differential backup without it getting marked as dirty.

Our prune job runs daily and has the following settings:

Keep Last 5
Keep Daily 7
Keep Monthly 2
Keep Hourly 72
Keep Weekly 2
Keep Yearly 3

The PBS servers all have the same setup:

SAS HDD
2x Intel(R) Xeon(R) Gold 5215
64G memory
Mirrored OS SSD drives
Raidz2 14x8TB HDD with a Mirrored special device (NVM).
Our Deduplication Factor is 139.97
We have
1. CT 51 Groups, 4486 Snapshots
2. VM 37 Groups, 3262 Snapshots
3. Hosts 0/0
4. Usage 61.82% (12.36 TB of 20.00 TB)
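
To put the numbers above in perspective, here is a rough back-of-the-envelope sketch. It assumes each keep rule only counts snapshots not already covered by an earlier rule, so the sum of the keep values is an upper bound per group, not an exact figure. The variable names are made up for illustration.

```python
# Keep settings from the prune job above
keep = {"last": 5, "hourly": 72, "daily": 7, "weekly": 2, "monthly": 2, "yearly": 3}

# Group and snapshot counts reported by the datastore (hosts are 0/0)
groups = {"ct": 51, "vm": 37}
snapshots = {"ct": 4486, "vm": 3262}

# Upper bound of snapshots one group can retain (assumption: rules never overlap)
max_per_group = sum(keep.values())

total_groups = sum(groups.values())
total_snapshots = sum(snapshots.values())

# Average number of snapshot directories a listing has to visit per group
avg_per_group = total_snapshots / total_groups

print(f"upper bound per group: {max_per_group}")
print(f"total snapshots: {total_snapshots} in {total_groups} groups")
print(f"average per group: {avg_per_group:.1f}")
```

So most groups already sit close to the retention limit, and a full listing has to visit several thousand snapshot directories.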

Does PBS keep a quick index (an SQLite DB?) of all the backups for reference, or does it need to build the list from scratch each time?

UdoB · Nov 18, 2022

I am in the same boat: PBS-Datastore on rotating rust in Raid6 (not ZFS) with abysmal IOPS. My setup is far below the recommended hardware, so I do not complain. The only way for me to make it work was to reduce "keep" drastically. Any form of speed-up or an official way to increase the too short timeout would be very welcome.

Best regards

NomadCF · Nov 18, 2022

Well, after digging through the source: PBS rebuilds its lists of information from the filesystem each and every time, meaning it's an old-school flat-file database. So if you want to speed up listing your backups (or make it usable again), moving just the "metadata" (of sorts) directories "ct", "vm" and "host" to an SSD fixes the issue. You can't use a symlink (which is aggravating), so you'll need a real mount for each one. Again, you don't need to move the .chunks directory to an SSD, only the three directories listed above.
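
To get a feel for how expensive that directory walk is on a given disk, a minimal sketch like the following can help. It assumes the datastore layout is `<root>/<type>/<id>/<snapshot>/` with the three type directories named above; the function names are hypothetical and this is not how PBS itself implements the listing.

```python
import os
import time

# The three "metadata" directories mentioned above
BACKUP_TYPES = ("ct", "vm", "host")


def count_snapshots(datastore_root):
    """Return {type: (groups, snapshots)} by walking <root>/<type>/<id>/<snapshot>/."""
    result = {}
    for btype in BACKUP_TYPES:
        type_dir = os.path.join(datastore_root, btype)
        groups = 0
        snapshots = 0
        if os.path.isdir(type_dir):
            with os.scandir(type_dir) as group_entries:
                for group in group_entries:
                    if not group.is_dir():
                        continue
                    groups += 1
                    # Every subdirectory of a group is assumed to be one snapshot
                    with os.scandir(group.path) as snap_entries:
                        snapshots += sum(1 for s in snap_entries if s.is_dir())
        result[btype] = (groups, snapshots)
    return result


def timed_count(datastore_root):
    """Run count_snapshots and also return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    counts = count_snapshots(datastore_root)
    return counts, time.perf_counter() - start
```

Calling `timed_count()` with the datastore path before and after moving the three directories gives a crude comparison. It only counts directories and reads no per-snapshot files, so treat the timing as a rough lower bound, and expect a second run to be faster because of caching.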

I'm not sure what the idea was behind not using a simple DB for just the metadata. It would have solved this issue for everyone and made the system "seem" very responsive. I'm not saying move everything out of these metadata directories, just the basic info needed to list the backups in both the PBS and PVE web GUIs.

Felix. · Nov 18, 2022

NomadCF said:
I'm not sure what the idea was behind not using a simple DB for just the metadata.

Because you always need to keep that DB consistent. And it would be a single point of failure.
I'd prefer configurable locations for the metadata files in addition to the .chunks location.
In most cases using special_small_blocks in ZFS should cover your needs, though.
Personally, I have had good experiences using L2ARC for this, too.

NomadCF said:
with a Mirrored special device (NVM)

Please post your ZFS special_small_blocks value and recordsize value.
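
One way to collect both values is to parse the tab-separated output of `zfs get -Hp -o property,value`. The sketch below is only an illustration: the helper names are made up, and the dataset name you pass in is whatever holds your datastore.

```python
import subprocess

PROPS = ("recordsize", "special_small_blocks")


def parse_zfs_get(output):
    """Parse tab-separated `zfs get -Hp -o property,value` output into a dict of ints."""
    values = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        prop, value = line.split("\t")
        # -p prints exact byte values, so plain int() is enough here
        values[prop] = int(value)
    return values


def read_props(dataset):
    """Query both properties for one dataset (requires the zfs CLI on this host)."""
    cmd = ["zfs", "get", "-Hp", "-o", "property,value", ",".join(PROPS), dataset]
    out = subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
    return parse_zfs_get(out)


def small_blocks_on_special(values):
    """True if small file blocks are routed to the special vdev (0 disables this)."""
    return values["special_small_blocks"] > 0
```

With special_small_blocks at 0, only metadata is placed on the special device. File blocks up to the configured size are stored there as well once the value is raised.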
