PBS listing backups timeout

NomadCF

All three of our PBS servers time out when trying to list backups, regardless of whether we list all of them or just those for a single CT/VM. This has made them essentially useless to us, and we've had to revert to using vzdump.

  1. The PBS servers are all on version 2.2-7 and the PVE cluster is entirely on version 7.2-11.
  2. We have made sure the patches from ( https://git.proxmox.com/?p=pve-storage.git;a=commitdiff;h=c560cb58a5b92ce436eea34554b5d091d2acacde ) were in place.
    1. I'm not sure why this timeout isn't listed as an advanced setting in the web GUI for this storage type.
  3. We tried increasing the timeout to 500.

We can sometimes get the listing to work if we select another PBS server and quickly click back a few times.
We have it set up so that one PBS server runs all the backups for the cluster and the other two sync from that one (and from each other). This is due to the issue where only the last-used PBS server can use the bitmap for a differential backup without it getting marked as dirty.

Our prune job runs daily and has the following settings:
  1. Keep Last 5
  2. Keep Daily 7
  3. Keep Monthly 2
  4. Keep Hourly 72
  5. Keep Weekly 2
  6. Keep Yearly 3
The PBS servers all have the same setup:
  1. SAS HDD
  2. 2x Intel(R) Xeon(R) Gold 5215
  3. 64G memory
  4. Mirrored OS SSD drives
  5. Raidz2 14x8TB HDD with a Mirrored special device (NVM).
  6. Our Deduplication Factor is 139.97
  7. We have
    1. CT 51 Groups, 4486 Snapshots
    2. VM 37 Groups, 3262 Snapshots
    3. Hosts 0/0
    4. Usage 61.82% (12.36 TB of 20.00 TB)
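
As a rough cross-check of the figures above (a back-of-the-envelope sketch, not something taken from the PBS source): assuming, as a simplification, that every keep option retains distinct snapshots, the prune settings allow at most 91 snapshots per group. The reported counts average out to about 88 per group, so each full listing has to cover roughly 7,700 snapshots.

```python
# Prune settings quoted in the post (keep-* options of the daily prune job).
keep = {"last": 5, "hourly": 72, "daily": 7, "weekly": 2, "monthly": 2, "yearly": 3}

# (groups, snapshots) per backup type, as quoted in the post.
counts = {"ct": (51, 4486), "vm": (37, 3262)}

# Upper bound per group, ASSUMING no snapshot is claimed by more than one
# keep-* rule (a simplification made for this estimate).
max_per_group = sum(keep.values())

# Total number of snapshots a full listing has to enumerate.
total_snapshots = sum(snapshots for _, snapshots in counts.values())


def per_group_average(groups: int, snapshots: int) -> float:
    """Average number of snapshots per backup group."""
    return snapshots / groups


if __name__ == "__main__":
    print(f"upper bound per group: {max_per_group}")
    for kind, (groups, snapshots) in counts.items():
        avg = per_group_average(groups, snapshots)
        print(f"{kind}: {avg:.1f} snapshots per group on average")
    print(f"total snapshots: {total_snapshots}")
```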

Does PBS keep a quick index (an SQLite DB?) of all the backups for quick reference, or does it need to build the list from scratch each time?
 
I am in the same boat: PBS datastore on rotating rust in RAID6 (not ZFS) with abysmal IOPS. My setup is far below the recommended hardware, so I am not complaining. The only way for me to make it work was to reduce "keep" drastically. Any form of speed-up, or an official way to increase the too-short timeout, would be very welcome.

Best regards
 
Well, after digging through the source: PBS rebuilds its lists of information from the filesystem each and every time, meaning that it's an old-school flat-file database. So if you want to speed up listing your backups (or make it usable again), moving just the "metadata" (of sorts) directories "ct", "vm" and "host" to an SSD fixes the issue. You can't use a symlink (which is aggravating), so you'll need a real mount for each one. Again, you don't need to move the .chunks directory to an SSD, only the three directories listed above.
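
To illustrate what "rebuilding the list from the filesystem" means, here is a minimal Python sketch. It is not PBS code. It assumes a datastore layout of `<datastore>/<type>/<id>/<snapshot>/`, and the helper names `list_snapshots` and `placement_report` are made up for this example. The first walks the three directories the way a flat-file listing has to, and the second reports whether each one is a symlink, a real mount point, or a plain directory.

```python
import os
from pathlib import Path

# Backup-type directories named in the post; ".chunks" is deliberately skipped.
BACKUP_TYPES = ("ct", "vm", "host")


def list_snapshots(datastore: Path) -> list[tuple[str, str, str]]:
    """Return (type, group id, snapshot dir name) found by walking the tree.

    Assumed layout (for illustration): <datastore>/<type>/<id>/<snapshot>/
    Every call touches the filesystem again; nothing is cached.
    """
    found = []
    for backup_type in BACKUP_TYPES:
        type_dir = datastore / backup_type
        if not type_dir.is_dir():
            continue
        for group_dir in sorted(type_dir.iterdir()):
            if not group_dir.is_dir():
                continue
            for snap_dir in sorted(group_dir.iterdir()):
                if snap_dir.is_dir():
                    found.append((backup_type, group_dir.name, snap_dir.name))
    return found


def placement_report(datastore: Path) -> dict[str, str]:
    """Classify each type directory: missing, symlink, mount point or plain directory."""
    report = {}
    for backup_type in BACKUP_TYPES:
        path = datastore / backup_type
        if path.is_symlink():
            report[backup_type] = "symlink"
        elif not path.exists():
            report[backup_type] = "missing"
        elif os.path.ismount(path):
            report[backup_type] = "mount point"
        else:
            report[backup_type] = "plain directory"
    return report
```

Calling `len(list_snapshots(Path("/path/to/datastore")))` (the path is a placeholder) gives the number of snapshot directories a single listing has to visit. On slow disks, each of those directory reads costs IOPS.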

I'm not sure what the idea was behind not using a simple DB for just the metadata. It would have solved this issue for everyone and made the system "seem" very responsive. I'm not saying move everything out of these metadata directories, just the basic info needed for listing the backups in both the PBS and PVE web GUIs.
 
I'm not sure what the idea was behind not using a simple DB for just the metadata.
Because you always need to keep that DB consistent. And it would be a single point of failure.
I'd prefer configurable locations for the metadata files in addition to the .chunks location.
In most cases using special_small_blocks in ZFS should cover your needs, though.
Personally, I have also had good experiences using L2ARC for this.
with a Mirrored special device (NVM)
Please post your ZFS special_small_blocks and recordsize values.
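
One way to collect those two values is a small Python wrapper around `zfs get`. This is only a sketch. The function names are made up for this example, and the dataset name you pass in has to be your own.

```python
import subprocess

# The two ZFS properties asked about above.
PROPERTIES = ("special_small_blocks", "recordsize")


def build_command(dataset: str) -> list[str]:
    """Build a `zfs get` call in script mode.

    -H drops the header and separates fields with tabs;
    -o limits the output to the property and value columns.
    """
    return ["zfs", "get", "-H", "-o", "property,value", ",".join(PROPERTIES), dataset]


def parse_output(text: str) -> dict[str, str]:
    """Turn tab-separated 'property<TAB>value' lines into a dict."""
    values = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        prop, value = line.split("\t", 1)
        values[prop] = value.strip()
    return values


def read_properties(dataset: str) -> dict[str, str]:
    """Run `zfs get` for the dataset and return the parsed values."""
    result = subprocess.run(
        build_command(dataset), capture_output=True, text=True, check=True
    )
    return parse_output(result.stdout)
```

For example, `read_properties("tank/pbs")` would query a dataset called `tank/pbs`. That name is a placeholder, so substitute the dataset that backs your datastore.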
 
