snapshot list very slow on very large PBS server

maximmonin

New Member
Dec 9, 2023
We have a PBS server with 8500 VM snapshots. Every VM is about 4 TB in size.
With deduplication, we store them on 4 TB SSD special device disks plus a 700 TB array of 36 HDDs.
Everything works almost fine except the snapshot list API method.
It takes less than 1 sec if we use a cron job that re-caches the list every 5 minutes.
But if the backup list changes, it takes about 80 sec to get the snapshot list (with no active backup/restore processes on the device).
And it takes up to 10 minutes if the disks are heavily loaded.

As far as I understand the architecture of PBS, it does not have any DB tables to store such lists, and the API method just gets its data by listing files in the root or namespace subdirectory.

To work around this problem we use an additional external Redis cache, so we always get the snapshot list in < 500 ms.
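
A minimal sketch of that kind of cache, for illustration only. It keeps the last known listing per namespace and serves it immediately, while a cron job or worker calls refresh() in the background. An in-memory dict stands in for Redis here, and the names SnapshotListCache, fetch and max_age are hypothetical, not part of PBS or of our actual setup.

```python
import time
from typing import Callable, Dict, List, Tuple

Snapshot = dict  # one listing entry; its exact shape is not assumed here


class SnapshotListCache:
    """Serve the last known listing quickly; refresh it out of band."""

    def __init__(
        self,
        fetch: Callable[[str], List[Snapshot]],
        max_age: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch      # the slow call, e.g. a wrapper around the PBS API
        self._max_age = max_age  # seconds after which an entry counts as stale
        self._clock = clock
        self._entries: Dict[str, Tuple[float, List[Snapshot]]] = {}

    def refresh(self, namespace: str) -> List[Snapshot]:
        """Run the slow listing and store the result (call from cron or a worker)."""
        data = self._fetch(namespace)
        self._entries[namespace] = (self._clock(), data)
        return data

    def get(self, namespace: str) -> Tuple[List[Snapshot], bool]:
        """Return (listing, is_stale). Blocks on the slow call only on a cold miss."""
        entry = self._entries.get(namespace)
        if entry is None:
            return self.refresh(namespace), False
        stored_at, data = entry
        return data, (self._clock() - stored_at) > self._max_age
```

Readers never wait for the slow listing once an entry exists. The is_stale flag lets the caller decide whether a slightly outdated list is acceptable or a refresh should be scheduled.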

Just a wish, to speed it up :)
 
We have a PBS server with 8500 VM snapshots. Every VM is about 4 TB in size.
With deduplication, we store them on 4 TB SSD special device disks plus a 700 TB array of 36 HDDs.
Everything works almost fine except the snapshot list API method.
It takes less than 1 sec if we use a cron job that re-caches the list every 5 minutes.
Hi,
the size of the disks will not influence the time for the listing. If you perform a periodic listing via cron you probably keep the files required to list the snapshots cached by the filesystem/kernel.
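
To illustrate the caching effect, here is a rough sketch of a warm-up job that walks a directory tree, stats every entry and reads small files once, which pulls directory entries, inodes and small file contents into the kernel caches. It rests on assumptions: that the listing touches the same metadata and small files, and that root points at a directory containing only snapshot metadata. The function name warm_tree and the 64 KiB limit are made up for this example, and pointing it at a directory that holds chunk data would be far too expensive.

```python
import os
from typing import Tuple


def warm_tree(root: str, max_read_bytes: int = 64 * 1024) -> Tuple[int, int]:
    """Stat every entry under root and read small files once.

    Returns (entries_seen, files_read).
    """
    entries_seen = 0
    files_read = 0
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as it:
                for entry in it:
                    entries_seen += 1
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    elif entry.is_file(follow_symlinks=False):
                        size = entry.stat(follow_symlinks=False).st_size
                        if size <= max_read_bytes:
                            # Reading the file brings its contents into the page cache.
                            with open(entry.path, "rb") as fh:
                                fh.read()
                            files_read += 1
        except OSError:
            # Directory or file vanished or is unreadable (e.g. pruned mid-walk):
            # give up on the rest of this directory and move on.
            continue
    return entries_seen, files_read
```

Whether this is cheaper than simply calling the listing periodically depends on the setup. Either way the cache only helps until memory pressure evicts the entries again.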

And it takes up to 10 minutes if the disks are heavily loaded.
What do you mean by this? Do you mean a disk from a snapshot is mapped?

As far as I understand the architecture of PBS, it does not have any DB tables to store such lists, and the API method just gets its data by listing files in the root or namespace subdirectory.
No, the listing is generated by reading from the filesystem. You can speed up the listing by reorganizing the snapshots into namespaces and sub-namespaces, thereby reducing the number of snapshots in each namespace.

Just a wish, to speed it up :)
Please do open an issue in our bugtracker https://bugzilla.proxmox.com/ so we can keep track and evaluate possible solutions.
 
I mean that proxmox-backup-client snapshot list --ns somespace, the PBS web interface, or an HTTP call to the PBS API returns a result in:
200-500 ms in 95-99% of cases.
80-200 sec in 1-5% of cases.
And sometimes, once every 30-60 days, our PBS API HTTP driver sends a message to our Telegram channel saying that the 600 sec timeout was exceeded.
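
For anyone who wants to collect comparable numbers, here is a small sketch of how such a latency distribution could be measured. The helper names time_calls and summarize and the 1 sec "slow" threshold are arbitrary choices for this example. The call argument is any function that performs one listing, for instance a wrapper that runs the proxmox-backup-client command above via subprocess.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_calls(
    call: Callable[[], object],
    runs: int,
    clock: Callable[[], float] = time.perf_counter,
) -> List[float]:
    """Run call() `runs` times and return each duration in seconds."""
    durations = []
    for _ in range(runs):
        start = clock()
        call()
        durations.append(clock() - start)
    return durations


def summarize(durations: List[float], slow_threshold: float = 1.0) -> Dict[str, float]:
    """Report the median, the worst case, and the share of calls above the threshold."""
    ordered = sorted(durations)
    slow = sum(1 for d in ordered if d > slow_threshold)
    return {
        "median": statistics.median(ordered),
        "max": ordered[-1],
        "slow_fraction": slow / len(ordered),
    }
```

Since the slow cases are rare, the runs need to be spread over a long period, including times when backups and restores are active, for slow_fraction to mean anything.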

We use 10 namespaces, but the biggest and most used one has 5500 snapshots.

"Disks heavily loaded" means up to 5 VMs are being backed up at the same time and many snapshots are mapped by restore processes, so disk utilization is at 100 percent.
 