[SOLVED] Single datastore with namespace VS multiple datastore


Oct 13, 2023

I've been running a PBS for our backups for about a year, since before namespaces were supported. I have three different clusters to back up, and some VMIDs are shared among the clusters, so I could not back them all up to the same datastore (otherwise the backups for a single VMID would be mixed together, which would be a mess). So I created three separate datastores, and it has been running relatively smoothly. I would say 70% of my snapshots are CTs, 25% are VMs, and 5% are hosts.

However, I'm wondering what the advantages (and disadvantages) of a single datastore are.

For reference, my datastore statistics are:
  • Datastore 1: 1.9 TB of data, 600 snapshots (mostly one huge machine with 1+ TB of data); this is a datastore synced from another PBS
  • Datastore 2: 600 GB of data, 1100 snapshots
  • Datastore 3: 6.4 TB of data, 4900 snapshots
The deduplication factor currently ranges from 40 to 50. My main concern with merging the datastores is that operations on them would become too slow. At the moment, all operations on datastore 3 are already quite slow:
  • I verify all new backups once a day; this takes around 4-5 hours on datastore 3 and ~2 hours on datastore 2 (datastore 1 is verified on the other PBS)
  • Garbage collection takes around 2 hours
  • Accessing the GUI is often quite slow: when I click on Datastore > Content, it takes around 1 minute to load (in the PBS interface), and on my PVE cluster I often get a connection timeout when I open the PBS backups (and have to reload).
I realise this last issue might come from the GUI itself (the CLI takes 25 s to run `proxmox-backup-client snapshot list`), but the GUI is really convenient and I don't want to have to use the CLI to view or restore backups!

My questions are: Are there benefits to merging these datastores besides reducing the data size on disk (because of shared chunks)? Is there any way to "predict" how much disk space I would actually save? And are there other drawbacks besides even slower operations on the merged datastore?

As usual, it depends... I assume that your datastores are directories on the same storage, and given the load times you describe they are probably on HDDs. If you are using ZFS, a special device makes a night-and-day difference for tasks like content browsing or GC.

Having one datastore will allow you to:

- Reduce disk usage, as chunks will be deduplicated across all your backups. This will be noticeable if the content of the VMs in your datastores is similar.
- Verification tasks can potentially be faster than your current ones, as each chunk is verified just once per verify run. In a merged datastore each chunk may be shared among more snapshots, which makes verifying all backups faster.
- GC times will probably be similar, as there is a trade-off between having more snapshots to check and fewer chunks due to increased deduplication. At least you will not have to deal with overlapping GC runs from your separate datastores.
- If using ZFS, you will use less ARC memory for metadata, as there will be just one chunk directory tree instead of three. I don't know what the real benefit of this is, though.
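
To illustrate the verification point, here is a toy model (not PBS's actual code) of a verify run that remembers which chunks it has already checked. The names `verify_snapshots`, `snapshots` and `store` are made up for this sketch, and it assumes chunks are identified by the SHA-256 of their content. The number of checksum computations equals the number of distinct chunks, no matter how many snapshots reference them.

```python
import hashlib

def verify_snapshots(snapshots, store):
    """Verify every chunk referenced by any snapshot, each chunk only once.

    snapshots: dict mapping snapshot name -> list of chunk digests (hex)
    store:     dict mapping chunk digest (hex) -> chunk bytes
    Returns the number of chunks that actually had to be hashed.
    """
    verified = set()   # digests already checked during this run
    checks = 0
    for name, digests in snapshots.items():
        for digest in digests:
            if digest in verified:
                continue  # shared chunk, already checked via another snapshot
            if hashlib.sha256(store[digest]).hexdigest() != digest:
                raise ValueError(f"corrupt chunk {digest} in snapshot {name}")
            verified.add(digest)
            checks += 1
    return checks
```

With three snapshots that each reference two out of three shared chunks, this hashes three chunks rather than six; the more chunks are shared, the bigger the saving.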

AFAIK, the only reliable way to predict the potential space savings is to create yet another datastore and sync the backups from your current ones into it.
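
For a rough ballpark before committing to a full sync, one option might be to compare chunk file names across the existing datastores. The sketch below assumes the usual on-disk layout, where each datastore keeps its chunks under `.chunks/` as files named after their 64-character hex digest, and that the datastores do not use different encryption keys (chunks encrypted with different keys would not deduplicate against each other). The function names are made up for this example. It sums apparent file sizes, so chunks that GC has not removed yet are counted and filesystem overhead is ignored.

```python
import os
import string

HEX = set(string.hexdigits)

def chunk_sizes(datastore_path):
    """Map chunk digest (file name) -> file size for one datastore."""
    sizes = {}
    chunk_root = os.path.join(datastore_path, ".chunks")
    for dirpath, _dirnames, filenames in os.walk(chunk_root):
        for name in filenames:
            # Assumption: real chunks are named by a 64-char hex digest;
            # anything else (e.g. renamed bad chunks) is skipped.
            if len(name) != 64 or not set(name) <= HEX:
                continue
            try:
                sizes[name] = os.stat(os.path.join(dirpath, name)).st_size
            except FileNotFoundError:
                continue  # chunk removed while we were scanning
    return sizes

def estimate_merge(datastore_paths):
    """Return (bytes used separately, bytes if merged, bytes saved)."""
    separate = 0
    merged = {}
    for path in datastore_paths:
        sizes = chunk_sizes(path)
        separate += sum(sizes.values())
        merged.update(sizes)  # identical digests collapse into one entry
    merged_total = sum(merged.values())
    return separate, merged_total, separate - merged_total
```

It could be called as `estimate_merge(["/mnt/datastore/ds1", "/mnt/datastore/ds2"])`, where the paths are hypothetical. On HDDs, walking millions of chunk files will take a while, and the result is only an estimate; syncing into a test datastore remains the reliable measurement.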

Thanks for the quick reply! I'm indeed using HDDs; I'll look into adding some SSDs as a special device.

If I understand correctly, with a special device (which I'll probably need anyway), merging my datastores will only result in reduced disk usage, with no potentially bad side effects?

